The search for new drugs with desirable properties has been likened by Robin Spencer (as cited) to holing a golf ball from a great distance: several strokes, using different clubs, are required to get ever closer to the target. This is because the search space of possible drugs is simply enormous. The field that deals with estimating its size is known as ‘enumeration’. Not all possible molecules are likely to make drugs, which introduces constraints, but even so the number of ‘drug-like’ molecules containing up to 30 ‘heavy’ (non-H) atoms has been estimated at 10^63.
To face this challenge, pharmaceutical companies have acquired very large libraries (1–2 million compounds) of candidate substances that might produce hits in appropriate assays (screens), and which might then be developed into leads and finally into marketable substances. While many of these library molecules are proprietary, a great deal of molecular diversity is already available to those without direct access to new synthetic chemistry. Thus, Williams points out that PubChem already contains data on more than 18M molecules, that ZINC lists electronically 4.8M that are commercially available (see our recent analysis of its diversity, carried out as part of a comparison between commercial drugs and intermediary metabolites and related to the role of carriers in cellular drug uptake), and that the internet is a seriously useful resource for acquiring such molecules. […]