By Giorgia Zaetta
There are more than six hundred and seventy-five duodecillion possible chemical structures, but only a fraction of them, approximately 10^60, are potential pharmacologically active compounds. These molecules constitute the chemical space. The accessible (and synthesizable) chemical space grows exponentially every day, consequently increasing the chances of finding new potential hits and lead compounds.
Although structure-based methods (SBM) provide detailed information about the binding mode between a compound and its target (when the structure of the target is available), they are highly demanding in terms of computational cost, especially if we want to explore this large chemical space. Here, ligand-based methods (LBM) provide an excellent alternative, as we can screen a larger number of compounds than with SBM while retaining physicochemical properties similar to those of the query compound. A question may arise at this point: how chemically diverse are the new compounds with respect to my query?
Searching for more chemically diverse compounds
Chemical diversity can be described as the distribution of compounds in the chemical space based on their physicochemical profile. As mentioned above, in the early stages of drug discovery, screening the large chemical space is considerably time-consuming and computationally expensive. For this reason, in addition to choosing an optimal screening method, libraries can be curated for a specific purpose so that they contain molecules with a set of desired properties (often project-dependent) or structures. The result is commonly a small-molecule library.
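As a rough illustration of curating a library by desired properties, the sketch below filters a toy set of compounds with Lipinski-style thresholds. The compound names, property values, and the Compound and curate helpers are hypothetical and chosen only for this example; a real project would set its own thresholds and compute the properties with a cheminformatics toolkit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    """Hypothetical record holding a few precomputed properties."""
    name: str
    mol_weight: float   # molecular weight in g/mol
    logp: float         # octanol-water partition coefficient
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors


def passes_filter(c, max_mw=500.0, max_logp=5.0, max_donors=5, max_acceptors=10):
    """Return True if the compound satisfies every threshold.

    The defaults follow Lipinski's rule of five; they are an assumption
    for this sketch, since desired properties are project-dependent.
    """
    return (
        c.mol_weight <= max_mw
        and c.logp <= max_logp
        and c.h_donors <= max_donors
        and c.h_acceptors <= max_acceptors
    )


def curate(library, **thresholds):
    """Keep only the compounds that pass the property filter."""
    return [c for c in library if passes_filter(c, **thresholds)]


# Toy library with made-up values, for illustration only.
library = [
    Compound("cpd_A", 320.4, 2.1, 2, 5),
    Compound("cpd_B", 712.9, 6.3, 4, 12),
    Compound("cpd_C", 455.0, 4.8, 1, 7),
]

curated = curate(library)
print([c.name for c in curated])
```

Passing different keyword arguments to curate, for example max_mw=350.0, is one way to express a project-specific profile without changing the filter itself.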
However, structurally similar compounds may exhibit similar ADME/Tox issues, pose similar synthetic problems, or be covered by the same patent. Therefore, it is important to select a method that allows you to find new chemical solutions that overcome the limitations of your current compounds. The more structurally dissimilar the compounds found, the higher the chance of finding new hits.
At present, different methods describe chemical similarity when virtual screening is performed: structural keys, fingerprints, or molecular descriptors. While 2D similarity methods provide quick searches, they don’t consider the different conformers a molecule can adopt. 3D similarity methods can capture this information by calculating different conformers per molecule and comparing them to, for example, the bioactive conformation of a co-crystallized ligand. But even 3D methods can have some limitations in the identification of novel and diverse compounds, as some physicochemical properties are highly dependent on the conformer adopted by the molecule.
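To make the 2D case concrete, here is a minimal sketch of a fingerprint comparison using the Tanimoto coefficient, a common similarity measure for fingerprints. The fingerprints are represented as sets of "on" bit positions, and the bit values and compound names are invented for illustration; in practice they would come from a cheminformatics toolkit.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints.

    Each fingerprint is a set of "on" bit indices. The result is the
    number of shared bits divided by the number of bits set in either.
    """
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(fp_a & fp_b) / union


# Hypothetical fingerprints, for illustration only.
query = frozenset({1, 4, 7, 9})
library = {
    "cpd_A": frozenset({1, 4, 7, 9}),  # same bits as the query
    "cpd_B": frozenset({1, 4, 8}),     # shares two bits with the query
    "cpd_C": frozenset({2, 3, 5}),     # shares no bits with the query
}

# Rank library compounds from most to least similar to the query.
ranked = sorted(library, key=lambda name: tanimoto(query, library[name]), reverse=True)

for name in ranked:
    print(name, round(tanimoto(query, library[name]), 2))
```

Because the comparison only looks at which bits are set, it is fast but carries no information about conformers, which is the limitation of 2D methods noted above. Reading the ranking from the bottom up gives the compounds most dissimilar to the query.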
Pharmacelera’s technology enables the screening of large chemical libraries thanks to extremely efficient algorithms combined with our precise 3D semiempirical quantum-mechanics molecular field descriptors. These algorithms not only find original structures but also take their synthesizability into account.
Studying chemical diversity and looking for hits with different and novel features are very important. Combining these algorithms and descriptors leads to an efficient, accurate, and fast exploration of a large chemical space while also identifying original and synthesizable compounds.