By Prashant Sharma

Discovering and synthesizing a small molecule that becomes a preclinical candidate has traditionally required a great deal of time and deep scientific expertise. Only 12% of drug candidates entering clinical trials are approved for therapeutic use. With new drug discovery taking ten years on average and costing approximately USD 2.6 billion per new molecule brought to market, the industry is exploring new avenues to cut time and costs. A blockbuster drug takes approximately 12 years and USD 4 billion to USD 11 billion of investment ⁽¹⁾.

The emergence of machine learning (ML) and artificial intelligence (AI) gives researchers new ways to process, analyze, and understand large volumes of data. Using AI and ML in drug design can help reduce the time needed to identify targets and develop new drugs. Researchers aim to integrate ML with small-molecule drug discovery and continue making strides in personalized healthcare.

The traditional small molecule discovery process relied on manual testing and assays before high-throughput screening was adopted. Computational methods and virtual screening were then introduced to speed up the process, followed by today's increasingly sophisticated AI and ML techniques.

The application of AI in small molecule drug discovery helps increase speed, lower costs, improve success rates, and boost innovation. AI algorithms can also handle large-scale datasets, helping researchers predict molecular interactions, refine drug candidates, and improve the overall drug development process. AI and ML help medicinal chemists speed up their tasks and notably shorten the following phases:

· Protein structure prediction and understanding of structure-activity relationships.

· ADME and toxicity predictions.

· Synthesis planning for novel compounds.

· Compound screening.

Pharmaceutical companies must upskill their chemists and adopt a clear AI strategy. The steps to do so are:

· Study the impact of AI on medicinal and synthetic chemistry.

· Build an internal training plan for chemists to acquire AI skills.

· Collaborate and form internal teams to speed up the learning process.

· Prioritize the investments with the greatest impact on research.

Designing new molecules likely to interact with a target, synthesizing them, and then testing them to identify the most promising candidates is time- and resource-intensive. AI has significant potential to accelerate the design-make-test-analyze (DMTA) cycle and reduce the number of iterations ⁽²⁾. At each stage of DMTA, AI can be used to:

· Design: Protein structure prediction, de-novo library design, virtual screening, synthetic accessibility, and molecular property prediction.

· Make: Plan synthesis of new molecules, predict their yield and purity, and identify problems with the synthesis process.

· Test: Screen new molecules for the ability to interact with target proteins, predict efficacy and toxicity, and identify the most promising drug candidates.

· Analyze: Process large volumes of test data to identify correlations and trends and design further experiments to test the most promising drug candidates.

Furthermore, AI helps significantly in lead identification and optimization by improving the speed and efficiency of the most expensive and time-consuming phases of preclinical drug discovery:

· Hit identification: 30 to 50 percent acceleration in small molecule high-throughput screening, using approaches such as molecular property prediction in an iterative screening loop (versus the existing approach of randomized selection of compounds). ⁽²⁾

· Lead optimization: more than twofold improvement over baseline on the key metric of "efficacy observed," more than 100 times as many in silico experiments as previous screening allowed, and faster design of compounds optimized for drug delivery efficacy. ⁽²⁾
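The iterative screening loop mentioned under hit identification can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method used in the cited source: the library is synthetic, each compound has a single made-up descriptor `x`, "active" compounds are defined arbitrarily as those with `x` near 0.8, and the property model is a toy nearest-active ranking. All function names are hypothetical.

```python
import random

def make_library(n, seed=0):
    """Synthetic library (illustrative only): one descriptor x per compound;
    compounds with x close to 0.8 are arbitrarily labelled active."""
    rng = random.Random(seed)
    library = []
    for i in range(n):
        x = rng.random()
        library.append({"id": i, "x": x, "active": abs(x - 0.8) < 0.05})
    return library

def iterative_screen(library, batch_size, rounds, seed=0):
    """Screen a random first batch, then repeatedly pick the untested
    compounds whose descriptor is closest to any confirmed active."""
    rng = random.Random(seed)
    untested = list(library)
    rng.shuffle(untested)
    screened = [untested.pop() for _ in range(batch_size)]
    for _ in range(rounds - 1):
        known_actives = [c["x"] for c in screened if c["active"]]
        if known_actives:
            # Toy property model: rank by distance to the nearest known active.
            untested.sort(key=lambda c: min(abs(c["x"] - a) for a in known_actives))
        # With no actives found yet, the shuffled order acts as random selection.
        batch, untested = untested[:batch_size], untested[batch_size:]
        screened.extend(batch)  # assay results for this batch become available
    return screened

def random_screen(library, budget, seed=0):
    """Baseline: select the same number of compounds at random."""
    return random.Random(seed).sample(library, budget)

def count_hits(compounds):
    return sum(1 for c in compounds if c["active"])

library = make_library(2000)
guided = iterative_screen(library, batch_size=50, rounds=4)
baseline = random_screen(library, budget=200)
print("guided hits:", count_hits(guided), "random hits:", count_hits(baseline))
```

Both strategies test the same number of compounds, so the printed hit counts can be compared directly. In a real campaign the ranking step would be replaced by a trained molecular property model.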

Figure 1: How AstraZeneca applies AI to accelerate the DMTA cycle – including synthesis planning, condition prediction, and molecular ideation.

View the webinar on AI used by AstraZeneca for Reaction Prediction: https://webinars.elsevier.com/elsevier/Webinar-2-Drug-Discovery-with-AI-at-AstraZeneca-from-Generative-Models-to-Reaction-Prediction.

Over the last 10 years, key AI models have emerged for faster small-molecule discovery. Structure prediction models predict a protein's three-dimensional structure, which helps researchers understand the active site and optimize compound design to modulate the desired interactions. Generative AI models are used to create virtual compound libraries of novel chemical compounds for screening. Deep learning (DL) models forecast a molecule's properties from its structure, which is crucial for drug discovery because structure determines how the molecule interacts within the body.

The quantitative structure-activity relationship (QSAR) model is a statistical model trained on data that pairs chemical structures with biological activities. QSAR is used to predict the biological activity of a chemical compound from its structure, including toxicity, drug efficacy, and ADME properties. There are two types of QSAR models. Linear QSAR models assume that a molecule's biological activity is linearly related to descriptors of its chemical structure. In contrast, nonlinear QSAR models allow for a nonlinear relationship between biological activity and chemical structure.
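The difference between the two QSAR model types can be illustrated with a small Python sketch. The data here are invented for illustration: a single descriptor (assumed to be logP) and an activity that peaks at an intermediate value. The linear model uses the descriptor as is, while the second model adds a squared term, which is only one simple way of allowing a nonlinear relationship. All names are hypothetical, and the fit uses plain least squares rather than any particular QSAR package.

```python
def solve(a, b):
    """Solve a small linear system a*w = b by Gauss-Jordan elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit(features, ys):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y."""
    k = len(features[0])
    xtx = [[sum(f[i] * f[j] for f in features) for j in range(k)] for i in range(k)]
    xty = [sum(f[i] * y for f, y in zip(features, ys)) for i in range(k)]
    return solve(xtx, xty)

def predict(weights, feature_row):
    return sum(w * f for w, f in zip(weights, feature_row))

def sse(weights, features, ys):
    """Sum of squared errors of the fitted model on the given data."""
    return sum((predict(weights, f) - y) ** 2 for f, y in zip(features, ys))

# Invented training data: activity is highest at an intermediate logP.
logp = [0.0, 1.0, 2.0, 3.0, 4.0]
activity = [1.0, 4.0, 5.0, 4.0, 1.0]

linear_feats = [[1.0, x] for x in logp]         # activity ~ w0 + w1*logP
quad_feats = [[1.0, x, x * x] for x in logp]    # adds a w2*logP^2 term

w_lin = fit(linear_feats, activity)
w_quad = fit(quad_feats, activity)
print("linear fit error:", sse(w_lin, linear_feats, activity))
print("quadratic fit error:", sse(w_quad, quad_feats, activity))
```

On this data the linear model cannot capture the peak and is left with a large error, while the model with the squared term fits closely. Real nonlinear QSAR models often use more flexible learners, such as tree ensembles or neural networks, over many descriptors.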

Figure 2: AI Models in Small Molecule Discovery

Quantitative structure-property relationship (QSPR) modeling uses machine learning to relate molecular structures to compound properties and speed up the DMTA cycle. ML algorithms find structural or chemical patterns that correlate with specific compound properties, such as activity against the target of interest, reactivity, solubility, and adsorption. Synthetic accessibility models based on ML and DL score how easy compounds are to synthesize, allowing chemists to narrow down libraries to sets of synthetically accessible compounds. Computer-aided synthesis planning (CASP) saves time, improves accuracy, and helps medicinal and synthetic chemists control costs by reducing synthesis failures and validating proof of concept at the earliest stage.
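One simple way to relate structural patterns to a property is a similarity-based prediction: compounds with similar structural features are assumed to have similar properties. The sketch below uses Tanimoto similarity over fingerprints represented as sets of feature ids and predicts a property as a similarity-weighted average of the nearest training compounds. The fingerprints, the solubility values, and the function names are all invented for illustration. In practice the fingerprints would come from a cheminformatics toolkit.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints held as sets of
    feature ids: shared features divided by total distinct features."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0

def predict_property(query_fp, training_set, k=2):
    """Similarity-weighted average of the property over the k training
    compounds most similar to the query."""
    ranked = sorted(
        training_set,
        key=lambda item: tanimoto(query_fp, item[0]),
        reverse=True,
    )[:k]
    weights = [tanimoto(query_fp, fp) for fp, _ in ranked]
    total = sum(weights)
    if total == 0:
        # Nothing similar in the training set: fall back to the overall mean.
        return sum(value for _, value in training_set) / len(training_set)
    return sum(w * value for w, (_, value) in zip(weights, ranked)) / total

# Invented training data: (fingerprint, measured log-solubility).
training_set = [
    (frozenset({1, 2, 3, 4}), -2.0),
    (frozenset({1, 2, 3, 5}), -2.4),
    (frozenset({7, 8, 9}), -5.0),
]

query = frozenset({1, 2, 3, 4, 5})
predicted = predict_property(query, training_set, k=2)
print("predicted log-solubility:", predicted)
```

The query shares most of its features with the first two training compounds, so the prediction is driven by their values and the dissimilar third compound is ignored.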

Training AI algorithms for drug discovery requires a significant amount of data. Large datasets containing chemical structures and information about their biological activity are crucial for building effective models.

Public databases are one resource for this data, but commercially available options like GOSTAR® by Excelra offer additional features. These features can include a wider range of chemical structures, curated data sets focused on specific diseases, and tools specifically designed for drug discovery.

Although small-molecule drug discovery has traditionally been slow and expensive, AI is reshaping the field by analyzing large data sets for patterns. The results are faster development timelines, reduced costs, better drug candidates, and improved success rates. By embracing AI models and upskilling experts, pharmaceutical companies can open a new era of innovation in drug discovery.

References

(1) https://www.excelra.com/blogs/data-in-healthcare-how-far-we-have-come/

(2) https://www.elsevier.com/en-in/industry/ai-in-small-molecule-drug-discovery

(3) https://www.excelra.com/databases/gostar/ai-ml/

Pharmaceutical Microbiology Resources (http://www.pharmamicroresources.com/)