AI drug discovery refers to the use of artificial intelligence techniques, including machine learning, deep learning, generative models, and natural language processing, to accelerate and improve the process of identifying, designing, and developing new therapeutic drugs. Traditional pharmaceutical research and development is among the most expensive and time-consuming endeavors in science, and AI offers the potential to reduce both cost and timelines substantially. As of early 2026, at least 173 AI-discovered drug programs are in clinical development (some estimates exceed 200), with the first regulatory approvals anticipated in the 2026-2027 timeframe.
The conventional drug development pipeline is extraordinarily long and expensive. According to the Tufts Center for the Study of Drug Development, bringing a single new drug from initial discovery through regulatory approval costs an average of $2.6 billion and takes 10 to 15 years. The process involves several sequential stages, each with high failure rates:
| Stage | Duration | Description | Approximate success rate |
|---|---|---|---|
| Target identification and validation | 1-3 years | Identifying biological targets (proteins, enzymes, receptors) implicated in disease | Varies widely |
| Hit discovery and lead optimization | 2-4 years | Screening millions of compounds to find molecules that interact with the target, then optimizing their properties | ~5% of leads advance |
| Preclinical development | 1-2 years | In vitro and animal studies to assess safety, pharmacokinetics, and efficacy | ~10% proceed to human trials |
| Phase I clinical trials | 1-2 years | Safety and dosage testing in small groups of healthy volunteers | ~63% success rate |
| Phase II clinical trials | 2-3 years | Efficacy and side effect testing in patients with the target disease | ~31% success rate |
| Phase III clinical trials | 3-4 years | Large-scale efficacy and safety confirmation | ~58% success rate |
| Regulatory review | 1-2 years | FDA or equivalent agency review of all trial data | ~85-90% approval rate |
The overall probability that a drug entering Phase I trials will eventually receive FDA approval is roughly 7.9%, though this figure varies by therapeutic area. Oncology drugs, for example, have historically had lower success rates than cardiovascular or infectious disease therapies. This high attrition, combined with the long timelines involved, is the core problem that AI seeks to address.
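As a rough cross-check, the stage-level rates in the table can be multiplied together. The sketch below assumes the stages are independent and uses the table's approximate values; the constant and function names are illustrative only. The product comes out at roughly 10%, somewhat above the 7.9% cited above, which suggests the two sets of figures draw on different datasets or time periods.

```python
from math import prod

# Approximate per-stage success rates quoted in the table above.
# Regulatory review is given as a range, so both ends are carried through.
PHASE_RATES = {"Phase I": 0.63, "Phase II": 0.31, "Phase III": 0.58}
REVIEW_RANGE = (0.85, 0.90)


def cumulative_success(rates, review_rate):
    """Probability of clearing every clinical phase and regulatory review,
    assuming the stages are independent (a simplifying assumption)."""
    return prod(rates.values()) * review_rate


low = cumulative_success(PHASE_RATES, REVIEW_RANGE[0])
high = cumulative_success(PHASE_RATES, REVIEW_RANGE[1])
print(f"Phase I to approval: {low:.1%} to {high:.1%}")
```

Because the rates multiply, the weakest stage dominates: Phase II alone removes roughly two thirds of the candidates that reach it.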
AI technologies are being deployed at virtually every stage of the drug development process:
| Application | AI techniques used | What it does | Impact |
|---|---|---|---|
| Target identification | Graph neural networks, NLP for literature mining, multi-omics integration | Analyzes genomic, proteomic, and clinical data to identify novel drug targets associated with disease | Reduces target identification from years to months; uncovers non-obvious biological connections |
| Molecular generation (de novo design) | Variational autoencoders, generative adversarial networks, diffusion models, reinforcement learning | Designs novel molecular structures with desired properties from scratch | Generates diverse candidate molecules without relying on existing compound libraries |
| Virtual screening | Molecular docking simulations, convolutional neural networks, transformer models | Computationally evaluates large libraries of compounds against a target to predict binding affinity | Screens millions to billions of compounds in hours rather than months |
| ADMET prediction | Gradient-boosted trees, neural networks, graph-based models | Predicts absorption, distribution, metabolism, excretion, and toxicity properties of candidate molecules | Filters out likely failures early, reducing costly late-stage attrition |
| Lead optimization | Bayesian optimization, multi-objective evolutionary algorithms, active learning | Iteratively refines hit compounds to improve potency, selectivity, and drug-like properties | Reduces the number of compounds that need to be physically synthesized and tested |
| Clinical trial optimization | Predictive analytics, NLP, patient stratification models | Optimizes trial design, patient recruitment, site selection, and dosing strategies | Shortens trial timelines and reduces costs by improving patient selection and protocol design |
| Drug repurposing | Knowledge graphs, network pharmacology, transfer learning | Identifies existing approved drugs that may be effective for new indications | Bypasses early discovery stages entirely for already-approved molecules |
| Protein structure prediction | AlphaFold, ESMFold, RoseTTAFold | Predicts 3D protein structures from amino acid sequences to enable structure-based drug design | Provides structural targets for drug design even when experimental structures are unavailable |
ADMET (Absorption, Distribution, Metabolism, Excretion, and Toxicity) properties are essential indicators of whether a small molecule can become a viable drug. Many clinical trial failures are attributed to inadequate ADMET profiles, making early-stage ADMET evaluation critical for reducing downstream attrition. Recent AI frameworks such as ADMET-AI utilize advanced machine learning models to perform rapid, large-scale ADMET profiling across millions of compounds. These systems integrate absorption and toxicity endpoints to assist in early decision-making and prioritization of candidates for experimental validation. ADMET-AI has been shown to outperform existing tools in both speed and accuracy, processing over a million molecules in just a few hours while achieving top scores on benchmark datasets.
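The early-stage prioritization described above can be pictured as a threshold filter over predicted properties. The following is a minimal sketch under stated assumptions: the `AdmetPrediction` fields, the `triage` function, the thresholds, and the scores are all hypothetical, and the code does not use or reproduce the ADMET-AI interface.

```python
from dataclasses import dataclass


@dataclass
class AdmetPrediction:
    """Hypothetical per-molecule model outputs, each a probability in [0, 1]."""
    smiles: str
    oral_absorption: float  # predicted probability of adequate absorption
    herg_toxicity: float    # predicted probability of hERG channel blockade
    hepatotoxicity: float   # predicted probability of liver injury


def triage(predictions, min_absorption=0.7, max_tox=0.3):
    """Keep molecules that pass every threshold, best absorption first."""
    kept = [
        p for p in predictions
        if p.oral_absorption >= min_absorption
        and p.herg_toxicity <= max_tox
        and p.hepatotoxicity <= max_tox
    ]
    return sorted(kept, key=lambda p: p.oral_absorption, reverse=True)


# Made-up scores attached to example SMILES strings, for illustration only.
candidates = [
    AdmetPrediction("CC(=O)Oc1ccccc1C(=O)O", 0.91, 0.05, 0.12),
    AdmetPrediction("CN1C=NC2=C1C(=O)N(C)C(=O)N2C", 0.84, 0.40, 0.10),
    AdmetPrediction("CC(C)Cc1ccc(cc1)C(C)C(=O)O", 0.95, 0.08, 0.20),
]
shortlist = triage(candidates)
print([p.smiles for p in shortlist])
```

In practice the cutoffs would depend on the project and on how well calibrated each model is; hard thresholds are only one of several possible prioritization schemes.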
The AI drug discovery landscape includes a mix of AI-native biotech companies, established pharmaceutical firms building internal AI capabilities, and technology companies extending their AI platforms into drug discovery.
Founded in 2014 and headquartered in Hong Kong, Insilico Medicine is one of the most prominent AI-native drug discovery companies. The company operates an end-to-end AI platform spanning target discovery (PandaOmics), molecular generation (Chemistry42), and clinical trial prediction (InClinico).
Insilico Medicine is widely credited with advancing the first drug candidate with both an AI-identified target and an AI-generated molecule into mid-stage clinical trials. Its lead compound, originally designated ISM001-055 and later renamed rentosertib, is a TNIK (Traf2- and Nck-interacting kinase) inhibitor being developed for idiopathic pulmonary fibrosis (IPF). The drug completed a Phase IIa trial in November 2024 with positive results, and in June 2025 Insilico published those results in Nature Medicine, describing them as the industry's first proof-of-concept clinical validation of AI-driven drug discovery. As of early 2026, rentosertib remains at the Phase IIa stage, with U.S. studies ongoing.
In March 2025, Insilico Medicine secured $110 million in Series E financing to advance its AI-driven drug discovery programs.
Recursion Pharmaceuticals, headquartered in Salt Lake City, uses what it calls the Recursion Operating System (Recursion OS), a platform combining high-throughput biological experimentation with AI-driven analysis. The company generates massive phenomic datasets by imaging cells under millions of experimental conditions and uses AI to identify patterns that suggest therapeutic opportunities.
In a landmark transaction, Recursion and Exscientia completed their merger in November 2024, combining Recursion's phenomic screening capabilities with Exscientia's automated precision chemistry into a full end-to-end AI drug discovery platform. The merged entity carries forward programs from both companies:
| Program | Target/mechanism | Indication | Stage (early 2026) |
|---|---|---|---|
| REC-394 | C. difficile Toxin B inhibitor | Clostridioides difficile infection | Phase II (update expected Q1 2026) |
| REC-1245 | RBM39 degrader | Biomarker-enriched solid tumors and lymphoma | Phase I/II (dose-escalation data expected H1 2026) |
| REC-617 | Undisclosed | Undisclosed oncology | IND-enabling |
REC-1245 is particularly noteworthy: it advanced from target identification to the start of a Phase I/II trial in just 18 months with only 200 compounds synthesized, compared to the industry standard of roughly 42 months and thousands of compounds. This dramatic compression of timelines demonstrates the potential of AI-driven approaches.
Isomorphic Labs, founded in 2021 as a spin-off from Google DeepMind, is focused on applying AI to drug discovery with an emphasis on protein structure prediction and molecular design. The company builds on the foundational work of AlphaFold, which revolutionized structural biology.
In February 2026, Isomorphic Labs announced IsoDDE (Isomorphic Drug Discovery Engine), an AI model that scientists have compared to "an AlphaFold 4." According to the company's report, IsoDDE outperforms both Boltz-2 and physics-based methods at predicting binding affinity, a critical property for drug design. The model represents a shift from purely predicting protein structures to actively designing drug molecules that bind to those structures.
Isomorphic Labs has signed landmark strategic partnerships with Eli Lilly and Novartis valued at nearly $3 billion. The company is staffing up for its first human clinical trials, with several lead candidates for oncology and immune-mediated disorders currently in the IND-enabling phase. Experts predict that the first Isomorphic-designed molecules could enter Phase I trials by late 2026.
Before its merger with Recursion, Oxford-based Exscientia was a pioneer in AI-driven drug design. In 2020, it became the first company to place an AI-designed molecule (DSP-1181, a candidate for obsessive-compulsive disorder developed in partnership with Sumitomo Pharma) into human clinical trials. Exscientia's automated precision chemistry platform uses active learning and Bayesian optimization to iteratively design, synthesize, and test compounds, dramatically reducing the number of design-make-test cycles required. Its technology and pipeline are now integrated into the Recursion platform.
London-based BenevolentAI focuses on using AI for target identification and drug repurposing. The company gained widespread attention early in the COVID-19 pandemic when its AI platform identified baricitinib, an existing JAK inhibitor, as a potential treatment for COVID-19. This prediction was subsequently validated in clinical trials, and baricitinib received emergency use authorization from the FDA for hospitalized COVID-19 patients.
After a period of financial challenges and a major restructuring in December 2024, BenevolentAI delisted from the Euronext Amsterdam stock exchange in March 2025 before merging with Osaka Holdings. The company's AI platform continues to operate, but its trajectory illustrates the commercial challenges facing AI drug discovery companies even when the underlying technology produces validated results.
Schrödinger (often written without the umlaut, as Schrodinger) operates a physics-based computational platform that integrates predictive modeling, data analytics, and collaboration tools for rapid exploration of chemical space. While the company has sometimes resisted the "AI" label, preferring to emphasize its physics-based foundations, its platform increasingly incorporates machine learning and AI techniques alongside molecular dynamics simulations, free energy calculations, and quantum mechanics calculations.
Schrodinger's clinical success includes zasocitinib (TAK-279), a tyrosine kinase 2 (TYK2) inhibitor originally developed in partnership with Nimbus Therapeutics and later acquired by Takeda. Zasocitinib has advanced into Phase III clinical trials, representing one of the most advanced drug candidates to emerge from a computationally driven design strategy. The company has been expanding its capabilities with physics-guided generative AI workflows for molecular design.
As of early 2026, the number of AI-discovered or AI-designed drug candidates in clinical development has grown substantially:
| Drug candidate | Company | Indication | Clinical stage | AI contribution |
|---|---|---|---|---|
| Rentosertib (ISM001-055) | Insilico Medicine | Idiopathic pulmonary fibrosis | Phase IIa | AI-identified target (TNIK) and AI-generated molecule |
| Zasocitinib (TAK-279) | Schrodinger / Nimbus / Takeda | Autoimmune diseases (TYK2 inhibitor) | Phase III | Computationally designed lead compound |
| DSP-1181 | Exscientia / Sumitomo Pharma | Obsessive-compulsive disorder | Phase I (completed) | First AI-designed molecule in human trials |
| REC-1245 | Recursion | Solid tumors and lymphoma (RBM39 degrader) | Phase I/II | AI-identified target and accelerated design |
| REC-394 | Recursion | C. difficile infection | Phase II | AI-discovered compound |
| Other pipeline programs | Insilico Medicine | Multiple (oncology, immunology) | Phase I - IIa | End-to-end AI platform |
| EXS21546 | Exscientia (now Recursion) | Oncology (adenosine A2A antagonist) | Phase I/II | AI-optimized design |
| BEN-2293 | BenevolentAI | Atopic dermatitis | Phase IIa | AI-identified target |
Research aggregators tracking the field put the number of AI-originated programs in some stage of clinical development at between 173 and more than 200, with 15 to 20 expected to enter pivotal (Phase III) trials in 2026.
The release of AlphaFold by Google DeepMind has been one of the most transformative developments in structural biology and drug discovery.
AlphaFold 2, announced in late 2020, solved the decades-old protein structure prediction problem with unprecedented accuracy. In the CASP14 (Critical Assessment of protein Structure Prediction) competition, AlphaFold 2 achieved a median GDT (Global Distance Test) score of 92.4, dramatically outperforming all other methods. In 2022, DeepMind released predicted structures for nearly all known proteins (over 200 million structures), providing the scientific community with a resource of enormous value for understanding biology and designing drugs.
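For readers unfamiliar with the metric, the sketch below shows a simplified version of the GDT_TS score: the average, over distance cutoffs of 1, 2, 4, and 8 angstroms, of the percentage of residues whose predicted position falls within the cutoff of the experimental position. It assumes the two structures are already superimposed, whereas the full metric searches for the best superposition at each cutoff. The function name and the example distances are illustrative.

```python
def gdt_ts(distances_angstrom):
    """Simplified GDT_TS: mean, over cutoffs of 1, 2, 4 and 8 angstroms, of the
    percentage of residues whose predicted C-alpha atom lies within the cutoff
    of its experimental position. Assumes the structures are pre-aligned."""
    if not distances_angstrom:
        raise ValueError("need at least one residue distance")
    cutoffs = (1.0, 2.0, 4.0, 8.0)
    n = len(distances_angstrom)
    fractions = [sum(d <= c for d in distances_angstrom) / n for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


# Illustrative per-residue deviations (made up), in angstroms.
example = [0.4, 0.8, 1.5, 0.9, 3.2, 0.6, 7.5, 12.0]
score = gdt_ts(example)
print(score)  # 68.75
```

A score above 90 therefore means that nearly every residue sits within a couple of angstroms of its experimentally determined position.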
AlphaFold 3, released in May 2024, extended the model's capabilities beyond single protein structure prediction to predicting the structures and interactions of protein complexes, including protein-ligand, protein-nucleic acid, and multi-component complexes. In tests conducted by the DeepMind team, AlphaFold 3's full-atom docking accuracy reached 76.4%, which was 1.8 times that of RoseTTAFold All-Atom (42%). This capability is directly relevant to drug discovery, as understanding how a drug molecule binds to its protein target is fundamental to rational drug design.
In February 2026, Isomorphic Labs announced IsoDDE (Isomorphic Drug Discovery Engine), which scientists have described as comparable to "an AlphaFold 4." IsoDDE goes beyond structure prediction to actively model binding affinity, selectivity, and other properties critical for drug design. According to Isomorphic Labs, IsoDDE outperforms both Boltz-2 and physics-based methods in binding affinity prediction, marking a transition from AI as a tool for understanding biology to AI as a tool for directly designing therapeutics.
The progression from AlphaFold 2 through AlphaFold 3 to IsoDDE represents a clear trajectory: from predicting what proteins look like, to predicting how molecules interact with proteins, to designing molecules optimized for therapeutic use.
One of the most active areas of AI drug discovery research involves generative models for molecular design, particularly diffusion models adapted from the image generation domain.
Diffusion models for molecular generation work by learning to reverse a noise-addition process. During training, the model learns to progressively denoise a random distribution of atoms back into valid molecular structures. By conditioning this process on target properties (binding affinity to a specific protein pocket, desired solubility, low toxicity), the model can generate novel molecules tailored to specific requirements.
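The noise-addition process that these models learn to reverse can be sketched in a few lines. The example below is a minimal illustration that assumes the standard DDPM-style forward process with a linear variance schedule applied to raw atom coordinates. The schedule parameters, function names, and toy geometry are assumptions, and the learned denoising network (the part that actually generates molecules) is not shown.

```python
import math
import random


def alpha_bar_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule."""
    alpha_bars, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(coords, alpha_bar, rng):
    """Sample x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps,
    with eps drawn from a standard normal, for every atom coordinate."""
    signal, noise = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [
        tuple(signal * x + noise * rng.gauss(0.0, 1.0) for x in atom)
        for atom in coords
    ]


# Toy three-atom geometry in angstroms (illustrative only).
x0 = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
alpha_bars = alpha_bar_schedule(num_steps=1000)
rng = random.Random(0)
for t in (0, 499, 999):
    xt = add_noise(x0, alpha_bars[t], rng)
    print(t, round(alpha_bars[t], 4), xt[1])
```

Early steps leave the geometry almost intact, while by the final step the coordinates are close to pure Gaussian noise. A generative model is trained to undo this corruption step by step, optionally conditioned on a protein pocket or target property.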
Several notable generative chemistry systems have emerged:
| System | Developer | Approach | Key capability |
|---|---|---|---|
| DiffMC-Gen | Academic (2025) | Dual-diffusion model for 2D and 3D molecular generation | Simultaneously optimizes multiple objectives across drug design |
| DiffGui | Academic (2025) | Target-conditioned E(3)-equivariant diffusion | Generates realistic 3D molecules that fit specific protein pockets |
| G2D-Diff | Academic (2025) | Genotype-to-drug diffusion | Creates drug structures tailored to specific cancer genotypes |
| DrugDiff | Academic (2025) | Latent diffusion with predictor guidance | Generates compounds with specified molecular properties |
| IsoDDE | Isomorphic Labs (2026) | Proprietary drug discovery engine | End-to-end molecule design with binding affinity optimization |
A key challenge for generative molecular models remains ensuring chemical synthesizability. A model may design a molecule with ideal predicted properties, but if the molecule cannot be practically synthesized in a laboratory, it has limited value. Current research is focused on incorporating synthetic feasibility constraints directly into the generative process.
Researchers have identified the development of unified "foundation models" for molecular science as a key frontier. Such models could seamlessly design small molecules, peptides, and complex hybrid therapeutics from a single architecture, rather than requiring separate specialized models for each molecular class.
In late February 2026, Eli Lilly inaugurated LillyPod, described as the most powerful AI factory wholly owned and operated by a pharmaceutical company. LillyPod is the world's first NVIDIA DGX SuperPOD built with DGX B300 systems.
LillyPod is powered by 1,016 NVIDIA Blackwell Ultra GPUs, delivering more than 9,000 petaflops of AI performance. The system was assembled in just four months and was inaugurated at a ribbon-cutting ceremony in Indianapolis.
LillyPod is designed to support drug discovery, genomics, clinical development, and manufacturing optimization. The system creates what Lilly calls a "computational dry lab at massive scale," where scientists can simulate and evaluate billions of molecular hypotheses in parallel before committing to physical experiments. Lilly employees can also use LillyPod to build chatbots, agentic workflows, and research lab agents using the company's internal AI platforms.
LillyPod's launch reflects a broader trend of major pharmaceutical companies investing heavily in dedicated AI computing infrastructure rather than relying solely on cloud-based resources. This investment signals a strategic bet that computational scale will be a key competitive advantage in drug discovery.
AI model performance is heavily dependent on high-quality, well-curated datasets. Despite the vast scale of existing chemical libraries, rigorously curated datasets with robust biological, pharmacological, and clinical annotations remain scarce. Publicly available databases such as ChEMBL, PubChem, and BindingDB contain millions of data points, but these data are often noisy, inconsistent, or biased toward certain target classes. Proprietary pharmaceutical data, while potentially higher quality, is rarely shared between companies due to competitive concerns.
Predictions made by AI models must ultimately be validated through wet-lab experiments and clinical trials. There is sometimes a disconnect between computational results (virtual screening hits, predicted binding affinities, generated molecules) and real-world experimental outcomes. Bridging this gap requires tight integration between computational and experimental teams, along with realistic expectations about what AI can and cannot predict.
Regulatory agencies worldwide are still developing frameworks for evaluating AI-discovered drugs. While the FDA has indicated that AI-discovered drugs will be evaluated by the same safety and efficacy standards as traditionally discovered drugs, questions remain about how much detail about the AI methods used must be disclosed, how AI-derived evidence will be weighted, and how regulatory bodies will handle the rapid pace of AI model evolution. The FDA's Center for Drug Evaluation and Research (CDER) has been engaging with AI drug discovery companies, and an FDA framework specifically addressing AI in drug development is expected to evolve throughout 2026.
Many AI drug discovery publications report impressive performance on benchmark datasets, but reproducibility in real-world drug discovery settings has been harder to demonstrate. The field has been criticized for an overreliance on retrospective analyses and in silico benchmarks rather than prospective experimental validation. Calls for greater transparency in model architectures, training data, and evaluation protocols are growing louder as more AI-discovered compounds enter clinical trials.
The path to commercial viability for AI drug discovery companies has proven challenging. BenevolentAI's delisting and restructuring, which came despite the company's validated clinical results, illustrate the gap between scientific achievement and commercial sustainability. Drug discovery remains a high-risk, long-timeline business, and AI has not yet fully closed the gap between promising preclinical results and approved products generating revenue.
The AI in drug discovery market is growing rapidly:
| Metric | Value | Source |
|---|---|---|
| Global market size (2025) | $2.35 billion - $2.9 billion | Grand View Research, MarketsandMarkets |
| Projected market size (2026) | $4.63 billion - $5.1 billion | Precedence Research, MarketsandMarkets |
| Projected market size (2033) | $13.77 billion | Grand View Research |
| CAGR (2026-2033) | 24.8% | Grand View Research |
| AI-originated programs in clinical trials (2026) | 173+ (some estimates exceed 200) | Axis Intelligence |
| Number expected to enter pivotal trials in 2026 | 15-20 | Axis Intelligence |
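
The market figures in the table come from different research firms, so the rows are not all directly comparable. A short compound-growth calculation makes this visible. The sketch below uses only the numbers from the table, with illustrative function names. The quoted 24.8% CAGR implies a 2026 base of about $2.9 billion, whereas starting from the $4.63 billion to $5.1 billion projections would require only about 15% to 17% annual growth to reach $13.77 billion by 2033.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values `years` apart."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


def implied_start(end_value, rate, years):
    """Starting value that compounds to `end_value` at `rate` over `years`."""
    return end_value / (1.0 + rate) ** years


# Figures from the table, in billions of US dollars.
END_2033 = 13.77
QUOTED_CAGR = 0.248

base_2026 = implied_start(END_2033, QUOTED_CAGR, years=7)
print(f"2026 base implied by the quoted CAGR: ${base_2026:.2f}B")

for start_2026 in (4.63, 5.1):
    rate = cagr(start_2026, END_2033, years=7)
    print(f"CAGR from ${start_2026}B in 2026: {rate:.1%}")
```

The 2033 projection and the CAGR are therefore best read together as a single source's forecast, separate from the higher 2026 estimates in the row above them.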
The preclinical stage remains the most active application area, accounting for 39.3% of AI drug discovery studies. Oncology is the most prominent therapeutic area leveraging AI tools, followed by immunology, neurology, and infectious disease.
As of early 2026, the AI drug discovery field is at an inflection point. Several key trends and developments define the current landscape:
Clinical validation milestones. Insilico Medicine's publication of the first proof-of-concept clinical validation of AI-driven drug discovery in Nature Medicine (June 2025) represented a watershed moment. With rentosertib showing positive Phase IIa results in IPF and zasocitinib (Schrodinger/Takeda) advancing to Phase III, the field is beginning to demonstrate that AI-discovered drugs can succeed in the clinic.
Industry consolidation. The Recursion-Exscientia merger signals a trend toward consolidation in the AI drug discovery space, as companies seek to build end-to-end platforms rather than focusing on narrow AI applications. This mirrors broader patterns in the pharmaceutical industry, where scale and integration are competitive advantages.
Pharma investment in AI infrastructure. Eli Lilly's launch of LillyPod and Isomorphic Labs' nearly $3 billion in partnerships with Lilly and Novartis demonstrate that major pharmaceutical companies view AI as a strategic priority, not merely an experimental tool. These investments in dedicated computing infrastructure and long-term AI partnerships reflect confidence that AI will fundamentally reshape drug discovery.
Next-generation AI models. The progression from AlphaFold 2 to AlphaFold 3 to IsoDDE represents a rapid evolution in AI capabilities for drug discovery. Generative chemistry models, particularly diffusion-based approaches, are advancing quickly, and the development of molecular "foundation models" capable of designing diverse molecular types from a single architecture is a major research frontier.
Persistent challenges. Data quality, experimental validation, regulatory frameworks, and commercial viability remain significant obstacles. The AI drug discovery field has learned that while AI can dramatically accelerate certain stages of discovery, drug development as a whole involves challenges (clinical trial logistics, manufacturing, regulatory navigation, commercial strategy) that AI alone cannot solve. The most successful companies are those that integrate AI tightly with experimental science and clinical development rather than treating AI as a standalone solution.
The prevailing expectation in the field is that the first AI-discovered drugs will receive regulatory approval in the 2026-2027 timeframe, marking the beginning of a new era in pharmaceutical development.