AI for science refers to the application of artificial intelligence techniques, particularly deep learning and large language models, to accelerate scientific discovery across disciplines including biology, chemistry, physics, materials science, mathematics, and Earth sciences. The field gained landmark recognition in 2024, when both the Nobel Prize in Chemistry and the Nobel Prize in Physics were awarded for AI-related research, the first time AI-driven work had received Nobel recognition [1][2].
AI for science encompasses a broad range of applications: predicting protein structures, discovering new materials, forecasting weather, generating mathematical proofs, designing drugs, optimizing fusion energy experiments, and analyzing genomic data. What unites these applications is the use of AI not merely as a tool for automation but as a method for generating new scientific knowledge, identifying patterns in data that humans cannot perceive, proposing hypotheses, and even making discoveries that advance fundamental understanding of the natural world.
The 2024 Nobel Prizes marked a watershed moment for AI in science, with both the Chemistry and Physics prizes awarded to researchers whose work centered on artificial intelligence.
The 2024 Nobel Prize in Chemistry was awarded in two halves. One half went to David Baker at the University of Washington "for computational protein design." The other half was awarded jointly to Demis Hassabis and John Jumper at Google DeepMind "for protein structure prediction" [1].
Hassabis and Jumper developed AlphaFold, an AI system that solved the protein structure prediction problem, a challenge that had confounded scientists for over 50 years. AlphaFold2, presented in 2020, demonstrated the ability to predict the three-dimensional structure of proteins from their amino acid sequences with accuracy comparable to experimental methods. The AlphaFold Protein Structure Database, made freely available, has been used by more than 2 million researchers across 190 countries [1].
Baker's work on computational protein design took the complementary approach: rather than predicting the structure of existing proteins, Baker's group designed entirely new proteins with specified structures and functions, opening the door to custom-designed enzymes, therapeutics, and biomaterials.
The 2024 Nobel Prize in Physics was awarded jointly to John J. Hopfield and Geoffrey Hinton "for foundational discoveries and inventions that enable machine learning with artificial neural networks" [2].
Hopfield invented the Hopfield network in 1982, an associative memory system inspired by the physics of magnetic materials that could store and retrieve patterns. Hinton, building on Hopfield's work, developed the Boltzmann machine in the 1980s, a stochastic neural network that could learn to recognize patterns in data using methods drawn from statistical physics. These foundational contributions laid the groundwork for the deep learning revolution that began around 2010 and ultimately led to systems like ChatGPT and AlphaFold [2].
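For flavor, here is a minimal, self-contained Python sketch of a Hopfield network: patterns are stored with the Hebbian outer-product rule and recalled by repeatedly thresholding each neuron's local field. This is a toy illustration of the associative-memory idea, not Hopfield's original formulation in full.

```python
import numpy as np

# Minimal Hopfield network: store +1/-1 patterns with the Hebbian
# outer-product rule, then recover a stored pattern from a corrupted cue.
def train_hopfield(patterns):
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for p in patterns:
        W += np.outer(p, p)       # Hebbian learning: strengthen co-active pairs
    np.fill_diagonal(W, 0)        # no self-connections
    return W / len(patterns)

def recall(W, state, steps=10):
    for _ in range(steps):        # synchronous updates: threshold local fields
        state = np.sign(W @ state)
        state[state == 0] = 1
    return state

patterns = np.array([[1, -1, 1, -1, 1, -1],
                     [1, 1, 1, -1, -1, -1]])
W = train_hopfield(patterns)
noisy = np.array([1, -1, 1, -1, 1, 1])    # pattern 0 with its last bit flipped
print(recall(W, noisy))                   # recovers [ 1 -1  1 -1  1 -1]
```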
The awarding of both prizes to AI-related work in the same year was unprecedented and signaled the scientific establishment's recognition that AI had become a transformative force in research.
AI has produced significant results across multiple scientific domains. The following table summarizes major achievements.
| Domain | Achievement | System/Method | Institution | Year |
|---|---|---|---|---|
| Protein structure | Predicted 3D structures of virtually all 200 million known proteins | AlphaFold2 | Google DeepMind | 2020-2022 |
| Protein design | Designed novel proteins with specified structures and functions | RoseTTAFold, Rosetta | David Baker Lab, UW | 2021-present |
| Weather prediction | Medium-range forecasts more accurate than traditional models, 15-day probabilistic forecasting | GraphCast, GenCast | Google DeepMind | 2023-2024 |
| Materials discovery | Identified 2.2 million new stable crystal structures, equivalent to ~800 years of conventional discovery | GNoME | Google DeepMind | 2023 |
| Mathematics | Solved 4 of 6 IMO problems, achieving silver-medal performance | AlphaProof, AlphaGeometry 2 | Google DeepMind | 2024 |
| Mathematics | Discovered novel solutions via program search in function space | FunSearch | Google DeepMind | 2023-2024 |
| Drug discovery | First drug with both target and molecule discovered by AI to reach Phase IIa trials | Rentosertib (ISM001-055) | Insilico Medicine | 2024-2025 |
| Fusion energy | Controlled tokamak plasma configurations using deep reinforcement learning | RL-based plasma control | DeepMind + EPFL | 2022-present |
| Genomics | Predicted effects of genetic variants on protein function and disease | AlphaMissense | Google DeepMind | 2023 |
| Antibody design | Zero-shot de novo antibody design with 16-20% experimental hit rates | Diffusion-based models | Multiple groups | 2025 |
The prediction of protein structures from amino acid sequences had been one of biology's grand challenges since the 1970s. Experimental methods like X-ray crystallography and cryo-electron microscopy could determine structures but were slow and expensive, sometimes requiring years per protein.
AlphaFold2, presented at the CASP14 competition in December 2020, achieved a median Global Distance Test (GDT) score of 92.4 out of 100, accuracy that matched experimental methods for most proteins. The system uses a novel architecture that combines attention mechanisms with an iterative recycling process, learning relationships between amino acid sequences and their spatial arrangements [1].
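The details of AlphaFold2's architecture are beyond this overview, but the recycling idea can be sketched schematically: the network's own output is fed back as an input and refined over several passes. The toy Python below illustrates only that control flow; the real system recycles pair representations and predicted coordinates through its Evoformer and structure modules.

```python
import numpy as np

# Toy illustration of recycling: run the same attention-based refinement
# several times, feeding the previous output back in as an extra input.
def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)       # numerical stability
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def self_attention(x):
    weights = softmax(x @ x.T / np.sqrt(x.shape[-1]))
    return weights @ x

def refine(seq_repr, prev_output):
    return self_attention(seq_repr + prev_output)   # inject recycled estimate

rng = np.random.default_rng(0)
seq_repr = rng.normal(size=(64, 32))    # 64 residues, 32-dim embeddings
output = np.zeros_like(seq_repr)
for cycle in range(3):                  # three recycling passes
    output = refine(seq_repr, output)
```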
In 2022, DeepMind released the AlphaFold Protein Structure Database containing predicted structures for over 200 million proteins, covering nearly every known protein sequence. This freely accessible resource has been described as equivalent to providing the biological research community with a complete "parts list" for the molecular machinery of life [1].
AlphaFold3, released in 2024, extended the system's capabilities beyond individual proteins to predict the structures of protein complexes, including interactions with DNA, RNA, and small molecules, addressing the broader challenge of understanding how biological molecules interact [3].
RoseTTAFold, developed by Minkyung Baek and David Baker at the University of Washington and published in Science in 2021, provided an alternative approach using a "three-track" neural network that simultaneously processes sequence, distance, and coordinate information. RoseTTAFold can compute protein structures in as little as ten minutes on a single gaming computer and was made freely available as open-source software [4].
Traditional numerical weather prediction relies on solving complex differential equations describing atmospheric physics, a process that requires enormous computational resources and hours of supercomputer time. AI-based approaches have demonstrated that weather forecasting can be dramatically faster and, in many cases, more accurate.
GraphCast, published in Science in 2023, uses graph neural networks to make medium-range weather forecasts (up to 10 days) more accurately than the European Centre for Medium-Range Weather Forecasts' (ECMWF) HRES system, the industry gold standard. GraphCast generates a 10-day forecast in under a minute on a single TPU, compared to hours on a supercomputer for traditional methods [5].
GenCast, published in Nature in December 2024, advanced the approach further by using a diffusion model adapted to the spherical geometry of Earth. GenCast generates probabilistic forecasts (ensembles of possible future weather scenarios) rather than single deterministic predictions, providing better uncertainty quantification. It outperformed the ECMWF's ENS ensemble system on 97.2% of test targets, and on 99.8% of targets at lead times greater than 36 hours [6].
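Ensemble forecasts like GenCast's are typically scored with the continuous ranked probability score (CRPS), which rewards ensembles that are both accurate and well calibrated. A minimal sample-based CRPS estimator (illustrative, not GenCast's evaluation code) looks like this:

```python
import numpy as np

# Sample-based CRPS: mean absolute error of ensemble members against the
# observation, minus a term that credits ensemble spread.
def crps_ensemble(samples, observation):
    samples = np.asarray(samples, dtype=float)
    term1 = np.mean(np.abs(samples - observation))
    term2 = 0.5 * np.mean(np.abs(samples[:, None] - samples[None, :]))
    return term1 - term2

ensemble = np.random.default_rng(1).normal(15.0, 2.0, size=50)  # 50 members
print(crps_ensemble(ensemble, observation=16.5))                # lower is better
```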
In 2025, DeepMind introduced WeatherNext 2, capable of generating hundreds of physically realistic weather scenarios in under a minute on a single Tensor Processing Unit [5].
In November 2023, Google DeepMind announced GNoME (Graph Networks for Materials Exploration), an AI system that discovered 2.2 million new stable crystal structures, a figure the team described as equivalent to approximately 800 years of conventional materials discovery. Of these, 380,000 were identified as particularly promising for experimental synthesis [7].
GNoME uses graph neural networks to predict the stability of hypothetical crystal structures, dramatically accelerating the process of identifying materials with useful properties. The discoveries include potential new superconductors, materials for next-generation batteries, and catalysts for industrial processes. External researchers have independently synthesized over 700 of the predicted materials, experimentally confirming GNoME's predictions [7].
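The underlying mechanism can be sketched as follows: atoms are nodes in a graph, bonds are edges, each node aggregates messages from its neighbours, and a readout maps the graph to a predicted energy. The toy Python below shows one message-passing round under these assumptions; GNoME's actual architecture and training setup are far larger.

```python
import numpy as np

# One round of message passing on a tiny "crystal" graph, followed by a
# graph-level readout to a scalar energy. Toy weights, toy features.
def message_passing(node_feats, edges, W_msg, W_upd):
    msgs = np.zeros_like(node_feats)
    for i, j in edges:                       # sum messages from bonded pairs
        msgs[i] += node_feats[j] @ W_msg
        msgs[j] += node_feats[i] @ W_msg
    return np.tanh(node_feats @ W_upd + msgs)

rng = np.random.default_rng(0)
node_feats = rng.normal(size=(4, 8))         # 4 atoms, 8-dim features each
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]     # bonded atom pairs
W_msg, W_upd = rng.normal(size=(8, 8)), rng.normal(size=(8, 8))
h = message_passing(node_feats, edges, W_msg, W_upd)
energy = h.sum(axis=0) @ rng.normal(size=8)  # graph readout -> predicted energy
```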
AlphaProof, a system combining a pre-trained language model with the AlphaZero reinforcement learning algorithm, achieved silver-medal performance at the 2024 International Mathematical Olympiad (IMO). AlphaProof solved three of the six problems, including the competition's most difficult, Problem 6. AlphaProof generates formal proofs in the Lean proof language, allowing its solutions to be verified automatically. The methodology was published in Nature in November 2025 [8].
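Formal languages like Lean make this verification possible: a proof is a program that the proof checker either accepts or rejects. For flavor, a trivial Lean 4 theorem (not an AlphaProof output) looks like this:

```lean
-- A trivial Lean 4 theorem, machine-checkable in the same way as
-- AlphaProof's IMO solutions (which are vastly more complex).
theorem sum_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```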
AlphaGeometry 2, working alongside AlphaProof, solved the competition's geometry problem. Together, the two systems scored 28 of 42 points, matching silver-medal performance.
FunSearch, also from Google DeepMind, took a different approach to mathematical discovery. Rather than proving existing conjectures, FunSearch searches in the space of computer programs (written in Python) to discover novel mathematical constructions. It made genuine new contributions to extremal combinatorics, discovering solutions to the cap set problem that exceeded previously known results [9].
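The overall loop can be sketched as follows: maintain a population of candidate programs, score each with a verifiable evaluator, and ask a generator to propose variants of the best candidates. In FunSearch the generator is an LLM editing Python source and the evaluator scores the resulting constructions (for the cap set problem, by their size); the sketch below substitutes a stand-in numeric mutation for the LLM.

```python
import random

# Skeleton of a FunSearch-style search loop: evolve candidates under a
# verifiable score. Candidates here are coefficient lists, not real programs.
def evaluate(candidate):
    # Verifiable score for a toy problem; FunSearch scored actual
    # mathematical constructions produced by running candidate programs.
    return -sum((c - t) ** 2 for c, t in zip(candidate, [3.0, 1.0, 4.0]))

def mutate(candidate):
    return [c + random.gauss(0, 0.1) for c in candidate]   # LLM stand-in

population = [[random.uniform(-5, 5) for _ in range(3)] for _ in range(20)]
for generation in range(200):
    population.sort(key=evaluate, reverse=True)
    survivors = population[:5]                  # keep the best candidates
    population = survivors + [mutate(random.choice(survivors))
                              for _ in range(15)]
best = max(population, key=evaluate)
```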
AI is being applied across the drug discovery pipeline, from target identification to molecule design to clinical trial optimization.
The most significant milestone came in 2024-2025, when rentosertib (formerly ISM001-055), a drug whose target and molecule were both discovered by AI, completed Phase IIa clinical trials for idiopathic pulmonary fibrosis. The drug reached preclinical candidate nomination in just 18 months, a process that traditionally takes three to four years. Phase IIa results published in Nature Medicine in June 2025 showed dose-dependent improvement in lung function (a 98 mL improvement in forced vital capacity) [10].
However, the overall impact of AI on drug discovery success rates remains debated. A 2024 industry analysis found that AI-assisted drug candidates achieved Phase I success rates of nearly 90%, compared with industry averages of 40-65%. But their progression rates through later clinical development stages remained similar to traditionally discovered compounds, suggesting AI's primary benefit may lie in accelerating the preclinical phase rather than fundamentally improving the odds of clinical success [10].
In December 2025, the FDA qualified its first AI-based tool for use in drug development clinical trials, a cloud-based platform for scoring liver biopsies in NASH/MASH trials, representing formal regulatory acceptance of AI in the drug development process [10].
Controlling the superheated plasma inside a tokamak fusion reactor is one of the most complex control problems in engineering. Google DeepMind, working with the Swiss Plasma Center at EPFL, demonstrated in 2022 that deep reinforcement learning could control the magnetic coils of a tokamak to produce and maintain a variety of plasma configurations, including shapes that had never been produced before. The system learned control policies in simulation and successfully transferred them to the real Tokamak à Configuration Variable (TCV) reactor [11].
In 2024, DeepMind announced a partnership with Commonwealth Fusion Systems (CFS) to apply AI to the design and control of SPARC, a compact tokamak aiming to achieve net energy gain. The collaboration focuses on three areas: building fast, accurate plasma simulations (using TORAX, a JAX-based simulator released as open source); finding optimal paths to maximizing fusion energy; and using reinforcement learning for real-time plasma control [11].
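The appeal of a JAX-based simulator is that the entire simulation is differentiable, so gradients of an objective with respect to control parameters come essentially for free. The toy heat-diffusion example below illustrates this property in JAX; it does not use TORAX's actual API or transport equations.

```python
import jax
import jax.numpy as jnp

# Differentiate through a simulation: evolve a 1D temperature profile by
# explicit diffusion steps, then take the gradient of a scalar objective
# with respect to the diffusivity parameter.
def simulate(diffusivity, steps=100, dt=0.01):
    profile = jnp.sin(jnp.linspace(0, jnp.pi, 32))      # initial temperature
    for _ in range(steps):
        laplacian = jnp.roll(profile, 1) - 2 * profile + jnp.roll(profile, -1)
        profile = profile + dt * diffusivity * laplacian
    return jnp.mean(profile)                            # scalar objective

grad_fn = jax.grad(simulate)
print(grad_fn(0.5))   # sensitivity of the objective to the diffusivity
```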
Additionally, researchers at the DIII-D National Fusion Facility tested AI-based methods for preventing tearing instabilities in fusion plasma, using deep reinforcement learning for real-time monitoring and responsive adjustment of magnetic confinement fields [11].
AI is not merely accelerating existing scientific workflows; it is changing how science is done at a fundamental level.
Traditionally, scientific hypotheses originate from human intuition informed by domain knowledge and literature review. AI systems can now scan vast bodies of scientific literature, identify patterns across disparate fields, and propose hypotheses that human researchers might not consider. Language models trained on scientific text can suggest connections between phenomena studied in separate disciplines, potentially accelerating interdisciplinary discovery.
AI can optimize experimental designs by predicting which experiments are most likely to yield informative results. In materials science, for example, AI systems can prioritize which of millions of candidate materials should be synthesized first, dramatically reducing the time and cost of experimental campaigns. Bayesian optimization and active learning techniques allow AI to design sequential experiments that maximize information gain [7].
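A minimal active-learning loop of this kind, sketched in Python with an assumed toy "experiment" standing in for a real measurement, fits a Gaussian-process surrogate to the results so far and queries the point where the model is most uncertain:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

# Uncertainty-driven experiment selection: after each "experiment", refit
# the surrogate and pick the candidate setting with the largest predictive
# standard deviation.
def run_experiment(x):
    return np.sin(3 * x) + 0.1 * np.random.randn()      # noisy measurement

candidates = np.linspace(0, 2, 200).reshape(-1, 1)      # possible settings
X = list(candidates[[0, -1]])                           # two seed points
y = [run_experiment(x[0]) for x in X]

for _ in range(10):                                     # 10 rounds of selection
    gp = GaussianProcessRegressor().fit(np.array(X), np.array(y))
    _, std = gp.predict(candidates, return_std=True)
    x_next = candidates[np.argmax(std)]                 # most uncertain point
    X.append(x_next)
    y.append(run_experiment(x_next[0]))
```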
Modern scientific instruments generate data at rates that exceed human analytical capacity. Particle physics experiments at the Large Hadron Collider, genomic sequencing facilities, and astronomical surveys all produce petabytes of data that require automated analysis. AI systems can identify signals in noisy data, classify objects, and detect anomalies that would be invisible to human inspection.
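As a generic illustration (not any facility's actual pipeline), an off-the-shelf anomaly detector can flag instrument readings that do not fit the bulk distribution:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Flag rare events in a stream of instrument readings. Isolation forests are
# one generic choice; production pipelines use detectors tuned per instrument.
rng = np.random.default_rng(0)
normal_events = rng.normal(0, 1, size=(10000, 4))     # routine readings
rare_events = rng.normal(6, 1, size=(10, 4))          # injected anomalies
data = np.vstack([normal_events, rare_events])

detector = IsolationForest(contamination=0.001, random_state=0).fit(data)
flags = detector.predict(data)                        # -1 marks anomalies
print(np.where(flags == -1)[0])                       # indices to inspect
```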
Physics-based simulations are essential to many scientific fields but are often computationally expensive. AI can be trained to approximate the outputs of these simulations (a technique called surrogate modeling), enabling scientists to explore parameter spaces orders of magnitude faster. DeepMind's TORAX plasma simulator, for example, uses AI-based components to accelerate tokamak simulations while maintaining physical accuracy [11].
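A minimal surrogate-modeling sketch, with an assumed stand-in function playing the role of the expensive simulator, trains a small network on a batch of simulator runs and then predicts new outputs cheaply:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Surrogate modeling: learn a cheap approximation of a costly simulator
# from a finite set of offline runs, then query the surrogate instead.
def expensive_simulation(params):
    return np.sin(params[:, 0]) * np.cos(params[:, 1])  # pretend this is slow

rng = np.random.default_rng(0)
params = rng.uniform(-3, 3, size=(2000, 2))             # sampled inputs
outputs = expensive_simulation(params)                  # offline simulator runs

surrogate = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000)
surrogate.fit(params, outputs)                          # learn the mapping
fast_prediction = surrogate.predict(rng.uniform(-3, 3, size=(5, 2)))
```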
Rather than analyzing what exists, AI enables scientists to specify desired properties and work backward to find or design systems that exhibit those properties. This "inverse design" paradigm is transforming fields like materials science (designing materials with target properties), drug discovery (designing molecules that bind specified targets), and protein engineering (designing proteins with desired functions) [7].
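At its simplest, inverse design with a differentiable property predictor is gradient descent on the design itself. The sketch below uses an assumed quadratic stand-in for a trained property model:

```python
import numpy as np

# Inverse design loop: adjust the design vector until the predicted
# property reaches a target value, by descending the squared error.
def predicted_property(design):
    return np.sum(design ** 2)                   # stand-in property model

def gradient(design):
    return 2 * design                            # analytic gradient of above

target = 4.0
design = np.array([3.0, -2.0, 1.0])              # initial candidate design
for step in range(500):
    error = predicted_property(design) - target
    design -= 0.01 * (2 * error * gradient(design))   # gradient of error^2
print(design, predicted_property(design))        # property converges to target
```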
Several major AI tools and platforms have been developed specifically for scientific applications.
| Tool/Platform | Domain | Developer | Access | Key capability |
|---|---|---|---|---|
| AlphaFold / AlphaFold Server | Protein structure | Google DeepMind | Free (academic) | Predicts 3D protein structures from sequences |
| RoseTTAFold | Protein structure | Baker Lab, UW | Open source | Three-track structure prediction; fast on consumer hardware |
| AlphaFold3 | Biomolecular complexes | Google DeepMind | Research access | Predicts protein-DNA, protein-RNA, and protein-ligand complexes |
| GraphCast / GenCast | Weather forecasting | Google DeepMind | Open source (GraphCast) | Medium-range weather prediction |
| GNoME | Materials science | Google DeepMind | Data released publicly | Predicts stability of crystal structures |
| TORAX | Fusion energy | Google DeepMind | Open source | Differentiable plasma simulation |
| AlphaProof | Mathematics | Google DeepMind | Research only | Formal mathematical theorem proving |
| AlphaMissense | Genomics | Google DeepMind | Free | Classifies genetic missense variants |
Scientific reproducibility requires that results can be independently verified. AI-based scientific results face reproducibility challenges at multiple levels: the training data may not be fully available, model architectures may be proprietary, random seeds affect results, and computational requirements may be prohibitive for independent replication. The scientific community is developing standards for reporting AI-based results, but practices remain inconsistent [12].
Many AI systems, particularly deep neural networks, operate as "black boxes" whose internal reasoning is difficult to interpret. In science, understanding why a prediction is correct is often as important as the prediction itself. A model that accurately predicts protein structures but provides no insight into the underlying physics offers less scientific value than one whose predictions can be mapped to physical principles. Explainable AI techniques are being applied to scientific AI, but interpretability remains a significant challenge, especially for the most complex models [12].
AI models are only as good as the data they are trained on. In scientific applications, training data may contain systematic biases (for example, the Protein Data Bank overrepresents certain types of proteins that are easier to crystallize), errors, or gaps. Models trained on biased data may produce predictions that are accurate for well-represented categories but unreliable for underrepresented ones. Careful curation of training data and validation against diverse test sets are essential [12].
Effective AI for science requires deep integration between AI researchers and domain scientists. AI practitioners who lack domain knowledge may build models that optimize the wrong objectives or that miss important physical constraints. Conversely, domain scientists who lack AI expertise may misapply tools or misinterpret results. Building interdisciplinary teams and training researchers who span both AI and specific scientific domains is critical but difficult.
Training and running large AI models requires substantial computational resources. AlphaFold2's training required thousands of GPU hours, and systems like AlphaProof were trained on millions of mathematical problems over weeks. The computational cost of AI for science creates equity concerns: well-funded institutions and wealthy nations can leverage AI for science, while others may be left behind. Open-source models and publicly available predictions (like the AlphaFold Protein Structure Database) partially address this concern but do not eliminate it.
Scientific claims require rigorous validation. AI predictions, no matter how compelling, must be verified experimentally before they can be accepted as scientific knowledge. The GNoME materials discovery project, for example, identified 2.2 million candidate materials, but their scientific value depends on experimental confirmation. As of 2024, over 700 of the predicted materials had been synthesized and confirmed, a meaningful validation but still a small fraction of the total predictions [7].
AI for science has attracted significant institutional investment. Google DeepMind has been the most prolific contributor, producing AlphaFold, GraphCast, GenCast, GNoME, AlphaProof, and other scientific AI systems. In 2025, DeepMind and Google.org announced an AI for Math Initiative to support the development of AI tools for mathematical research [8].
National funding agencies have also invested heavily. The U.S. National Science Foundation (NSF) has funded numerous AI for science initiatives, and the Department of Energy operates national laboratories that integrate AI into physics, materials science, and energy research. The UK's Engineering and Physical Sciences Research Council (EPSRC) and similar agencies in the EU, Japan, and China fund dedicated AI for science programs.
Universities have established interdisciplinary AI for science centers and programs, including the Eric and Wendy Schmidt AI in Science Postdoctoral Fellowship, which places AI-focused postdoctoral researchers in science departments at multiple universities to foster cross-disciplinary collaboration.
As of early 2026, AI for science is in a period of rapid expansion following the transformative recognition of the 2024 Nobel Prizes. Several trends characterize the current landscape.
The success of AlphaFold has established a template: identify a hard scientific prediction problem, frame it as a machine learning task, train a large model on available data, and achieve results that rival or exceed experimental methods. This template is being applied to an expanding set of problems, from predicting the properties of quantum materials to simulating cellular processes to modeling climate dynamics.
However, the field is also confronting the limits of the current approach. Many scientific problems lack the large, well-curated datasets that enabled AlphaFold's success. Protein structures had decades of accumulated experimental data in the Protein Data Bank; other domains may not have equivalent resources. Generating high-quality training data for scientific AI remains a bottleneck.
The role of AI in science is also evolving from prediction toward generation. Rather than just predicting properties of existing systems, AI is increasingly used to design new ones: new proteins, new materials, new drug candidates, new experimental protocols. This generative capability represents a qualitative shift in how AI contributes to science, moving from analysis to creation.
Finally, questions about the long-term relationship between AI and scientific understanding remain open. AlphaFold can predict protein structures with high accuracy, but it does not fully explain the physical principles that determine protein folding. AI may accelerate discovery while leaving the deeper task of understanding to future work. Whether AI will eventually contribute to fundamental scientific understanding, not just empirical prediction, is one of the most profound open questions in the field.