AlphaFold

Updated Apr 26, 2026

AlphaFold is an artificial intelligence system developed by Google DeepMind that predicts the three-dimensional structure of proteins from their amino acid sequences. Since its debut at the CASP13 competition in 2018, AlphaFold has transformed structural biology, making it possible to predict protein shapes with near-experimental accuracy. The work earned Demis Hassabis and John Jumper one half of the 2024 Nobel Prize in Chemistry (the other half went to David Baker), and its predictions have been used by over two million researchers worldwide.

Background

Proteins are molecular machines that carry out virtually every function in living cells, from catalyzing chemical reactions to transporting molecules across membranes. A protein's function is determined by its three-dimensional shape, which in turn is dictated by the sequence of amino acids in the protein chain. Determining these shapes experimentally, through techniques such as X-ray crystallography, cryo-electron microscopy, and nuclear magnetic resonance spectroscopy, is slow and expensive. A single structure determination can take months or years and cost hundreds of thousands of dollars.

The gap between known protein sequences and experimentally determined structures has widened steadily. By the early 2020s, roughly 200 million protein sequences had been cataloged in the UniProt database, but the Protein Data Bank (PDB) contained experimentally solved structures for fewer than 200,000 of them. This disparity made computational protein folding prediction one of the most important open problems in biology.

AlphaFold 1 (CASP13, 2018)

DeepMind entered the protein structure prediction field in 2018 with its first AlphaFold system. The team competed in the 13th Critical Assessment of Techniques for Protein Structure Prediction (CASP13), a biennial blind competition that had been running since 1994 and served as the primary benchmark for the field.

Architecture

AlphaFold 1 used a deep learning approach based on a deep residual neural network (ResNet). Rather than predicting binary contact maps (whether two amino acid residues are close together or not), the system predicted full probability distributions over distances between every pair of residues. This distance prediction approach provided substantially more information than contact maps alone.
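The difference between a binary contact and a full distance distribution can be illustrated with a small sketch. The bin layout, the probabilities, and the 8 angstrom contact threshold below are illustrative assumptions rather than values taken from AlphaFold 1; the point is only that a contact probability is a single number that can be derived from the distribution, while the distribution also supports other summaries such as an expected distance.

```python
def contact_probability(bins, probs, threshold=8.0):
    """Sum the probability mass of all bins that end at or below the threshold."""
    return sum(p for (low, high), p in zip(bins, probs) if high <= threshold)


def expected_distance(bins, probs):
    """Mean distance, using the midpoint of each bin as its representative value."""
    return sum(p * (low + high) / 2.0 for (low, high), p in zip(bins, probs))


# Hypothetical predicted distribution for one residue pair (distances in angstroms).
bins = [(2.0, 4.0), (4.0, 6.0), (6.0, 8.0), (8.0, 10.0), (10.0, 12.0)]
probs = [0.05, 0.15, 0.40, 0.30, 0.10]

p_contact = contact_probability(bins, probs)   # collapses the distribution to one number
mean_dist = expected_distance(bins, probs)     # extra information a contact map cannot give
print(round(p_contact, 2), round(mean_dist, 2))
```

A contact map would keep only the first of these two numbers; the distance distribution retains the shape of the prediction as well.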

The system took as input a multiple sequence alignment (MSA), which compiles related protein sequences found in genetic databases. From the MSA, the neural network extracted co-evolutionary patterns, meaning pairs of positions in the sequence that tend to mutate together, suggesting spatial proximity in the folded structure. The predicted distance distributions were then used to construct a potential of mean force, which could be optimized through gradient descent to generate candidate 3D structures.
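The idea of co-evolving positions can be made concrete with a classical statistic. The sketch below computes the mutual information between two columns of a toy alignment. This is not how AlphaFold 1 works internally (its network learns such patterns from the alignment rather than computing this statistic), and the four-sequence alignment is invented for illustration.

```python
import math
from collections import Counter


def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(msa)
    col_i = [seq[i] for seq in msa]
    col_j = [seq[j] for seq in msa]
    count_i = Counter(col_i)
    count_j = Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))
    total = 0.0
    for (a, b), count in count_ij.items():
        p_ab = count / n
        p_a = count_i[a] / n
        p_b = count_j[b] / n
        total += p_ab * math.log2(p_ab / (p_a * p_b))
    return total


# Toy alignment: columns 0 and 1 always change together, column 3 varies independently.
msa = ["AKLE", "AKLD", "GRLE", "GRLD"]

print(mutual_information(msa, 0, 1))  # covarying pair
print(mutual_information(msa, 0, 3))  # independent pair
```

Columns 0 and 1 share one full bit of information, which is the kind of signal that suggests two positions may be in contact, while columns 0 and 3 share none.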

CASP13 results

AlphaFold 1 placed first in the overall rankings at CASP13 in December 2018. It was particularly effective on the hardest category of targets, known as free modeling (FM) targets, where no homologous template structures existed. AlphaFold produced high-accuracy structures (with template modeling scores of 0.7 or higher) for 24 out of 43 free modeling domains, while the next best method achieved this level of accuracy for only 14 out of 43 domains. The margin of victory was the largest improvement in a single CASP cycle in the competition's history at that time.

Metric	AlphaFold 1	Next best method
FM domains with TM-score >= 0.7	24 / 43	14 / 43
Overall ranking	1st	2nd
Year	2018	2018
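For readers unfamiliar with the template modeling score used in the table above, the following sketch shows the standard TM-score formula. It assumes the model has already been superimposed on the experimental structure and that the per-residue distances are given; the full definition also maximizes over superpositions, which is omitted here. The function name and the length check are illustrative choices.

```python
def tm_score(distances, target_length):
    """TM-score for pre-aligned residue pairs.

    distances: distance in angstroms for each aligned residue pair.
    target_length: number of residues in the experimental (target) structure.
    """
    if target_length < 19:
        # The length-dependent scale d0 is not positive for very short chains
        # in this simple form, so this sketch refuses such inputs.
        raise ValueError("target_length must be at least 19 for this sketch")
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


# A perfect model scores 1.0; a model covering only half the target scores 0.5 at best.
print(tm_score([0.0] * 100, 100))
print(tm_score([0.0] * 50, 100))
```

Because the score is normalized by the target length, it ranges from 0 to 1, and the 0.7 threshold quoted above corresponds to a model that matches the target closely over most of its length.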

AlphaFold 2 (CASP14, 2020)

AlphaFold 2, presented at CASP14 in November-December 2020, represented a fundamental redesign of the system. It achieved near-experimental accuracy on most targets and is widely considered the moment the protein structure prediction problem was effectively solved for single-chain proteins.

Architecture

AlphaFold 2 replaced the ResNet architecture of the first version with a novel transformer-based design built around two main components: the Evoformer module and the Structure module.

Evoformer module. The Evoformer is the core of AlphaFold 2. It operates on two data representations simultaneously:

The MSA representation, which encodes information from the multiple sequence alignment.
The pair representation, which encodes the relationships between every pair of amino acid residues.

The Evoformer consists of 48 blocks (with unshared weights), each of which updates both representations through a series of operations. The MSA representation is processed using axial self-attention, where attention is applied along the rows and columns of the alignment separately. The pair representation is processed using triangular updates and triangular self-attention, operations designed to enforce geometric consistency: if residue A is close to residue B and residue B is close to residue C, then the representation should reflect information about the A-C relationship as well. The two representations also exchange information, with the pair representation conditioning the MSA attention and the MSA representation feeding back into the pair representation.
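The geometric-consistency idea behind the triangular updates can be sketched with a heavily simplified version of the "outgoing edges" multiplicative update. In the real Evoformer the pair representation has many channels and the update uses learned projections, gating, and normalization; the version below uses a single scalar per residue pair and no learned parameters, so it should be read as an illustration of the summation pattern only.

```python
def triangle_update_outgoing(z):
    """Return u[i][j] = sum over k of z[i][k] * z[j][k] for a square pair matrix z.

    Edge (i, j) is updated from the two edges that leave i and j towards a
    shared third residue k, closing the triangle i-j-k.
    """
    n = len(z)
    return [[sum(z[i][k] * z[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Residues A, B, C (indices 0, 1, 2): A-B and B-C are marked close, A-C is not.
pair = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
]
update = triangle_update_outgoing(pair)

print(update[0][2])  # A-C receives signal through the shared neighbour B
print(update[0][1])  # A-B receives none, since A and B share no neighbour here
```

The A-C entry becomes non-zero purely because both A and C are related to B, which is the behaviour the paragraph above describes.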

Structure module. The Structure module takes the final outputs from the Evoformer and converts them into explicit 3D atomic coordinates. It first generates a protein backbone by predicting a rigid-body transformation (rotation and translation) for each residue, then places side-chain atoms using predicted torsion angles. The Structure module contains 8 blocks with shared weights, operating in an iterative fashion.
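The per-residue rigid-body transformation can be illustrated as follows. Each residue carries a rotation and a translation (a "frame"), and backbone atoms are placed by applying that frame to fixed local coordinates. The local coordinates below are approximate, illustrative values with the alpha carbon at the origin, and the example frame is arbitrary; neither is taken from the article.

```python
import math

# Approximate local backbone geometry in angstroms (illustrative values).
LOCAL_BACKBONE = {
    "N": (-0.525, 1.363, 0.0),
    "CA": (0.0, 0.0, 0.0),
    "C": (1.526, 0.0, 0.0),
}


def rotation_about_z(theta):
    """3x3 rotation matrix for a rotation of theta radians about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def apply_frame(rotation, translation, point):
    """Map a local point to global coordinates: rotation @ point + translation."""
    return tuple(
        sum(rotation[row][col] * point[col] for col in range(3)) + translation[row]
        for row in range(3)
    )


def place_backbone(rotation, translation):
    """Place the N, CA, and C atoms of one residue using its frame."""
    return {name: apply_frame(rotation, translation, local)
            for name, local in LOCAL_BACKBONE.items()}


# Example frame: rotate 90 degrees about z, then move 10 angstroms along x.
atoms = place_backbone(rotation_about_z(math.pi / 2), (10.0, 0.0, 0.0))
print({name: tuple(round(v, 3) for v in xyz) for name, xyz in atoms.items()})
```

Because the local geometry is fixed, predicting one frame per residue is enough to determine the whole backbone; side chains are then added from torsion angles as described above.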

Recycling. AlphaFold 2 uses an iterative process called recycling, where the predicted MSA representation, pair representation, and 3D coordinates are fed back into the network for additional rounds of refinement. In practice, three recycling iterations are used during inference, with each pass improving the accuracy of the predicted structure.
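The control flow of recycling can be sketched as a simple loop. The function names are hypothetical, the "model" is a toy stand-in that merely moves an estimate halfway towards a fixed value on each pass, and the sketch assumes that the first pass is not counted as a recycle, so three recycling iterations mean four passes in total.

```python
def run_with_recycling(model_step, inputs, num_recycles=3):
    """Run model_step once, then feed its output back in num_recycles more times."""
    previous = None  # nothing to recycle on the first pass
    for _ in range(num_recycles + 1):
        previous = model_step(inputs, previous)
    return previous


def toy_step(target, previous):
    """Toy stand-in for the network: move halfway from the previous output to the target."""
    start = 0.0 if previous is None else previous
    return start + 0.5 * (target - start)


# Passes produce 4.0, 6.0, 7.0, 7.5: each recycle refines the previous output.
print(run_with_recycling(toy_step, 8.0))
```

The same weights are reused on every pass, which is why recycling adds inference time but no extra parameters.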

Component	Function	Blocks	Weight sharing
Input embedding	Generates initial MSA and pair representations from sequence and MSA	1	N/A
Evoformer	Refines MSA and pair representations through attention and triangular updates	48	No
Structure module	Converts representations to 3D atomic coordinates	8	Yes
Recycling	Feeds output back into the network for iterative refinement	3 iterations	Reuses full network

CASP14 results

At CASP14, AlphaFold 2 achieved a median Global Distance Test (GDT) score of 92.4 across all targets, a score that approached the level of experimental uncertainty in many structural determination methods. A GDT score of 90 or above is generally considered competitive with experimentally determined structures. Approximately two-thirds of the 96 targets reached GDT scores above 90 in backbone accuracy.
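To make the GDT figures easier to interpret, the sketch below computes the commonly used GDT_TS variant: the average, over cutoffs of 1, 2, 4, and 8 angstroms, of the percentage of residues whose predicted position lies within that cutoff of the experimental position. It assumes the per-residue distances come from a single fixed superposition; the official procedure searches for the best superposition at each cutoff, which is omitted here. The example distances are invented.

```python
def gdt_ts(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """GDT_TS on a 0-100 scale from per-residue distances in angstroms."""
    if not distances:
        raise ValueError("at least one residue distance is required")
    n = len(distances)
    fractions = [sum(1 for d in distances if d <= cutoff) / n for cutoff in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


# Five residues: two within 1 A, three within 2 A, four within 4 A and 8 A.
example = [0.5, 0.8, 1.5, 3.0, 9.0]
print(gdt_ts(example))
```

A score above 90 therefore requires nearly every residue to sit within about 1 to 2 angstroms of its experimental position, which is why it is treated as comparable to experimental accuracy.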

The result stunned the structural biology community. John Moult, the co-founder and longtime organizer of CASP, stated that the protein structure prediction problem was, in a practical sense, solved. The performance gap between AlphaFold 2 and the second-place team was enormous, with AlphaFold 2 achieving accuracy far beyond what any other method had managed.

Competition	Year	AlphaFold version	Median GDT score	Result
CASP13	2018	AlphaFold 1	~60 (estimated)	1st place
CASP14	2020	AlphaFold 2	92.4	1st place (near-experimental)

Open-source release

The AlphaFold 2 source code and trained model weights were released publicly on July 15, 2021, alongside the publication of the full research paper in Nature. The code was released under the Apache 2.0 license, allowing both academic and commercial use. This open release enabled researchers worldwide to run AlphaFold 2 predictions on their own hardware and to build on the architecture for new applications.

Subsequent updates expanded the system's capabilities. AlphaFold-Multimer, released in versions 2.1 and 2.2, extended the system to predict the structures of protein complexes containing multiple chains. Version 2.3 further improved accuracy on large multi-chain complexes.

The open-source community also produced independent implementations. OpenFold, developed by the OpenFold Consortium, provided a trainable PyTorch reimplementation of AlphaFold 2, giving researchers the ability to retrain the model on different datasets or for different tasks.

AlphaFold Protein Structure Database

In partnership with the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI), DeepMind launched the AlphaFold Protein Structure Database (AlphaFold DB) in July 2021. The database provides free, open access to AlphaFold's predicted structures.
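Programmatic access to the database can be sketched as below. The endpoint path is an assumption based on the database's public REST interface and should be checked against the current API documentation before use; the helper names are hypothetical, and P69905 (human hemoglobin subunit alpha) is used only as an example accession. No request is made unless the fetch function is called.

```python
import json
import urllib.request

# Assumed endpoint root; verify against the AlphaFold DB API documentation.
API_ROOT = "https://alphafold.ebi.ac.uk/api/prediction/"


def prediction_url(uniprot_accession):
    """Build the metadata URL for one UniProt accession."""
    return API_ROOT + uniprot_accession.strip().upper()


def fetch_prediction_metadata(uniprot_accession, timeout=30):
    """Download and decode the JSON metadata for one accession (network access required)."""
    with urllib.request.urlopen(prediction_url(uniprot_accession), timeout=timeout) as response:
        return json.load(response)


print(prediction_url("p69905"))
```

The returned metadata is expected to include links to the coordinate files, but the exact field names are not assumed here and should be read from the response itself.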

Scale and coverage

The initial release covered the complete proteomes of 21 model organisms, including the human proteome. In July 2022, the database was massively expanded to include predicted structures for over 200 million proteins, covering nearly every protein sequence cataloged in the UniProt database. This expansion represented a roughly 500-fold increase from the initial release and effectively provided a structural prediction for almost every known protein.

As of 2024, the database contained over 214 million entries. In 2025, the database was synchronized with the UniProt 2025_03 release and received a comprehensive redesign of its entry pages, integrating annotations directly with an interactive 3D structure viewer.

Homodimer predictions

In early 2026, the AlphaFold DB expanded to include predictions of protein complexes for the first time. The initial addition consisted of 1.7 million high-confidence homodimer predictions (complexes of two identical protein chains). These homodimer structures are drawn from a broader set of 30 million complex predictions computed by DeepMind and EMBL-EBI.

Database milestone	Date	Entries
Initial launch (21 model organisms)	July 2021	~365,000
Major expansion	July 2022	200+ million
UniProt-aligned update	2024	214+ million
Homodimer predictions added	2026	1.7 million complexes added
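As a quick check on the scale of the 2022 expansion, the rounded entry counts in the table above can be compared directly. Both figures are approximations taken from the table, so the result is only indicative.

```python
initial_entries = 365_000        # approximate count at the July 2021 launch
expanded_entries = 200_000_000   # lower bound quoted for the July 2022 expansion

fold_increase = expanded_entries / initial_entries
print(f"{fold_increase:.0f}-fold")  # 548-fold
```

The ratio comes out at roughly 550, which is the same order as the roughly 500-fold figure quoted under Scale and coverage.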

Usage statistics

The database has been accessed by over two million researchers from more than 190 countries. By November 2025, the AlphaFold 3 paper alone had been cited over 9,000 times, reflecting the broad adoption of AlphaFold across biological sciences.

AlphaFold 3 (2024)

AlphaFold 3 was announced on May 8, 2024, co-developed by Google DeepMind and Isomorphic Labs, a drug discovery company spun out of DeepMind.

Expanded scope

While AlphaFold 2 focused primarily on predicting the structures of individual proteins or protein complexes, AlphaFold 3 broadened the scope to predict structures involving proteins together with other biomolecules. It can model:

Protein-protein complexes
Protein-DNA interactions
Protein-RNA interactions
Protein-ligand (small molecule) binding
Post-translational modifications
Interactions with ions

This expanded capability is particularly important for drug discovery, where understanding how a protein interacts with a small-molecule drug candidate is essential.

Performance improvements

For interactions between proteins and other molecule types, AlphaFold 3 showed at least a 50% improvement over existing prediction methods. In some categories of molecular interaction, accuracy doubled compared to previous approaches.

Architecture changes

AlphaFold 3 introduced a diffusion-based generative approach for the structure module, replacing the direct coordinate prediction used in AlphaFold 2. This diffusion model, similar in concept to those used in image generation systems, generates atomic coordinates by iteratively denoising a random initial configuration. The diffusion approach allows the model to better handle the increased complexity of multi-molecule systems and can represent structural uncertainty more naturally.
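The idea of generating coordinates by iterative denoising can be illustrated with a toy loop. This is not AlphaFold 3's diffusion module: the "denoiser" here is an oracle that already knows the clean coordinates, whereas a real model uses a trained network conditioned on the input sequence and other features. The step size, noise schedule, and target coordinates are all illustrative assumptions.

```python
import random

# Three points spaced 3.8 angstroms apart, standing in for a tiny clean structure.
TARGET = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]


def denoise_coordinates(target, steps=50, seed=0):
    """Start from random coordinates and iteratively denoise towards a clean estimate."""
    rng = random.Random(seed)
    # Random initial configuration.
    coords = [[rng.gauss(0.0, 10.0) for _ in range(3)] for _ in target]
    for step in range(steps):
        # Injected noise shrinks linearly and reaches zero on the final step.
        noise_scale = 1.0 - (step + 1) / steps
        for atom, clean in zip(coords, target):
            for axis in range(3):
                # Oracle denoiser: move part of the way towards the clean value.
                atom[axis] += 0.2 * (clean[axis] - atom[axis])
                # Re-inject a small, decreasing amount of noise.
                atom[axis] += rng.gauss(0.0, 0.1 * noise_scale)
    return coords


final = denoise_coordinates(TARGET)
worst = max(abs(a - b) for atom, clean in zip(final, TARGET) for a, b in zip(atom, clean))
print(worst < 0.1)
```

Running the loop with different seeds yields slightly different trajectories, which hints at how a diffusion model can produce multiple plausible samples rather than a single deterministic answer.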

Release and accessibility

AlphaFold 3 was initially made available through the AlphaFold Server, a free web-based interface that allows researchers to submit prediction jobs without needing local computational resources. The source code and model weights were released in stages: they were made available to the scientific community for non-commercial use in November 2024, and became publicly accessible on GitHub in February 2025, though still under a non-commercial license.

The OpenFold Consortium at Columbia University separately developed OpenFold 3, an independent open-source reimplementation aiming to reproduce AlphaFold 3's results without commercial restrictions.

Nobel Prize in Chemistry 2024

On October 9, 2024, the Royal Swedish Academy of Sciences awarded the Nobel Prize in Chemistry to three scientists for their work on computational protein science. One half of the prize went to David Baker of the University of Washington for computational protein design. The other half was shared by Demis Hassabis and John Jumper of Google DeepMind for protein structure prediction with AlphaFold.

The prize, worth 11 million Swedish kronor (approximately $1 million USD), recognized what the Nobel Committee described as solving a 50-year-old problem in biology. The committee cited Levinthal's paradox and the long history of the protein folding problem as context for the significance of the achievement.

Hassabis, who co-founded DeepMind in 2010 and serves as its CEO, had originally trained as a neuroscientist and game designer. Jumper, a senior research scientist at DeepMind, holds a PhD in chemistry from the University of Chicago and led the technical development of AlphaFold 2.

Nobel Prize in Chemistry 2024
Laureate	Affiliation	Contribution
David Baker	University of Washington	Computational protein design
Demis Hassabis	Google DeepMind / Isomorphic Labs	Protein structure prediction (AlphaFold)
John Jumper	Google DeepMind	Protein structure prediction (AlphaFold)

Impact on biology and drug discovery

Structural biology

AlphaFold has accelerated structural biology research by providing instant structural hypotheses for proteins that previously had no known structure. Researchers use AlphaFold predictions to guide experimental work, design better crystallization constructs, interpret cryo-EM density maps, and identify functional sites on proteins.

The system has proven particularly useful for organisms whose proteins have been studied less extensively. For many bacterial and archaeal species, the AlphaFold database provides the only available structural information for the majority of their proteomes.

Drug discovery

Isomorphic Labs, the drug discovery company founded by Hassabis in 2021, applies AlphaFold and related AI technology to pharmaceutical research. In January 2024, Isomorphic announced partnerships with Eli Lilly and Novartis worth a combined $3 billion in potential milestone payments. The Lilly deal included $45 million upfront with over $1.7 billion in milestones, while the Novartis deal included $37.5 million upfront with $1.2 billion in potential milestones. In February 2025, Novartis expanded this partnership with additional research programs, and Isomorphic raised $600 million in its first financing round in March 2025.

Beyond Isomorphic, the broader pharmaceutical industry has rapidly adopted AI-driven structure prediction. Companies and academic labs use AlphaFold predictions to identify binding sites, screen drug candidates computationally, and design molecules targeting previously "undruggable" proteins.

Other applications

AlphaFold's impact extends beyond traditional drug discovery:

Enzyme engineering: Researchers have used AlphaFold to identify and design enzymes for industrial applications, including enzymes that can decompose plastic.
Agriculture: Predicted structures help researchers understand plant pathogen interactions and design more resistant crop varieties.
Evolutionary biology: The database of predicted structures allows comparative structural analysis across entire evolutionary lineages.
Antibiotic resistance: AlphaFold structures have been used to study the mechanisms by which bacteria develop resistance to antibiotics.

Limitations

Despite its achievements, AlphaFold has notable limitations that researchers must account for:

Static structures. AlphaFold predicts a single static structure for each protein, but real proteins are dynamic molecules that adopt multiple conformations as part of their function. For proteins known to switch between different conformational states, AlphaFold typically predicts only one of these states, usually the one best represented in training data.

Intrinsically disordered regions. Many proteins contain regions that do not fold into stable structures. AlphaFold assigns low confidence scores to these intrinsically disordered regions, which is useful as a diagnostic, but it cannot characterize the conformational ensembles that these regions actually adopt.

Confidence calibration. AlphaFold provides per-residue confidence scores (pLDDT), but these scores do not always track actual accuracy. A high score is not a guarantee of correctness, and low-confidence regions should be interpreted with particular caution.
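In practice, pLDDT values are often inspected per residue. The sketch below assumes two conventions that are not stated in this article: that AlphaFold coordinate files in PDB format carry the pLDDT in the B-factor column, and that scores are grouped into the four bands commonly shown by the database viewer (above 90, 70 to 90, 50 to 70, below 50). The helper names and the three-atom sample are hypothetical.

```python
def plddt_band(score):
    """Map a pLDDT value (0-100) to a coarse confidence band."""
    if score > 90:
        return "very high"
    if score > 70:
        return "confident"
    if score > 50:
        return "low"
    return "very low"


def ca_plddt_from_pdb(pdb_text):
    """Read the B-factor column of alpha-carbon ATOM records, keyed by residue number."""
    scores = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores[int(line[22:26])] = float(line[60:66])
    return scores


def atom_line(serial, name, res_name, chain, res_seq, xyz, b_factor):
    """Format a fixed-column PDB ATOM record with occupancy 1.00 (for the demo only)."""
    x, y, z = xyz
    return "ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f" % (
        serial, name, res_name, chain, res_seq, x, y, z, 1.00, b_factor)


sample = "\n".join([
    atom_line(1, " N  ", "MET", "A", 1, (1.0, 2.0, 3.0), 92.5),
    atom_line(2, " CA ", "MET", "A", 1, (2.0, 2.0, 3.0), 92.5),
    atom_line(3, " CA ", "GLY", "A", 2, (5.0, 2.0, 3.0), 43.1),
])
scores = ca_plddt_from_pdb(sample)
bands = {residue: plddt_band(score) for residue, score in scores.items()}
print(bands)
```

Long stretches in the lowest band frequently coincide with the intrinsically disordered regions described above, so such a summary is best used as a flag for closer inspection rather than as a measure of error.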

Novel folds. AlphaFold's accuracy depends on the availability of homologous sequences in the MSA. For proteins with very few known relatives ("orphan proteins") or genuinely novel folds not represented in training data, prediction accuracy can decrease.

Ligand and cofactor effects. While AlphaFold 3 can predict protein-ligand interactions, the accuracy of these predictions remains lower than for protein-only structures. Predicting how a protein changes shape upon binding a ligand remains a particular challenge.

Competition and alternatives

AlphaFold's success spurred the development of several alternative protein structure prediction tools:

Tool	Developer	Key features
RoseTTAFold	David Baker lab, University of Washington	Three-track neural network; open-source; extended to model protein-DNA-RNA complexes
ESMFold	Meta AI	Single-sequence input (no MSA needed); 60x faster than AlphaFold 2 for short sequences; 15 billion parameter protein language model
OpenFold	OpenFold Consortium	Trainable open-source reimplementation of AlphaFold 2 in PyTorch
OmegaFold	HeliXon	Single-sequence prediction using protein language models
Boltz-2	MIT / Recursion	Co-folds protein-ligand pairs and predicts binding affinity; announced June 2025
Pearl	Genesis Molecular AI	Interactive model allowing user-guided predictions; aimed at drug discovery

RoseTTAFold, developed by David Baker's laboratory at the University of Washington, uses a three-track architecture that simultaneously processes one-dimensional sequence information, two-dimensional distance maps, and three-dimensional coordinates. ESMFold, developed by Meta AI, takes a different approach by using a large protein language model that eliminates the need for MSA computation entirely, making it dramatically faster for large-scale applications. Meta's ESM Metagenomic Atlas used ESMFold to predict structures for 772 million metagenomic protein sequences.

Ensemble approaches have also emerged. FiveFold, for example, combines predictions from AlphaFold 2, RoseTTAFold, OmegaFold, ESMFold, and EMBER3D to improve accuracy and better capture conformational diversity.

CASP after AlphaFold

AlphaFold's dominance reshaped the CASP competition. DeepMind did not enter CASP15 (2022) or CASP16 (2024) as a competitor, but virtually all top-performing teams in both competitions used AlphaFold or modifications of it as the basis for their predictions.

CASP15 (2022) showed substantial progress in modeling protein complexes, a challenge that goes beyond single-chain structure prediction. CASP16 (2024), held in Punta Cana, Dominican Republic, tested 59 protein targets with 85 domains. The competition revealed that while AlphaFold-based methods dominated overall, they still struggled with certain types of targets, particularly antibody-antigen complexes. The team led by Sandor Vajda and Dima Kozakov won the protein complexes category by combining AlphaFold predictions with the ClusPro docking server, substantially outperforming teams that relied on AlphaFold alone.

The CASP competition itself faced an existential challenge in 2025 when the U.S. National Institutes of Health (NIH) declined to renew its long-running funding grant due to budget cuts. Google DeepMind stepped in with interim funding to keep the competition operational.

Current state (2025-2026)

As of early 2026, the AlphaFold ecosystem continues to expand:

The AlphaFold Protein Structure Database has been updated to include homodimer complex predictions and synchronized with the latest UniProt release.
Isomorphic Labs' partnerships with Eli Lilly and Novartis have advanced from target identification to the generation of preclinical drug candidates, with Phase I clinical trials for AI-designed molecules projected for late 2026.
The research focus in the field has shifted from static structure prediction (which is largely considered solved for most proteins) toward protein dynamics: predicting how proteins move, change shape, and sample multiple conformational states.
New tools like Boltz-2 and Pearl are pushing the boundaries of protein-ligand interaction prediction, directly targeting the needs of drug developers.
Open-source alternatives, including OpenFold 3, continue to provide the research community with unrestricted implementations of AlphaFold-class technology.

The challenge of predicting protein dynamics, conformational ensembles for intrinsically disordered proteins, and the effects of mutations on protein stability and function represents the next frontier for the field that AlphaFold opened.

References

Senior, A. W., et al. "Improved protein structure prediction using potentials from deep learning." *Nature* 577, 706-710 (2020). https://www.nature.com/articles/s41586-019-1923-7
Jumper, J., et al. "Highly accurate protein structure prediction with AlphaFold." *Nature* 596, 583-589 (2021). https://www.nature.com/articles/s41586-021-03819-2
AlQuraishi, M. "AlphaFold at CASP13." *Bioinformatics* 35(22), 4862-4865 (2019). https://academic.oup.com/bioinformatics/article/35/22/4862/5497249
Varadi, M., et al. "AlphaFold Protein Structure Database in 2024: providing structure coverage for over 214 million protein sequences." *Nucleic Acids Research* 52(D1), D439-D444 (2024). https://pubmed.ncbi.nlm.nih.gov/37933859/
Abramson, J., et al. "Accurate structure prediction of biomolecular interactions with AlphaFold 3." *Nature* 630, 493-500 (2024). https://blog.google/technology/ai/google-deepmind-isomorphic-alphafold-3-ai-model/
"The Nobel Prize in Chemistry 2024." Royal Swedish Academy of Sciences. https://www.nobelprize.org/prizes/chemistry/2024/press-release/
AlphaFold Protein Structure Database. EMBL-EBI. https://alphafold.ebi.ac.uk/
"AlphaFold database hits 'next level': the AI system now includes protein pairing." *Nature* (2026). https://www.nature.com/articles/d41586-026-00787-3
"AlphaFold Protein Structure Database 2025: a redesigned interface and updated structural coverage." *Nucleic Acids Research* (2025). https://academic.oup.com/nar/advance-article/doi/10.1093/nar/gkaf1226/8340156
"Isomorphic Labs kicks off 2024 with two pharmaceutical collaborations." Isomorphic Labs (2024). https://www.isomorphiclabs.com/articles/isomorphic-labs-kicks-off-2024-with-two-pharmaceutical-collaborations
"Four years after AlphaFold's AI 'solved' protein structure, a fierce competition lives on." STAT News (2025). https://www.statnews.com/2025/01/07/casp-protein-structure-prediction-competition-after-alphafold/
"Demis Hassabis & John Jumper awarded Nobel Prize in Chemistry." Google DeepMind (2024). https://deepmind.google/blog/demis-hassabis-john-jumper-awarded-nobel-prize-in-chemistry/
"EMBL-EBI and Google DeepMind renew partnership and release update to AlphaFold Database." EMBL (2025). https://www.embl.org/news/science-technology/google-deepmind-partnership-renewal/
AlphaFold 2 source code. GitHub. https://github.com/google-deepmind/alphafold
AlphaFold 3 source code. GitHub. https://github.com/google-deepmind/alphafold3
"Isomorphic Labs has grand ambitions to 'solve all diseases' with AI." Fortune (2025). https://fortune.com/2025/07/06/deepmind-isomorphic-labs-cure-all-diseases-ai-now-first-human-trials/
