Protein folding is the physical process by which a polypeptide chain, a linear sequence of amino acids, acquires its functional three-dimensional structure. This process is one of the central phenomena in molecular biology, since a protein's shape determines its function. The question of how and why proteins fold into specific shapes, known as the protein folding problem, has been one of the grand challenges in science for over half a century. Beginning with Christian Anfinsen's thermodynamic hypothesis in the 1960s and Cyrus Levinthal's paradox in 1969, the field has progressed through decades of experimental and computational work to the artificial intelligence breakthroughs of the 2020s, when systems like AlphaFold demonstrated that the structure of most proteins could be predicted from sequence alone with near-experimental accuracy.
Proteins are polymers of amino acids, linked together by peptide bonds into chains that range from a few dozen to several thousand residues in length. In solution, these chains fold spontaneously into compact three-dimensional structures. The protein folding problem encompasses several related questions: What physical forces drive folding? How does the amino acid sequence encode the final structure? Can we predict the structure from the sequence? And can we understand the pathway and kinetics of the folding process itself?
In 1961, Christian Anfinsen and colleagues at the National Institutes of Health demonstrated that the enzyme ribonuclease A (RNase A) could be fully unfolded (denatured) and would then refold spontaneously into its original, functional shape without any external guidance. This experiment established what became known as Anfinsen's dogma, or the thermodynamic hypothesis: the native (functional) structure of a protein corresponds to the global minimum of its free energy, and all the information needed for a protein to reach this state is contained in its amino acid sequence.
Anfinsen shared the 1972 Nobel Prize in Chemistry for this work. The thermodynamic hypothesis implied that protein structure prediction should, in principle, be solvable: if the sequence contains all the necessary information, then a sufficiently powerful computational method should be able to determine the structure from the sequence alone.
In 1969, Cyrus Levinthal, a molecular biologist at MIT and later Columbia University, pointed out a fundamental puzzle. Each amino acid residue in a protein chain has multiple degrees of rotational freedom around its backbone bonds. For a typical protein of 100 residues, the number of possible conformations is astronomically large, estimated at roughly 10^300 in some calculations. If the protein were to sample conformations randomly at a rate of one per picosecond (10^-12 seconds), it would take longer than the age of the known universe to find the lowest-energy state by brute force.
Yet small proteins actually fold in milliseconds or even microseconds. This discrepancy, known as Levinthal's paradox, demonstrated that protein folding cannot be a random search through conformational space. Instead, the process must follow some kind of directed pathway or funnel.
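The arithmetic behind the paradox is easy to reproduce. The sketch below uses one common set of assumptions (about 3 stable states per backbone dihedral, 2 dihedrals per residue), which yields roughly 10^95 conformations rather than the larger 10^300 figure some formulations cite; the conclusion is the same either way.

```python
# Back-of-the-envelope version of Levinthal's argument. The specific numbers
# (3 states per dihedral, 2 dihedrals per residue, 1 conformation per ps)
# are illustrative assumptions, not measured quantities.
residues = 100
states_per_dihedral = 3
dihedrals_per_residue = 2

conformations = states_per_dihedral ** (dihedrals_per_residue * residues)

sampling_rate_per_s = 1e12          # one conformation per picosecond
age_of_universe_s = 4.35e17         # ~13.8 billion years

search_time_s = conformations / sampling_rate_per_s
print(f"conformations: ~{conformations:.2e}")
print(f"random search: ~{search_time_s / age_of_universe_s:.1e} universe ages")
```

Even with these conservative assumptions, an exhaustive search would take on the order of 10^65 times the age of the universe.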
Levinthal himself suggested that folding might be guided by the rapid formation of local structural elements (such as alpha helices and beta turns), which then constrain the search for the global structure. This idea prefigured the modern energy landscape or "folding funnel" view of the process.
The resolution of Levinthal's paradox came through the energy landscape theory, developed primarily by Joseph Bryngelson, Peter Wolynes, and others during the 1980s and 1990s. In this framework, the protein's energy landscape is visualized as a funnel: a high-dimensional surface with many local minima but an overall downhill slope toward the native state. The funnel shape means that from almost any starting conformation, the chain can find energetically favorable local contacts that guide it progressively toward the folded structure, without needing to sample every possible conformation.
The funnel is not smooth. It contains roughness (local energy barriers and traps) that can slow folding or lead to misfolded intermediates. In some cases, proteins become kinetically trapped in non-native states, which can lead to aggregation and is associated with diseases such as Alzheimer's, Parkinson's, and prion diseases.
| Concept | Year | Key figure(s) | Significance |
|---|---|---|---|
| Anfinsen's dogma | 1961 | Christian Anfinsen | Structure is encoded in sequence; folding is thermodynamically driven |
| Levinthal's paradox | 1969 | Cyrus Levinthal | Random search is impossibly slow; folding must be guided |
| Energy landscape / folding funnel | 1980s-1990s | Bryngelson, Wolynes, Onuchic, Dill | Folding proceeds down an energy funnel, not through random search |
Understanding protein folding has depended on experimental techniques that can determine either the final folded structure or the dynamics of the folding process.
Three main experimental techniques have been used to determine protein structures at atomic resolution:
| Method | Principle | Resolution | Limitations |
|---|---|---|---|
| X-ray crystallography | Diffraction of X-rays by protein crystals | < 2 Angstroms typical | Requires crystallization; captures single conformation |
| Nuclear magnetic resonance (NMR) spectroscopy | Magnetic properties of atomic nuclei in solution | 2-3 Angstroms | Limited to smaller proteins (typically < 40 kDa) |
| Cryo-electron microscopy (cryo-EM) | Imaging frozen protein samples with electron beams | 2-4 Angstroms (recent) | Historically lower resolution; now competitive for large complexes |
The Protein Data Bank (PDB), established in 1971, serves as the central repository for experimentally determined structures. By early 2025, the PDB contained over 220,000 structures, though this represents a small fraction of all known protein sequences.
Complementary techniques probe the folding process itself in real time: stopped-flow mixing initiates folding within milliseconds, hydrogen-deuterium exchange maps which regions of the chain become protected (folded) first, and single-molecule fluorescence methods such as FRET follow individual molecules as they fold.
Computational protein structure prediction has a long history, with methods evolving from simple statistical approaches to today's AI systems.
The first computational studies of protein folding used simplified models. In the 1970s and 1980s, researchers developed lattice models that reduced the protein to a chain on a grid, allowing enumeration of conformations for short sequences. These models helped establish theoretical concepts such as the folding funnel but could not make practical structure predictions for real proteins.
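The combinatorics those lattice studies faced can be seen directly: even on a 2D square grid, the number of self-avoiding chain conformations grows roughly exponentially with chain length. A minimal enumerator (illustrative only, not any published model):

```python
# Count self-avoiding conformations of an n-residue chain on a 2D square
# lattice, starting from the origin (no symmetry reduction). This is the
# kind of exhaustive enumeration early lattice-model studies performed
# for short chains; it becomes infeasible quickly as n grows.
def count_conformations(n):
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]

    def extend(path, occupied):
        if len(path) == n:
            return 1
        total = 0
        x, y = path[-1]
        for dx, dy in moves:
            nxt = (x + dx, y + dy)
            if nxt not in occupied:          # self-avoidance constraint
                total += extend(path + [nxt], occupied | {nxt})
        return total

    return extend([(0, 0)], {(0, 0)})

for n in range(2, 8):
    print(n, count_conformations(n))
```

The counts (4, 12, 36, 100, 284, 780 for chains of 2 to 7 residues) multiply by roughly a factor of three per added residue, a miniature version of the explosion behind Levinthal's paradox.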
Molecular dynamics (MD) simulations, which compute the trajectory of every atom in a protein by numerically integrating Newton's equations of motion, were first applied to proteins in the late 1970s. Martin Karplus, Michael Levitt, and Arieh Warshel received the 2013 Nobel Prize in Chemistry for developing multiscale models for complex chemical systems, including early protein simulations. However, the computational cost of MD is enormous: simulating the microsecond-to-millisecond timescales required for protein folding was not feasible with the hardware available at that time.
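The core of an MD integrator is compact; the cost comes from evaluating forces for every atom at femtosecond time steps over enormous numbers of steps. A sketch of the standard velocity Verlet update, applied here to a toy 1D harmonic "bond" rather than a real force field:

```python
# Velocity Verlet, the integration scheme used by most MD packages, shown
# on a toy 1D harmonic oscillator. Real MD computes forces from a full
# force field over all atoms; only the update scheme is illustrated here.
def velocity_verlet(x, v, force, mass, dt, steps):
    f = force(x)
    for _ in range(steps):
        v_half = v + 0.5 * dt * f / mass      # half-step velocity update
        x = x + dt * v_half                   # full-step position update
        f = force(x)                          # force at the new position
        v = v_half + 0.5 * dt * f / mass      # complete the velocity step
    return x, v

k = 1.0
spring = lambda pos: -k * pos                 # Hooke's-law "bond"
x, v = velocity_verlet(x=1.0, v=0.0, force=spring, mass=1.0, dt=0.01, steps=10_000)
energy = 0.5 * v * v + 0.5 * k * x * x        # should stay near the initial 0.5
print(f"x={x:.3f} v={v:.3f} E={energy:.4f}")
```

Verlet-family integrators are used because they conserve energy well over long trajectories; the sketch's total energy stays close to its initial value even after 10,000 steps.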
Homology modeling (also called comparative modeling) exploits the observation that proteins with similar amino acid sequences tend to adopt similar three-dimensional structures. If a target protein shares significant sequence similarity (typically above 30%) with a protein whose structure has already been experimentally determined, the known structure can serve as a template for building a model of the target.
Homology modeling became the most reliable computational method for structure prediction from the 1990s onward. Programs such as MODELLER (developed by Andrej Sali) and SWISS-MODEL automated much of the process. The method's accuracy depends heavily on the degree of sequence similarity between target and template: at high similarity (>50%), models are often reliable enough to guide experimental design; at lower similarity (<30%), model accuracy degrades substantially.
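The identity thresholds above refer to the fraction of matching residues in a pairwise alignment. A toy calculation over a pre-aligned pair (the sequences below are invented for illustration, not real proteins):

```python
# Percent identity over two pre-aligned (equal-length, gapped) sequences,
# the quantity homology modeling uses to judge template suitability.
def percent_identity(aln_a, aln_b):
    assert len(aln_a) == len(aln_b), "sequences must be aligned to equal length"
    # Skip columns that are gaps in both sequences.
    cols = [(a, b) for a, b in zip(aln_a, aln_b) if not (a == "-" and b == "-")]
    matches = sum(a == b for a, b in cols)
    return 100.0 * matches / len(cols)

target   = "MKTAYIAKQR-QISFVKSHFSRQ"   # made-up target sequence
template = "MKSAYIGKQRLQVSFVKSHFSRQ"   # made-up template sequence
pid = percent_identity(target, template)
print(f"{pid:.1f}% identity")
```

A pair like this, well above the 50% threshold, would make the template a reliable scaffold; real pipelines compute the alignment itself with tools such as BLAST or HMM-based methods before scoring it.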
Threading methods, also known as fold recognition, address cases where the target protein has no detectable sequence homolog with a known structure but might still adopt a fold similar to one in the structural database. Threading algorithms evaluate how well a target sequence fits into each known structural fold by scoring sequence-structure compatibility. Tools such as RAPTOR and I-TASSER (developed by Yang Zhang's group, whose servers ranked at or near the top in several CASP assessments) became leaders in this category.
For proteins with no detectable relationship to any known structure, ab initio (or de novo) methods attempt to predict the structure from physical principles alone. Rosetta, developed by David Baker's laboratory at the University of Washington beginning in the late 1990s, became the dominant approach in this category. Rosetta uses a fragment assembly strategy: it breaks the target sequence into short overlapping segments, matches each segment to known structural fragments from the PDB, and then assembles complete structures by combining fragments and optimizing a physics-based energy function.
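The assembly loop can be caricatured in a few lines. The sketch below is a greedy variant on a 1D chain of torsion values with a made-up energy function and a random "fragment library"; the actual Rosetta protocol uses Monte Carlo moves with simulated annealing and a far richer, physics-based energy model.

```python
# Toy fragment-assembly loop: replace 3-residue windows of a torsion-angle
# chain with fragments from a random, hypothetical library, keeping moves
# that lower a made-up energy. Illustration only; not the Rosetta protocol.
import random

random.seed(0)
L, FRAG = 30, 3
# Hypothetical fragment library: 50 candidate 3-residue torsion fragments.
library = [[random.uniform(-180, 180) for _ in range(FRAG)] for _ in range(50)]

def energy(torsions):
    # Fake smooth energy with a minimum at -60 degrees (helix-like bias).
    return sum((t + 60.0) ** 2 for t in torsions)

torsions = [random.uniform(-180, 180) for _ in range(L)]  # extended random start
e0 = e = energy(torsions)
for _ in range(20_000):
    pos = random.randrange(L - FRAG + 1)
    trial = torsions[:pos] + random.choice(library) + torsions[pos + FRAG:]
    if (e_trial := energy(trial)) < e:                    # greedy acceptance
        torsions, e = trial, e_trial
print(f"energy: {e0:.0f} -> {e:.0f}")
```

Because fragments come from real structures in the actual method, each insertion proposes locally plausible geometry, which is what lets the search converge without sampling the full conformational space.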
Rosetta and its community-driven variant, Rosetta@home (using distributed computing from volunteers), achieved some of the best results in CASP competitions prior to the deep learning era. Baker's broader work on computational protein design, which uses Rosetta in reverse (designing sequences that fold into desired structures), earned him a share of the 2024 Nobel Prize in Chemistry.
| Method | Era | Principle | Strengths | Weaknesses |
|---|---|---|---|---|
| Molecular dynamics | 1970s-present | Physics-based simulation of atomic motion | Captures dynamics; no templates needed | Computationally expensive; limited timescale |
| Homology modeling | 1990s-present | Copy structure from sequence-similar proteins | High accuracy when good templates exist | Fails when no homologs available |
| Threading | 1990s-present | Fit sequence into known structural folds | Works without sequence similarity | Limited to known folds |
| Fragment assembly (Rosetta) | Late 1990s-present | Assemble structures from small structural fragments | No templates needed; physics-based | Computationally demanding; lower accuracy for large proteins |
A landmark in MD simulation came from D. E. Shaw Research, which built a specialized supercomputer called Anton specifically for molecular dynamics. In 2011, the Anton machine simulated the complete folding of several small proteins (up to about 80 amino acids) on millisecond timescales, the first time this had been achieved with all-atom physics-based simulation. While this confirmed that MD force fields could, in principle, fold small proteins, the computational cost remained prohibitive for routine structure prediction of larger proteins.
Modern GPU hardware has made MD more accessible, allowing researchers to run simulations locally at modest cost. However, MD simulations remain primarily a tool for studying protein dynamics and folding mechanisms rather than a practical method for large-scale structure prediction.
The Critical Assessment of Techniques for Protein Structure Prediction (CASP) is a biennial competition that has served as the primary benchmark for the protein structure prediction field since 1994.
CASP was founded by John Moult, a molecular biophysicist at the University of Maryland, with the first competition (CASP1) held in 1994. The format is a blind test: organizers collect protein structures that have been experimentally determined but not yet publicly released, and prediction groups submit their models before the experimental structures are made available. This design prevents any method from being tuned to known answers.
Predictions are scored using the Global Distance Test (GDT), which measures how well the predicted structure superimposes on the experimental structure. A GDT score of 100 means perfect agreement; a score above 90 is generally considered competitive with experimental methods.
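In simplified form, GDT_TS averages, over distance cutoffs of 1, 2, 4, and 8 Angstroms, the fraction of residues whose predicted C-alpha atom lies within the cutoff of its experimental position. The sketch below assumes the two structures are already optimally superimposed (the real score searches over many superpositions); the coordinates are invented for illustration.

```python
# Simplified GDT_TS on already-superimposed C-alpha coordinates. The real
# GDT algorithm searches over many superpositions and keeps the best score;
# only the averaging over distance cutoffs is shown here.
import math

def gdt_ts(pred, exper, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """pred, exper: equal-length lists of (x, y, z) C-alpha coordinates."""
    dists = [math.dist(p, e) for p, e in zip(pred, exper)]
    # Fraction of residues within each cutoff, averaged across cutoffs.
    fractions = [sum(d <= c for d in dists) / len(dists) for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)

# Four residues displaced by 0.5, 1.5, 3.0, and 9.0 Angstroms respectively.
pred  = [(0, 0, 0), (3.8, 0, 0), (7.6, 0, 0), (11.4, 0, 0)]
exper = [(0.5, 0, 0), (3.8, 1.5, 0), (7.6, 0, 3.0), (11.4, 9.0, 0)]
print(f"GDT_TS = {gdt_ts(pred, exper):.1f}")
```

Here the per-cutoff fractions are 1/4, 2/4, 3/4, and 3/4 (the last residue, 9.0 Angstroms off, misses even the 8 Angstrom cutoff), giving a GDT_TS of 56.25.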
Progress in CASP was slow for the first two decades. At CASP1 in 1994, the best median GDT score was around 47. By CASP5 in 2002, scores had climbed to about 60 for template-based predictions, but free modeling (predicting structures without templates) remained poor, with scores often around 40 for the hardest targets. From roughly 2002 to 2016, progress largely stagnated, particularly for the hardest targets.
| CASP round | Year | Notable development | Best approximate GDT (FM targets) |
|---|---|---|---|
| CASP1 | 1994 | Competition founded | ~47 |
| CASP5 | 2002 | Template methods plateau | ~60 (TBM) |
| CASP12 | 2016 | Pre-deep learning plateau | ~40 (FM) |
| CASP13 | 2018 | AlphaFold 1 wins | ~60 (FM) |
| CASP14 | 2020 | AlphaFold 2 achieves 92.4 median GDT | 92.4 (overall) |
| CASP15 | 2022 | Focus shifts to complexes | AlphaFold-derived methods dominate |
| CASP16 | 2024 | Protein-ligand and antibody-antigen challenges | Specialized methods outperform AlphaFold on certain targets |
The CASP13 competition in December 2018 was the first in which deep learning methods made a dramatic impact. DeepMind's AlphaFold 1 won the competition decisively, predicting accurate structures for 24 out of 43 free modeling domains (with TM-scores above 0.7), compared to 14 for the runner-up. The improvement over the previous best results was the largest single-cycle advance in CASP history.
AlphaFold 2 at CASP14 achieved a median GDT of 92.4, a score so high that the protein structure prediction problem was widely declared effectively solved for single-chain proteins. Guinness World Records recognized AlphaFold 2's CASP14 score as the highest ever achieved in the competition.
DeepMind did not enter CASP15 (2022) or CASP16 (2024) as a competitor. Both competitions saw virtually all top teams using AlphaFold or AlphaFold-derived methods as their starting point. CASP15 showed strong progress in modeling protein complexes. CASP16 expanded the scope to include protein-ligand interactions and revealed that antibody-antigen complexes remain a challenging frontier where specialized methods can outperform general-purpose AlphaFold predictions.
CASP received long-running funding from the U.S. National Institutes of Health. In 2025, the NIH declined to renew the grant due to budget cuts, putting the competition's future in jeopardy. Google DeepMind provided interim funding to maintain operations.
Starting around 2018, machine learning methods, particularly deep neural networks, transformed protein structure prediction from a slowly improving field into one experiencing rapid, discontinuous advances.
AlphaFold, developed by Google DeepMind, is the system most closely associated with this revolution. AlphaFold 1 (2018) used deep residual networks to predict inter-residue distance distributions. AlphaFold 2 (2020) introduced the Evoformer architecture, a transformer-based system that jointly processes multiple sequence alignments (MSAs) and pairwise residue representations through 48 layers of attention and triangular updates. AlphaFold 2's near-experimental accuracy at CASP14 marked the field's watershed moment. AlphaFold 3 (May 2024) extended the approach to predict structures of protein complexes with DNA, RNA, ligands, and ions, using a diffusion-based structure generation module.
The AlphaFold Protein Structure Database, created in partnership with EMBL-EBI, provides free access to over 214 million predicted structures covering nearly all known protein sequences. Demis Hassabis and John Jumper received the 2024 Nobel Prize in Chemistry for this work.
RoseTTAFold, developed by David Baker's laboratory at the University of Washington and published in July 2021 (the same month as AlphaFold 2's full paper), uses a three-track neural network architecture that simultaneously processes:

- a one-dimensional track carrying sequence and MSA information,
- a two-dimensional track carrying residue-pair distances and orientations, and
- a three-dimensional track carrying atomic coordinates.
The three tracks exchange information at every layer, allowing each representation to inform the others. RoseTTAFold achieves accuracy close to AlphaFold 2 on many targets, though with slightly lower performance on the hardest cases. It has been extended to model multi-chain complexes and protein interactions with DNA and RNA. The system is fully open-source and has been widely adopted in the research community.
ESMFold, developed by Meta AI Research and published in 2022, takes a fundamentally different approach from both AlphaFold and RoseTTAFold. Instead of relying on multiple sequence alignments (MSAs), ESMFold uses a protein language model called ESM-2, with up to 15 billion parameters, that has been trained on millions of protein sequences. The language model learns evolutionary information implicitly from the patterns in protein sequence data, without needing to explicitly construct an MSA for each prediction.
This single-sequence approach offers a major speed advantage. ESMFold is approximately 60 times faster than AlphaFold 2 for short protein sequences. While its accuracy is somewhat lower than AlphaFold 2 when AlphaFold has access to rich MSAs, ESMFold performs comparably or better when the MSA is sparse (as is the case for many metagenomic proteins with few known homologs).
Meta used ESMFold to create the ESM Metagenomic Atlas, predicting structures for 772 million metagenomic protein sequences discovered from environmental DNA sampling, many of which had no close relatives in existing databases.
| AI method | Developer | Year | Key innovation | MSA required? |
|---|---|---|---|---|
| AlphaFold 2 | Google DeepMind | 2020 | Evoformer + structure module | Yes |
| RoseTTAFold | Baker Lab (UW) | 2021 | Three-track neural network | Yes |
| ESMFold | Meta AI | 2022 | Protein language model (no MSA) | No |
| AlphaFold 3 | DeepMind / Isomorphic Labs | 2024 | Diffusion-based; multi-molecule | Yes |
| OmegaFold | HeliXon | 2022 | Protein language model (no MSA) | No |
OmegaFold (HeliXon, 2022) is another single-sequence prediction method using protein language models. Boltz-2, a collaboration between MIT and Recursion announced in June 2025, predicts protein-ligand complex structures and binding affinities. Pearl, from Genesis Molecular AI, is an interactive prediction model designed for drug discovery workflows. FiveFold combines predictions from multiple methods (AlphaFold 2, RoseTTAFold, OmegaFold, ESMFold, and EMBER3D) into consensus ensemble predictions.
The ability to predict protein structures computationally has had far-reaching effects on pharmaceutical research and biological understanding.
Knowing the three-dimensional structure of a drug target protein is fundamental to rational drug design. Before AlphaFold, obtaining these structures required expensive and time-consuming experimental work. Computational prediction now provides structural hypotheses almost instantly, allowing researchers to:

- identify and characterize candidate binding pockets on a target,
- screen compound libraries virtually against the predicted structure, and
- pursue targets for which no experimental structure is available.
Isomorphic Labs, founded by Demis Hassabis in 2021, applies AlphaFold and related AI technology directly to drug discovery. The company signed partnerships with Eli Lilly and Novartis in January 2024, worth a combined $3 billion in potential milestone payments, and is advancing preclinical candidates toward human trials projected for late 2026.
Protein misfolding is directly implicated in numerous diseases:
| Disease | Misfolded protein | Mechanism |
|---|---|---|
| Alzheimer's disease | Amyloid-beta, tau | Aggregation into plaques and tangles |
| Parkinson's disease | Alpha-synuclein | Formation of Lewy bodies |
| Prion diseases (CJD, BSE) | Prion protein (PrP) | Infectious misfolding cascade |
| Cystic fibrosis | CFTR | Misfolding prevents membrane trafficking |
| Sickle cell disease | Hemoglobin | Polymerization of deoxygenated hemoglobin |
| Type 2 diabetes | IAPP (amylin) | Amyloid formation in pancreatic islets |
AI-predicted structures help researchers understand how these proteins misfold and aggregate, and can guide the design of therapeutic interventions.
Predicted protein structures are also used in enzyme engineering for industrial applications. Researchers have used structural predictions to design enzymes that break down plastics (PET-degrading enzymes), produce biofuels, and catalyze chemical reactions that would otherwise require harsh conditions.
Despite the remarkable progress in structure prediction, several major challenges remain unsolved.
AlphaFold and similar tools predict a single static structure for each protein. Real proteins, however, are dynamic molecules that constantly fluctuate between multiple conformational states. Many proteins function precisely because they can switch between different shapes: enzymes open and close around substrates, signaling proteins toggle between active and inactive conformations, and transporters cycle through multiple states to move molecules across membranes.
Capturing these dynamics computationally remains an open problem. As of 2025, the field has identified protein dynamics and conformational ensemble prediction as the next major frontier. Molecular dynamics simulations can model dynamics but remain computationally expensive. New machine learning approaches are being developed to predict conformational ensembles, but no method yet matches the accuracy that AlphaFold achieved for static structures.
An estimated 30-50% of eukaryotic proteins contain significant intrinsically disordered regions (IDRs), sequences that do not fold into stable three-dimensional structures under physiological conditions. These regions are not random; they adopt defined statistical ensembles of conformations and often participate in binding interactions, signaling, and regulation.
AlphaFold correctly assigns low confidence to these regions (indicating they are not well-predicted), but it cannot characterize the conformational ensembles that IDRs actually adopt. Developing methods to describe IDR ensembles accurately remains an active area of research. Standard MD force fields, which were parameterized to stabilize folded structures, often predict overly compact conformations for IDRs, requiring specialized corrections.
While AlphaFold-Multimer and AlphaFold 3 can predict the structures of specific protein complexes, predicting which proteins interact with each other in the cell (the "interactome") and the structures of large, transient, or heterogeneous complexes remains difficult. Many biologically important complexes involve dozens of protein chains, membrane environments, or post-translational modifications that current methods handle poorly.
AlphaFold 3 can predict protein-ligand interactions, but accuracy remains lower than for protein-only structures. Predicting how a protein changes conformation upon ligand binding (induced fit), and how binding at one site affects distant sites on the same protein (allostery), are particular challenges.
The inverse problem of protein folding, designing amino acid sequences that fold into desired structures, has also seen rapid progress. David Baker's laboratory and others have used deep learning to design proteins with novel functions, including new enzymes, vaccines, and biosensors. However, the success rate of computational designs in the laboratory remains variable, and designing proteins with complex functions (such as multispecific binding or precise catalytic activity) is still difficult.
Proteins embedded in cell membranes (membrane proteins) present special challenges for both experimental structure determination and computational prediction. They require a lipid bilayer or detergent environment for stability, which complicates experimental work. Computational predictions for membrane proteins are generally less accurate than for soluble proteins, although AlphaFold has shown improvement in this area.
The protein folding field in 2025-2026 is characterized by several trends:
Static structure prediction is largely solved. For most single-chain proteins with reasonable MSA coverage, AlphaFold 2 and AlphaFold 3 produce predictions competitive with experimental methods. The research focus has shifted to harder problems: dynamics, complexes, and design.
Drug discovery integration is accelerating. AI-driven structure prediction is being integrated into pharmaceutical pipelines at both large companies and startups. Isomorphic Labs' partnerships with major pharmaceutical companies have advanced to the preclinical candidate stage. New tools like Boltz-2 combine structure prediction with binding affinity estimation, directly targeting drug discovery needs.
The competition landscape continues to evolve. CASP16 (2024) showed that while AlphaFold-derived methods dominate, specialized approaches can outperform them on specific target types, such as antibody-antigen complexes. The post-AlphaFold era of CASP focuses increasingly on multi-molecule complexes and protein-ligand interactions.
Open-source development is thriving. OpenFold, OpenFold 3, RoseTTAFold, and ESMFold provide the community with unrestricted implementations of state-of-the-art methods. This open ecosystem allows academic researchers to train custom models and develop new applications without dependence on proprietary systems.
Dynamics prediction is the next frontier. Multiple research groups are working on methods to predict not just the most stable protein structure, but the full range of conformations a protein can adopt. This includes work on intrinsically disordered proteins, allosteric mechanisms, and the conformational changes associated with enzyme catalysis and molecular recognition.
The protein folding problem, once considered one of the hardest unsolved problems in science, has been substantially addressed for the case of static single-chain structures. The remaining challenges, including dynamics, design, and multi-molecule prediction, define the next chapter of this field.