Protein folding is the physical process by which a polypeptide chain, a linear sequence of amino acids, acquires its functional three-dimensional structure. This process is one of the central phenomena in molecular biology, since a protein's shape determines its function. The question of how and why proteins fold into specific shapes, known as the protein folding problem, has been one of the grand challenges in science for over half a century. Beginning with Christian Anfinsen's thermodynamic hypothesis in the 1960s and Cyrus Levinthal's paradox in 1969, the field has progressed through decades of experimental and computational work to the artificial intelligence breakthroughs of the 2020s, when systems like AlphaFold demonstrated that the structure of most proteins could be predicted from sequence alone with near-experimental accuracy.
Proteins are polymers of amino acids, linked together by peptide bonds into chains that range from a few dozen to several thousand residues in length. In solution, these chains fold spontaneously into compact three-dimensional structures. The protein folding problem encompasses several related questions: What physical forces drive folding? How does the amino acid sequence encode the final structure? Can we predict the structure from the sequence? And can we understand the pathway and kinetics of the folding process itself?
In 1961, Christian Anfinsen and colleagues at the National Institutes of Health demonstrated that the enzyme ribonuclease A (RNase A) could be fully unfolded (denatured) and would then refold spontaneously into its original, functional shape without any external guidance. This experiment established what became known as Anfinsen's dogma, or the thermodynamic hypothesis: the native (functional) structure of a protein corresponds to the global minimum of its free energy, and all the information needed for a protein to reach this state is contained in its amino acid sequence.
Anfinsen shared the 1972 Nobel Prize in Chemistry for this work. The thermodynamic hypothesis implied that protein structure prediction should, in principle, be solvable: if the sequence contains all the necessary information, then a sufficiently powerful computational method should be able to determine the structure from the sequence alone.
In 1969, Cyrus Levinthal, a molecular biologist at MIT and later Columbia University, pointed out a fundamental puzzle. Each amino acid residue in a protein chain has multiple degrees of rotational freedom around its backbone bonds. For a typical protein of 100 residues, the number of possible conformations is astronomically large, estimated at roughly 10^300 in some calculations. If the protein were to sample conformations randomly at a rate of one per picosecond (10^-12 seconds), it would take longer than the age of the known universe to find the lowest-energy state by brute force.
Yet small proteins actually fold in milliseconds or even microseconds. This discrepancy, known as Levinthal's paradox, demonstrated that protein folding cannot be a random search through conformational space. Instead, the process must follow some kind of directed pathway or funnel.
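The arithmetic behind the paradox is easy to reproduce. The sketch below uses one common set of assumptions (about 3 stable states per backbone dihedral, 2 dihedrals per residue), which yields roughly 10^95 conformations rather than the larger 10^300 figure some formulations cite; the conclusion is the same either way.

```python
# Back-of-the-envelope version of Levinthal's argument. The specific numbers
# (3 states per dihedral, 2 dihedrals per residue, 1 conformation per ps)
# are illustrative assumptions, not measured quantities.
residues = 100
states_per_dihedral = 3
dihedrals_per_residue = 2

conformations = states_per_dihedral ** (dihedrals_per_residue * residues)

sampling_rate_per_s = 1e12          # one conformation per picosecond
age_of_universe_s = 4.35e17         # ~13.8 billion years

search_time_s = conformations / sampling_rate_per_s
print(f"conformations: ~{conformations:.2e}")
print(f"random search: ~{search_time_s / age_of_universe_s:.1e} universe ages")
```

Even with these conservative assumptions, an exhaustive search would take on the order of 10^65 times the age of the universe.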
Levinthal himself suggested that folding might be guided by the rapid formation of local structural elements (such as alpha helices and beta turns), which then constrain the search for the global structure. This idea prefigured the modern energy landscape or "folding funnel" view of the process.
The resolution of Levinthal's paradox came through the energy landscape theory, developed primarily by Joseph Bryngelson, Peter Wolynes, and others during the 1980s and 1990s. In this framework, the protein's energy landscape is visualized as a funnel: a high-dimensional surface with many local minima but an overall downhill slope toward the native state. The funnel shape means that from almost any starting conformation, the chain can find energetically favorable local contacts that guide it progressively toward the folded structure, without needing to sample every possible conformation.
The funnel is not smooth. It contains roughness (local energy barriers and traps) that can slow folding or lead to misfolded intermediates. In some cases, proteins become kinetically trapped in non-native states, which can lead to aggregation and is associated with diseases such as Alzheimer's, Parkinson's, and prion diseases.
| Concept | Year | Key figure(s) | Significance |
|---|---|---|---|
| Anfinsen's dogma | 1961 | Christian Anfinsen | Structure is encoded in sequence; folding is thermodynamically driven |
| Levinthal's paradox | 1969 | Cyrus Levinthal | Random search is impossibly slow; folding must be guided |
| Energy landscape / folding funnel | 1980s-1990s | Bryngelson, Wolynes, Onuchic, Dill | Folding proceeds down an energy funnel, not through random search |
Understanding protein folding has depended on experimental techniques that can determine either the final folded structure or the dynamics of the folding process.
Three main experimental techniques have been used to determine protein structures at atomic resolution:
| Method | Principle | Resolution | Limitations |
|---|---|---|---|
| X-ray crystallography | Diffraction of X-rays by protein crystals | < 2 Angstroms typical | Requires crystallization; captures single conformation |
| Nuclear magnetic resonance (NMR) spectroscopy | Magnetic properties of atomic nuclei in solution | 2-3 Angstroms | Limited to smaller proteins (typically < 40 kDa) |
| Cryo-electron microscopy (cryo-EM) | Imaging frozen protein samples with electron beams | 2-4 Angstroms (recent) | Historically lower resolution; now competitive for large complexes |
The Protein Data Bank (PDB), established in 1971, serves as the central repository for experimentally determined structures. By early 2025, the PDB contained over 220,000 structures, though this represents a small fraction of all known protein sequences.
Complementary techniques probe the folding process itself in real time: stopped-flow mixing initiates folding within milliseconds, hydrogen-deuterium exchange maps which regions of the chain become protected (folded) first, and single-molecule fluorescence methods such as FRET follow individual molecules as they fold.
Computational protein structure prediction has a long history, with methods evolving from simple statistical approaches to today's AI systems.
The first computational studies of protein folding used simplified models. In the 1970s and 1980s, researchers developed lattice models that reduced the protein to a chain on a grid, allowing enumeration of conformations for short sequences. These models helped establish theoretical concepts such as the folding funnel but could not make practical structure predictions for real proteins.
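The combinatorics those lattice studies faced can be seen directly: even on a 2D square grid, the number of self-avoiding chain conformations grows roughly exponentially with chain length. A minimal enumerator (illustrative only, not any published model):

```python
# Count self-avoiding conformations of an n-residue chain on a 2D square
# lattice, starting from the origin (no symmetry reduction). This is the
# kind of exhaustive enumeration early lattice-model studies performed
# for short chains; it becomes infeasible quickly as n grows.
def count_conformations(n):
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]

    def extend(path, occupied):
        if len(path) == n:
            return 1
        total = 0
        x, y = path[-1]
        for dx, dy in moves:
            nxt = (x + dx, y + dy)
            if nxt not in occupied:          # self-avoidance constraint
                total += extend(path + [nxt], occupied | {nxt})
        return total

    return extend([(0, 0)], {(0, 0)})

for n in range(2, 8):
    print(n, count_conformations(n))
```

The counts (4, 12, 36, 100, 284, 780 for chains of 2 to 7 residues) multiply by roughly a factor of three per added residue, a miniature version of the explosion behind Levinthal's paradox.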
Molecular dynamics (MD) simulations, which compute the trajectory of every atom in a protein by numerically integrating Newton's equations of motion, were first applied to proteins in the late 1970s. Martin Karplus, Michael Levitt, and Arieh Warshel received the 2013 Nobel Prize in Chemistry for developing multiscale models for complex chemical systems, including early protein simulations. However, the computational cost of MD is enormous: simulating the microsecond-to-millisecond timescales required for protein folding was not feasible with the hardware available at that time.
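The core of an MD integrator is compact; the cost comes from evaluating forces for every atom at femtosecond time steps over enormous numbers of steps. A sketch of the standard velocity Verlet update, applied here to a toy 1D harmonic "bond" rather than a real force field:

```python
# Velocity Verlet, the integration scheme used by most MD packages, shown
# on a toy 1D harmonic oscillator. Real MD computes forces from a full
# force field over all atoms; only the update scheme is illustrated here.
def velocity_verlet(x, v, force, mass, dt, steps):
    f = force(x)
    for _ in range(steps):
        v_half = v + 0.5 * dt * f / mass      # half-step velocity update
        x = x + dt * v_half                   # full-step position update
        f = force(x)                          # force at the new position
        v = v_half + 0.5 * dt * f / mass      # complete the velocity step
    return x, v

k = 1.0
spring = lambda pos: -k * pos                 # Hooke's-law "bond"
x, v = velocity_verlet(x=1.0, v=0.0, force=spring, mass=1.0, dt=0.01, steps=10_000)
energy = 0.5 * v * v + 0.5 * k * x * x        # should stay near the initial 0.5
print(f"x={x:.3f} v={v:.3f} E={energy:.4f}")
```

Verlet-family integrators are used because they conserve energy well over long trajectories; the sketch's total energy stays close to its initial value even after 10,000 steps.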
Homology modeling (also called comparative modeling) exploits the observation that proteins with similar amino acid sequences tend to adopt similar three-dimensional structures. If a target protein shares significant sequence similarity (typically above 30%) with a protein whose structure has already been experimentally determined, the known structure can serve as a template for building a model of the target.
Homology modeling became the most reliable computational method for structure prediction from the 1990s onward. Programs such as MODELLER (developed by Andrej Sali) and SWISS-MODEL automated much of the process. The method's accuracy depends heavily on the degree of sequence similarity between target and template: at high similarity (>50%), models are often reliable enough to guide experimental design; at lower similarity (<30%), model accuracy degrades substantially.
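The identity thresholds above refer to the fraction of matching residues in a pairwise alignment. A toy calculation over a pre-aligned pair (the sequences below are invented for illustration, not real proteins):

```python
# Percent identity over two pre-aligned (equal-length, gapped) sequences,
# the quantity homology modeling uses to judge template suitability.
def percent_identity(aln_a, aln_b):
    assert len(aln_a) == len(aln_b), "sequences must be aligned to equal length"
    # Skip columns that are gaps in both sequences.
    cols = [(a, b) for a, b in zip(aln_a, aln_b) if not (a == "-" and b == "-")]
    matches = sum(a == b for a, b in cols)
    return 100.0 * matches / len(cols)

target   = "MKTAYIAKQR-QISFVKSHFSRQ"   # made-up target sequence
template = "MKSAYIGKQRLQVSFVKSHFSRQ"   # made-up template sequence
pid = percent_identity(target, template)
print(f"{pid:.1f}% identity")
```

A pair like this, well above the 50% threshold, would make the template a reliable scaffold; real pipelines compute the alignment itself with tools such as BLAST or HMM-based methods before scoring it.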
Threading methods, also known as fold recognition, address cases where the target protein has no detectable sequence homolog with a known structure but might still adopt a fold similar to one in the structural database. Threading algorithms evaluate how well a target sequence fits into each known structural fold by scoring sequence-structure compatibility. Tools such as RAPTOR and I-TASSER (developed by Yang Zhang's group, whose servers ranked at or near the top in several CASP assessments) became leaders in this category.
For proteins with no detectable relationship to any known structure, ab initio (or de novo) methods attempt to predict the structure from physical principles alone. Rosetta, developed by David Baker's laboratory at the University of Washington beginning in the late 1990s, became the dominant approach in this category. Rosetta uses a fragment assembly strategy: it breaks the target sequence into short overlapping segments, matches each segment to known structural fragments from the PDB, and then assembles complete structures by combining fragments and optimizing a physics-based energy function.
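The assembly loop can be caricatured in a few lines. The sketch below is a greedy variant on a 1D chain of torsion values with a made-up energy function and a random "fragment library"; the actual Rosetta protocol uses Monte Carlo moves with simulated annealing and a far richer, physics-based energy model.

```python
# Toy fragment-assembly loop: replace 3-residue windows of a torsion-angle
# chain with fragments from a random, hypothetical library, keeping moves
# that lower a made-up energy. Illustration only; not the Rosetta protocol.
import random

random.seed(0)
L, FRAG = 30, 3
# Hypothetical fragment library: 50 candidate 3-residue torsion fragments.
library = [[random.uniform(-180, 180) for _ in range(FRAG)] for _ in range(50)]

def energy(torsions):
    # Fake smooth energy with a minimum at -60 degrees (helix-like bias).
    return sum((t + 60.0) ** 2 for t in torsions)

torsions = [random.uniform(-180, 180) for _ in range(L)]  # extended random start
e0 = e = energy(torsions)
for _ in range(20_000):
    pos = random.randrange(L - FRAG + 1)
    trial = torsions[:pos] + random.choice(library) + torsions[pos + FRAG:]
    if (e_trial := energy(trial)) < e:                    # greedy acceptance
        torsions, e = trial, e_trial
print(f"energy: {e0:.0f} -> {e:.0f}")
```

Because fragments come from real structures in the actual method, each insertion proposes locally plausible geometry, which is what lets the search converge without sampling the full conformational space.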
Rosetta and its community-driven variant, Rosetta@home (using distributed computing from volunteers), achieved some of the best results in CASP competitions prior to the deep learning era. Baker's broader work on computational protein design, which uses Rosetta in reverse (designing sequences that fold into desired structures), earned him a share of the 2024 Nobel Prize in Chemistry.
| Method | Era | Principle | Strengths | Weaknesses |
|---|---|---|---|---|
| Molecular dynamics | 1970s-present | Physics-based simulation of atomic motion | Captures dynamics; no templates needed | Computationally expensive; limited timescale |
| Homology modeling | 1990s-present | Copy structure from sequence-similar proteins | High accuracy when good templates exist | Fails when no homologs available |
| Threading | 1990s-present | Fit sequence into known structural folds | Works without sequence similarity | Limited to known folds |
| Fragment assembly (Rosetta) | Late 1990s-present | Assemble structures from small structural fragments | No templates needed; physics-based | Computationally demanding; lower accuracy for large proteins |
A landmark in MD simulation came from D. E. Shaw Research, which built a specialized supercomputer called Anton specifically for molecular dynamics. In 2011, the Anton machine simulated the complete folding of several small proteins (up to about 80 amino acids) on millisecond timescales, the first time this had been achieved with all-atom physics-based simulation. While this confirmed that MD force fields could, in principle, fold small proteins, the computational cost remained prohibitive for routine structure prediction of larger proteins.
Modern GPU hardware has made MD more accessible, allowing researchers to run simulations locally at modest cost. However, MD simulations remain primarily a tool for studying protein dynamics and folding mechanisms rather than a practical method for large-scale structure prediction.
The Critical Assessment of Techniques for Protein Structure Prediction (CASP) is a biennial competition that has served as the primary benchmark for the protein structure prediction field since 1994.
CASP was founded by John Moult, a molecular biophysicist at the University of Maryland, with the first competition (CASP1) held in 1994. The format is a blind test: organizers collect protein structures that have been experimentally determined but not yet publicly released, and prediction groups submit their models before the experimental structures are made available. This design prevents any method from being tuned to known answers.
Predictions are scored using the Global Distance Test (GDT), which measures how well the predicted structure superimposes on the experimental structure. A GDT score of 100 means perfect agreement; a score above 90 is generally considered competitive with experimental methods.
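In simplified form, GDT_TS averages, over distance cutoffs of 1, 2, 4, and 8 Angstroms, the fraction of residues whose predicted C-alpha atom lies within the cutoff of its experimental position. The sketch below assumes the two structures are already optimally superimposed (the real score searches over many superpositions); the coordinates are invented for illustration.

```python
# Simplified GDT_TS on already-superimposed C-alpha coordinates. The real
# GDT algorithm searches over many superpositions and keeps the best score;
# only the averaging over distance cutoffs is shown here.
import math

def gdt_ts(pred, exper, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """pred, exper: equal-length lists of (x, y, z) C-alpha coordinates."""
    dists = [math.dist(p, e) for p, e in zip(pred, exper)]
    # Fraction of residues within each cutoff, averaged across cutoffs.
    fractions = [sum(d <= c for d in dists) / len(dists) for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)

# Four residues displaced by 0.5, 1.5, 3.0, and 9.0 Angstroms respectively.
pred  = [(0, 0, 0), (3.8, 0, 0), (7.6, 0, 0), (11.4, 0, 0)]
exper = [(0.5, 0, 0), (3.8, 1.5, 0), (7.6, 0, 3.0), (11.4, 9.0, 0)]
print(f"GDT_TS = {gdt_ts(pred, exper):.1f}")
```

Here the per-cutoff fractions are 1/4, 2/4, 3/4, and 3/4 (the last residue, 9.0 Angstroms off, misses even the 8 Angstrom cutoff), giving a GDT_TS of 56.25.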
Progress in CASP was slow for the first two decades. At CASP1 in 1994, the best median GDT score was around 47. By CASP5 in 2002, scores had climbed to about 60 for template-based predictions, but free modeling (predicting structures without templates) remained poor, with scores often around 40 for the hardest targets. From roughly 2002 to 2016, progress largely stagnated, particularly for the hardest targets.
| CASP round | Year | Notable development | Best approximate GDT (FM targets) |
|---|---|---|---|
| CASP1 | 1994 | Competition founded | ~47 |
| CASP5 | 2002 | Template methods plateau | ~60 (TBM) |
| CASP12 | 2016 | Pre-deep learning plateau | ~40 (FM) |
| CASP13 | 2018 | AlphaFold 1 wins | ~60 (FM) |
| CASP14 | 2020 | AlphaFold 2 achieves 92.4 median GDT | 92.4 (overall) |
| CASP15 | 2022 | Focus shifts to complexes | AlphaFold-derived methods dominate |
| CASP16 | 2024 | Protein-ligand and antibody-antigen challenges | Specialized methods outperform AlphaFold on certain targets |
The CASP13 competition in December 2018 was the first in which deep learning methods made a dramatic impact. DeepMind's AlphaFold 1 won the competition decisively, predicting accurate structures for 24 out of 43 free modeling domains (with TM-scores above 0.7), compared to 14 for the runner-up. The improvement over the previous best results was the largest single-cycle advance in CASP history.
AlphaFold 2 at CASP14 achieved a median GDT of 92.4, a score so high that the protein structure prediction problem was widely declared effectively solved for single-chain proteins. Guinness World Records recognized AlphaFold 2's CASP14 score as the highest ever achieved in the competition.
DeepMind did not enter CASP15 (2022) or CASP16 (2024) as a competitor. Both competitions saw virtually all top teams using AlphaFold or AlphaFold-derived methods as their starting point. CASP15 showed strong progress in modeling protein complexes. CASP16 expanded the scope to include protein-ligand interactions and revealed that antibody-antigen complexes remain a challenging frontier where specialized methods can outperform general-purpose AlphaFold predictions.
CASP received long-running funding from the U.S. National Institutes of Health. In 2025, the NIH declined to renew the grant due to budget cuts, putting the competition's future in jeopardy. Google DeepMind provided interim funding to maintain operations.
Starting around 2018, machine learning methods, particularly deep neural networks, transformed protein structure prediction from a slowly improving field into one experiencing rapid, discontinuous advances.
AlphaFold, developed by Google DeepMind, is the system most closely associated with this revolution. AlphaFold 1 (2018) used deep residual networks to predict inter-residue distance distributions. AlphaFold 2 (2020) introduced the Evoformer architecture, a transformer-based system that jointly processes multiple sequence alignments (MSAs) and pairwise residue representations through 48 layers of attention and triangular updates. AlphaFold 2's near-experimental accuracy at CASP14 marked the field's watershed moment. AlphaFold 3 (May 2024) extended the approach to predict structures of protein complexes with DNA, RNA, ligands, and ions, using a diffusion-based structure generation module.
The AlphaFold Protein Structure Database, created in partnership with EMBL-EBI, provides free access to over 214 million predicted structures covering nearly all known protein sequences. Demis Hassabis and John Jumper received the 2024 Nobel Prize in Chemistry for this work.
RoseTTAFold, developed by David Baker's laboratory at the University of Washington and published in July 2021 (the same month as AlphaFold 2's full paper), uses a three-track neural network architecture that simultaneously processes:

- a one-dimensional track carrying sequence and MSA information,
- a two-dimensional track carrying residue-pair distances and orientations, and
- a three-dimensional track carrying atomic coordinates.
The three tracks exchange information at every layer, allowing each representation to inform the others. RoseTTAFold achieves accuracy close to AlphaFold 2 on many targets, though with slightly lower performance on the hardest cases. It has been extended to model multi-chain complexes and protein interactions with DNA and RNA. The system is fully open-source and has been widely adopted in the research community.
ESMFold, developed by Meta AI Research and published in 2022, takes a fundamentally different approach from both AlphaFold and RoseTTAFold. Instead of relying on multiple sequence alignments (MSAs), ESMFold uses a protein language model called ESM-2, with up to 15 billion parameters, that has been trained on millions of protein sequences. The language model learns evolutionary information implicitly from the patterns in protein sequence data, without needing to explicitly construct an MSA for each prediction.
This single-sequence approach offers a major speed advantage. ESMFold is approximately 60 times faster than AlphaFold 2 for short protein sequences. While its accuracy is somewhat lower than AlphaFold 2 when AlphaFold has access to rich MSAs, ESMFold performs comparably or better when the MSA is sparse (as is the case for many metagenomic proteins with few known homologs).
Meta used ESMFold to create the ESM Metagenomic Atlas, predicting structures for 772 million metagenomic protein sequences discovered from environmental DNA sampling, many of which had no close relatives in existing databases.
| AI method | Developer | Year | Key innovation | MSA required? |
|---|---|---|---|---|
| AlphaFold 2 | Google DeepMind | 2020 | Evoformer + structure module | Yes |
| RoseTTAFold | Baker Lab (UW) | 2021 | Three-track neural network | Yes |
| ESMFold | Meta AI | 2022 | Protein language model (no MSA) | No |
| AlphaFold 3 | DeepMind / Isomorphic Labs | 2024 | Diffusion-based; multi-molecule | Yes |
| OmegaFold | HeliXon | 2022 | Protein language model (no MSA) | No |
OmegaFold (HeliXon, 2022) is another single-sequence prediction method using protein language models. Boltz-2, a collaboration between MIT and Recursion announced in June 2025, predicts protein-ligand complex structures and binding affinities. Pearl, from Genesis Molecular AI, is an interactive prediction model designed for drug discovery workflows. FiveFold combines predictions from multiple methods (AlphaFold 2, RoseTTAFold, OmegaFold, ESMFold, and EMBER3D) into consensus ensemble predictions.
The ability to predict protein structures computationally has had far-reaching effects on pharmaceutical research and biological understanding.
Knowing the three-dimensional structure of a drug target protein is fundamental to rational drug design. Before AlphaFold, obtaining these structures required expensive and time-consuming experimental work. Computational prediction now provides structural hypotheses almost instantly, allowing researchers to:

- identify and characterize candidate binding pockets on a target,
- screen compound libraries virtually against the predicted structure, and
- pursue targets for which no experimental structure is available.
Isomorphic Labs, founded by Demis Hassabis in 2021, applies AlphaFold and related AI technology directly to drug discovery. The company signed partnerships with Eli Lilly and Novartis in January 2024, worth a combined $3 billion in potential milestone payments, and is advancing preclinical candidates toward human trials projected for late 2026.
Protein misfolding is directly implicated in numerous diseases:
| Disease | Misfolded protein | Mechanism |
|---|---|---|
| Alzheimer's disease | Amyloid-beta, tau | Aggregation into plaques and tangles |
| Parkinson's disease | Alpha-synuclein | Formation of Lewy bodies |
| Prion diseases (CJD, BSE) | Prion protein (PrP) | Infectious misfolding cascade |
| Cystic fibrosis | CFTR | Misfolding prevents membrane trafficking |
| Sickle cell disease | Hemoglobin | Polymerization of deoxygenated hemoglobin |
| Type 2 diabetes | IAPP (amylin) | Amyloid formation in pancreatic islets |
AI-predicted structures help researchers understand how these proteins misfold and aggregate, and can guide the design of therapeutic interventions.
Predicted protein structures are also used in enzyme engineering for industrial applications. Researchers have used structural predictions to design enzymes that break down plastics (PET-degrading enzymes), produce biofuels, and catalyze chemical reactions that would otherwise require harsh conditions.
Despite the remarkable progress in structure prediction, several major challenges remain unsolved.
AlphaFold and similar tools predict a single static structure for each protein. Real proteins, however, are dynamic molecules that constantly fluctuate between multiple conformational states. Many proteins function precisely because they can switch between different shapes: enzymes open and close around substrates, signaling proteins toggle between active and inactive conformations, and transporters cycle through multiple states to move molecules across membranes.
Capturing these dynamics computationally remains an open problem. As of 2025, the field has identified protein dynamics and conformational ensemble prediction as the next major frontier. Molecular dynamics simulations can model dynamics but remain computationally expensive. New machine learning approaches are being developed to predict conformational ensembles, but no method yet matches the accuracy that AlphaFold achieved for static structures.
An estimated 30-50% of eukaryotic proteins contain significant intrinsically disordered regions (IDRs), sequences that do not fold into stable three-dimensional structures under physiological conditions. These regions are not random; they adopt defined statistical ensembles of conformations and often participate in binding interactions, signaling, and regulation.
AlphaFold correctly assigns low confidence to these regions (indicating they are not well-predicted), but it cannot characterize the conformational ensembles that IDRs actually adopt. Developing methods to describe IDR ensembles accurately remains an active area of research. Standard MD force fields, which were parameterized to stabilize folded structures, often predict overly compact conformations for IDRs, requiring specialized corrections.
While AlphaFold-Multimer and AlphaFold 3 can predict the structures of specific protein complexes, predicting which proteins interact with each other in the cell (the "interactome") and the structures of large, transient, or heterogeneous complexes remains difficult. Many biologically important complexes involve dozens of protein chains, membrane environments, or post-translational modifications that current methods handle poorly.
AlphaFold 3 can predict protein-ligand interactions, but accuracy remains lower than for protein-only structures. Predicting how a protein changes conformation upon ligand binding (induced fit), and how binding at one site affects distant sites on the same protein (allostery), are particular challenges.
The inverse problem of protein folding, designing amino acid sequences that fold into desired structures, has also seen rapid progress. David Baker's laboratory and others have used deep learning to design proteins with novel functions, including new enzymes, vaccines, and biosensors. However, the success rate of computational designs in the laboratory remains variable, and designing proteins with complex functions (such as multispecific binding or precise catalytic activity) is still difficult.
Proteins embedded in cell membranes (membrane proteins) present special challenges for both experimental structure determination and computational prediction. They require a lipid bilayer or detergent environment for stability, which complicates experimental work. Computational predictions for membrane proteins are generally less accurate than for soluble proteins, although AlphaFold has shown improvement in this area.
The protein folding field in 2025-2026 is characterized by several trends:
Static structure prediction is largely solved. For most single-chain proteins with reasonable MSA coverage, AlphaFold 2 and AlphaFold 3 produce predictions competitive with experimental methods. The research focus has shifted to harder problems: dynamics, complexes, and design.
Drug discovery integration is accelerating. AI-driven structure prediction is being integrated into pharmaceutical pipelines at both large companies and startups. Isomorphic Labs' partnerships with major pharmaceutical companies have advanced to the preclinical candidate stage. New tools like Boltz-2 combine structure prediction with binding affinity estimation, directly targeting drug discovery needs.
The competition landscape continues to evolve. CASP16 (2024) showed that while AlphaFold-derived methods dominate, specialized approaches can outperform them on specific target types, such as antibody-antigen complexes. The post-AlphaFold era of CASP focuses increasingly on multi-molecule complexes and protein-ligand interactions.
Open-source development is thriving. OpenFold, OpenFold 3, RoseTTAFold, and ESMFold provide the community with unrestricted implementations of state-of-the-art methods. This open ecosystem allows academic researchers to train custom models and develop new applications without dependence on proprietary systems.
Dynamics prediction is the next frontier. Multiple research groups are working on methods to predict not just the most stable protein structure, but the full range of conformations a protein can adopt. This includes work on intrinsically disordered proteins, allosteric mechanisms, and the conformational changes associated with enzyme catalysis and molecular recognition.
The protein folding problem, once considered one of the hardest unsolved problems in science, has been substantially addressed for the case of static single-chain structures. The remaining challenges, including dynamics, design, and multi-molecule prediction, define the next chapter of this field.