AlphaFold-Multimer
AlphaFold-Multimer is a deep learning system for predicting the three-dimensional structures of protein complexes, released by Google DeepMind in October 2021 as an extension of AlphaFold 2. It modifies the AlphaFold 2 architecture and training procedure to handle multi-chain inputs of known stoichiometry, introducing chain-aware feature construction, paired multiple-sequence alignments (MSAs), and a chain-permutation-invariant loss. On heteromeric interfaces from a benchmark of 4,433 recent PDB complexes, AlphaFold-Multimer raised the rate of acceptable interface predictions (DockQ of at least 0.23) from roughly 42 percent for AlphaFold 2 with a residue-index-jump (flexible-linker) trick to about 67 percent.[1] The system became the de facto open-source tool for protein-protein complex prediction between late 2021 and the release of AlphaFold 3 in 2024, which superseded it on most benchmarks.[2]
Infobox
| Attribute | Value |
|---|---|
| Developer | Google DeepMind[1] |
| First preprint | bioRxiv 2021-10-04, doi:10.1101/2021.10.04.463034[1] |
| Updated preprint | 2022-03-10 (v2)[3] |
| First public weights | AlphaFold v2.1.0, January 2022[4] |
| Current weights series | Multimer v3 (in AlphaFold v2.3.0, December 2022)[5] |
| Code repository | github.com/google-deepmind/alphafold[6] |
| Code license | Apache 2.0[6] |
| Parameter license | CC BY 4.0[6] |
| Superseded by | AlphaFold 3 (Abramson et al., Nature 2024)[2] |
History
Precursor: AlphaFold 2 and the monomer focus
AlphaFold 2, presented at the 14th Critical Assessment of Structure Prediction (CASP14) in late 2020 and published in Nature in July 2021, set a new standard for single-chain protein structure prediction. The system used a novel Evoformer block operating jointly on an MSA representation and a pair representation, followed by an SE(3)-equivariant structure module that produced full atomic coordinates.[7] Although AlphaFold 2 was trained on single-chain crops, several groups noted that concatenating two chains with a residue-index gap (the "linker hack" or "chain-merged" trick) sometimes recovered reasonable complex models, prompting a series of community adaptations.[1]
Release of AlphaFold-Multimer
DeepMind formalised multi-chain prediction with the preprint "Protein complex prediction with AlphaFold-Multimer," posted to bioRxiv on 4 October 2021 (doi:10.1101/2021.10.04.463034) by Richard Evans, Michael O'Neill, Alexander Pritzel, Natasha Antropova, Andrew Senior, Tim Green, Augustin Zidek, Russ Bates, Sam Blackwell, Jason Yim, Olaf Ronneberger, Sebastian Bodenstein, Michal Zielinski, Alex Bridgland, Anna Potapenko, Andrew Cowie, Kathryn Tunyasuvunakool, Rishub Jain, Ellen Clancy, Pushmeet Kohli, John Jumper and Demis Hassabis.[1] An updated version (v2) was posted on 10 March 2022 alongside the v2.2 weights release.[3]
Open-source code and weights
Code and the first set of multimer weights shipped in the google-deepmind/alphafold repository with the v2.1.0 release in January 2022.[4][6] DeepMind subsequently published two weight refreshes:
- Multimer v2 (v2.2.0, March 2022): new model parameters with greatly reduced clashes and slightly higher accuracy, plus support for multiple seeds per model and a fix for num_recycle=0.[8]
- Multimer v3 (v2.3.0, December 2022): model parameters retrained with a 30 September 2021 cutoff (roughly 30 percent more PDB data, with a strong emphasis on large complexes), training crops increased from 384 to 640 residues, the maximum number of chains per training example raised from 8 to 20, and the MSA input expanded from 1,152 to 2,048 sequences in three of the five models.[5]
A patch release v2.3.2 in April 2024 added quality-of-life flags but did not change the multimer weights.[8]
Adoption and supersession
Through 2022 and 2023 AlphaFold-Multimer was integrated into community tooling, notably ColabFold, which packaged the multimer weights with MMseqs2-based MSAs.[9] It also dominated the CASP15-CAPRI Round 54 assembly challenge in 2022, where almost every top group built on the multimer weights and high-quality models were produced for about 40 percent of targets, versus 8 percent two years earlier.[10] In May 2024 DeepMind and Isomorphic Labs published AlphaFold 3 in Nature, a diffusion-based successor that predicts joint structures of proteins, nucleic acids, ligands, ions and modified residues and that outperforms AlphaFold-Multimer v2.3 in several categories, including antibody-antigen interfaces.[2]
How it works
AlphaFold-Multimer keeps the high-level AlphaFold 2 stack (input embedder, Evoformer trunk, structure module, recycling) and adapts it to handle a fixed number of chains with known stoichiometry. The key technical changes documented in the preprint are concentrated in three areas: feature construction, the structure module loss, and training data sampling.[1]
Multi-chain feature construction
The input to AlphaFold 2 is dominated by an MSA representation and a pair representation indexed by residue. For a complex, AlphaFold-Multimer constructs each chain's MSA independently, using Jackhmmer against UniRef90 and MGnify and HHblits against BFD and Uniclust30, and additionally runs a Jackhmmer search against UniProt to obtain a per-chain alignment whose rows carry organism identifiers.[1] These per-chain alignments are then assembled into the multimer MSA in two parts:
- Paired MSA. Sequences from different chains are paired row-by-row when they originate from the same species (i.e. share the same taxonomy identifier in UniProt). This yields rows in which the entries across chains are likely to be interacting orthologs, supplying the inter-chain coevolutionary signal that single-chain MSAs cannot.[1][11]
- Unpaired (block-diagonal) MSA. The remaining per-chain sequences are stacked block-diagonally so that each chain contributes its full single-chain alignment in its own columns, with gaps in the columns of the other chains.[11]
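The two-part assembly described above can be sketched in a few lines of Python. This is a simplified illustration, not the pipeline's actual code: the function name `build_multimer_msa`, the `(species_id, aligned_sequence)` row format, and the rule of pairing the k-th hit of each chain only for species found in every chain are all assumptions made for the example.

```python
from collections import defaultdict

GAP = "-"

def build_multimer_msa(chain_msas):
    """Assemble paired and block-diagonal rows from per-chain alignments.

    chain_msas: one list per chain of (species_id, aligned_sequence) tuples;
    the first row of each chain is assumed to be the query sequence.
    Returns (paired_rows, unpaired_rows); each row holds one string per chain.
    """
    lengths = [len(msa[0][1]) for msa in chain_msas]

    # Group every chain's hits by species, skipping the query row.
    by_species = []
    for msa in chain_msas:
        groups = defaultdict(list)
        for species, seq in msa[1:]:
            groups[species].append(seq)
        by_species.append(groups)

    # The query sequences always form the first paired row.
    paired = [[msa[0][1] for msa in chain_msas]]
    used = [set() for _ in chain_msas]

    # Simplification: pair only species that occur in every chain.
    shared = set(by_species[0])
    for groups in by_species[1:]:
        shared &= set(groups)
    for species in sorted(shared):
        depth = min(len(groups[species]) for groups in by_species)
        for k in range(depth):
            paired.append([groups[species][k] for groups in by_species])
            for c in range(len(chain_msas)):
                used[c].add((species, k))

    # Remaining hits are stacked block-diagonally, gapped in the other chains.
    unpaired = []
    for c, groups in enumerate(by_species):
        for species, seqs in groups.items():
            for k, seq in enumerate(seqs):
                if (species, k) in used[c]:
                    continue
                row = [GAP * n for n in lengths]
                row[c] = seq
                unpaired.append(row)
    return paired, unpaired
```

Joining the per-chain strings of each row gives the full-width alignment row; a production pipeline would also need to handle species shared by only some chains and choose which hits to pair when a species has several.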
The pair representation is initialised from a residue index that is reset at the start of each chain, replacing the linker-hack convention with a proper "chain break" feature. Templates and atom positions are likewise concatenated chain by chain with chain identifiers in the input features.[1][11]
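A minimal sketch of chain-aware positional features follows, assuming only what the paragraph above states: residue numbering restarts in each chain and a chain identifier travels with every residue. The helper names and the `max_offset` default are illustrative choices, not values taken from the released code.

```python
def chain_features(chain_lengths):
    """Return (residue_index, chain_id) lists for a complex.

    residue_index restarts at 0 in every chain; chain_id labels each residue
    with the position of its chain in the input order.
    """
    residue_index = []
    chain_id = []
    for c, n in enumerate(chain_lengths):
        residue_index.extend(range(n))
        chain_id.extend([c] * n)
    return residue_index, chain_id


def relative_offset(i, j, residue_index, chain_id, max_offset=32):
    """Clipped sequence separation for a residue pair.

    Returns None for residues on different chains, standing in for a
    dedicated "different chain" category (an assumption of this sketch).
    """
    if chain_id[i] != chain_id[j]:
        return None
    offset = residue_index[i] - residue_index[j]
    return max(-max_offset, min(max_offset, offset))
```

Treating cross-chain pairs as their own category, rather than as a very large sequence separation, is what distinguishes this from the residue-index-gap workaround.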
Chain-permutation-invariant loss
For homomers (and, more generally, any complex containing multiple copies of the same sequence) the assignment of predicted coordinates to ground-truth chains is not unique: the ground-truth structure is invariant under any permutation of identical chains. Naively penalising mean coordinate error against a fixed labelling would punish the model for getting the structure right but with chains labelled in a different order.
AlphaFold-Multimer addresses this with a chain-permutation-invariant treatment of the training loss: before the chain-dependent loss terms (such as FAPE) are computed, the ground-truth chains are relabelled to best match the prediction. Because the number of permutations grows factorially with the number of identical chains, the preprint describes a greedy heuristic rather than an exhaustive search: an anchor chain is used to superimpose prediction and ground truth, and each remaining predicted chain is then matched to the closest ground-truth chain of the same sequence.[1] This prevents the gradient signal from being dominated by an arbitrary chain ordering and is, according to the authors, important for stable homomer training.[1]
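The underlying idea, scoring the prediction against the best symmetry-equivalent labelling, can be illustrated with a brute-force minimum over permutations. This toy sketch assumes prediction and ground truth are already in the same frame and uses a plain mean squared coordinate error instead of FAPE; the function names and the `entity_ids` argument are hypothetical, and exhaustive enumeration is only practical for a handful of identical chains.

```python
import math
from itertools import permutations

def chain_error(pred, true):
    """Mean squared coordinate error between two equally long chains."""
    total = sum((p - t) ** 2 for a, b in zip(pred, true) for p, t in zip(a, b))
    return total / len(pred)


def permutation_invariant_loss(pred_chains, true_chains, entity_ids):
    """Minimum loss over relabellings that only swap identical chains.

    pred_chains, true_chains: lists of chains, each a list of (x, y, z).
    entity_ids: one label per chain; chains with equal labels share a sequence.
    Returns (best_loss, best_permutation), where best_permutation[i] is the
    ground-truth chain assigned to predicted chain i.
    """
    n = len(pred_chains)
    best_loss, best_perm = math.inf, None
    for perm in permutations(range(n)):
        # Skip relabellings that would map a chain onto a different sequence.
        if any(entity_ids[i] != entity_ids[perm[i]] for i in range(n)):
            continue
        loss = sum(chain_error(pred_chains[i], true_chains[perm[i]])
                   for i in range(n)) / n
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm
```

For a correct prediction whose two identical chains are simply listed in the opposite order, the fixed-label error is large while the minimum over permutations is zero, which is the behaviour the loss is meant to guarantee.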
Training data and curation
The model is trained on multi-chain assemblies from the Protein Data Bank with a resolution cutoff at 9 angstrom for inputs and 3 angstrom for the training labels, clustered by sequence identity to avoid leakage. Self-distillation uses AlphaFold-predicted single-chain structures from the AlphaFold DB as additional supervision for monomers.[1] At training time the system samples cropped sub-assemblies (the "interface crop") to ensure that interface contacts are well represented in batches, and it uses recycling exactly as in AlphaFold 2.[1]
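One way to bias crops toward interfaces is a spatial crop centred on an interface residue. The sketch below is a hedged illustration of that idea only: the function name, the 8 angstrom contact cutoff and the nearest-neighbour selection rule are assumptions for the example, not details confirmed by the text above.

```python
import math
import random

def spatial_interface_crop(coords, chain_id, crop_size,
                           contact_cutoff=8.0, rng=random):
    """Pick crop_size residues around a randomly chosen interface residue.

    coords: one (x, y, z) tuple per residue (e.g. C-alpha positions).
    chain_id: chain label per residue.
    Returns the sorted indices of the residues kept in the crop.
    """
    n = len(coords)
    # Interface residues lie within the cutoff of a residue on another chain.
    interface = [
        i for i in range(n)
        if any(chain_id[i] != chain_id[j]
               and math.dist(coords[i], coords[j]) < contact_cutoff
               for j in range(n))
    ]
    # Fall back to any residue if the complex has no detectable interface.
    centre = rng.choice(interface) if interface else rng.randrange(n)
    # Keep the residues spatially closest to the chosen centre.
    nearest = sorted(range(n), key=lambda j: math.dist(coords[centre], coords[j]))
    return sorted(nearest[:crop_size])
```

Because the centre is always an interface residue when one exists, every such crop contains residues from at least two chains, provided the crop size is at least two.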
Confidence metrics
AlphaFold-Multimer keeps the pLDDT and PAE outputs of AlphaFold 2 and adds an interface predicted TM-score (ipTM) that summarises confidence specifically in the relative positions of the chains forming the interface. The model ranking score reported in the codebase is a weighted combination, model confidence = 0.8 ipTM + 0.2 pTM, which biases ranking toward interface accuracy. The European Bioinformatics Institute notes that ipTM values above 0.8 typically indicate confident, high-quality predictions, that values below 0.6 usually correspond to failed predictions, and that the 0.6 to 0.8 range is a grey zone.[12]
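The ranking formula and the ipTM bands quoted above translate directly into code. The function names here are illustrative, and treating exactly 0.6 and 0.8 as part of the grey zone is an assumption, since the guidance does not specify boundary handling.

```python
def ranking_confidence(iptm, ptm):
    """Model confidence used for ranking: 0.8 * ipTM + 0.2 * pTM."""
    return 0.8 * iptm + 0.2 * ptm


def iptm_band(iptm):
    """Map an ipTM value to the qualitative bands quoted in the text."""
    if iptm > 0.8:
        return "confident"
    if iptm < 0.6:
        return "likely failed"
    return "grey zone"


def best_model(models):
    """Return the (name, iptm, ptm) tuple with the highest ranking confidence."""
    return max(models, key=lambda m: ranking_confidence(m[1], m[2]))
```

For example, a model with ipTM 0.9 and pTM 0.5 scores 0.82, so a strong interface prediction outranks a model with a better overall fold but a weaker interface.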
Evaluation
Heterodimer benchmark
The preprint reports a benchmark of 4,433 recent PDB complexes, split with a 30 April 2018 training cutoff and resolution and clustering filters, from which heteromeric interfaces are scored. Performance is measured with DockQ, where a score of at least 0.23 counts as "acceptable" (CAPRI definition), at least 0.49 as medium and at least 0.8 as high.[1] On this benchmark AlphaFold-Multimer reaches DockQ of at least 0.23 on about 67 percent of heteromeric interfaces and at least 0.8 on about 23 percent, versus roughly 42 percent and 12 percent for AlphaFold 2 used with the residue-index-jump (flexible-linker) trick on the same inputs, a gain of about 25 and 11 percentage points.[1] The v2 update of March 2022 reports the corresponding success rates rising to about 70 percent and 26 percent.[3]
| Setting | DockQ at least 0.23 | DockQ at least 0.8 | Source |
|---|---|---|---|
| AlphaFold 2 (chain-merged residue jump) | ~42% | ~12% | Evans et al., v1[1] |
| AlphaFold-Multimer v1 (Oct 2021) | ~67% | ~23% | Evans et al., v1[1] |
| AlphaFold-Multimer v2 (Mar 2022) | ~70% | ~26% | Evans et al., v2[3] |
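The success rates in the table are simply the fraction of interfaces whose DockQ clears a threshold. A short sketch, with illustrative function names and thresholds treated as inclusive (an assumption about boundary handling):

```python
def dockq_class(score):
    """CAPRI-style quality class for a DockQ score."""
    if score >= 0.8:
        return "high"
    if score >= 0.49:
        return "medium"
    if score >= 0.23:
        return "acceptable"
    return "incorrect"


def success_rate(scores, threshold=0.23):
    """Fraction of interfaces whose DockQ reaches the threshold."""
    return sum(score >= threshold for score in scores) / len(scores)
```

Calling `success_rate(scores)` gives the "acceptable or better" column and `success_rate(scores, 0.8)` the high-accuracy column for a list of per-interface DockQ values.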
Homomer benchmark
For homomeric interfaces, the v1 preprint reports successful predictions (DockQ of at least 0.23) in about 69 percent of cases and high-accuracy predictions (DockQ of at least 0.8) in about 34 percent, roughly 5 percentage points above single-chain AlphaFold combined with a flexible linker on both measures. The chain-permutation-invariant loss is described as essential for stable training on these symmetric targets.[1]
CASP14 multimer category and CAPRI
The v1 paper retroactively evaluated AlphaFold-Multimer on the CASP14 multimer category and reported a clear lead over the next-best CASP14 group, although no DeepMind entry had been formally submitted in that category at the time of the competition.[1] In 2022 the system was the de facto baseline for CASP15-CAPRI Round 54, the fifth joint CASP-CAPRI protein assembly challenge: of 37 targets (14 homodimers, 3 homo-trimers, 13 heterodimers including 3 antibody-antigen complexes, and 7 large assemblies), high-quality models were produced for about 40 percent of targets, with top groups extending AlphaFold-Multimer via expanded sampling, MSA enrichment or external scoring.[10] Antibody-antigen and conformationally flexible targets remained the hardest categories.[10]
Variants and downstream uses
Weight versions distributed by DeepMind
The google-deepmind/alphafold repository ships five multimer model checkpoints per generation. The current default is Multimer v3, which is what scripts/download_all_data.sh retrieves; earlier v1 and v2 weights remain available through alternative URLs but are marked deprecated.[5][6]
| Weights | Released with | Date | Notable changes |
|---|---|---|---|
| Multimer v1 | AlphaFold v2.1.0 | January 2022 | Initial release accompanying the v1 preprint[4] |
| Multimer v2 | AlphaFold v2.2.0 | 10 March 2022 | Fewer clashes, multiple-seed support[8] |
| Multimer v3 | AlphaFold v2.3.0 | 13 December 2022 | New 2021-09-30 training cutoff, crops 640, up to 20 chains, MSA up to 2,048[5] |
ColabFold's "AlphaFold2 multimer" mode
ColabFold, by Mirdita, Ovchinnikov, Steinegger and colleagues (published in Nature Methods in May 2022), made AlphaFold and AlphaFold-Multimer broadly accessible by replacing the slow Jackhmmer-based MSA pipeline with an MMseqs2 search against ColabFoldDB. ColabFold exposes two complex-prediction modes: an "AlphaFold2_ptm with residue index jump" mode for ad-hoc multimer prediction, and an "AlphaFold2-multimer" mode that loads the DeepMind multimer weights and uses a species-paired MSA assembled from the MMseqs2 results.[9] ColabFold's authors report that this combination matches AlphaFold-Multimer's accuracy on the ClusPro dataset, with a homology search that is 40 to 60 times faster than the original pipeline.[9]
Third-party implementations and forks
- OpenFold reimplements AlphaFold 2 and AlphaFold-Multimer in PyTorch and provides retrainable multimer checkpoints under permissive licensing.[6]
- Numerous research groups distributed sampling-extended variants of AlphaFold-Multimer (such as AFsample, with massive sampling, and AFProfile, which back-propagates through the MSA profile to denoise it); AFProfile, for example, lifted the success rate on 487 difficult complexes from 13 percent to 33 percent (MMscore above 0.75).[13]
Successor models
- AlphaFold 3 (2024): the diffusion-based AlphaFold 3 from DeepMind and Isomorphic Labs predicts joint structures of complexes including proteins, nucleic acids, small molecules, ions and modified residues, and reports substantially higher antibody-antigen accuracy than AlphaFold-Multimer v2.3 alongside large gains on protein-ligand and protein-nucleic-acid interactions.[2]
- Boltz and Chai-1 are open-source biomolecular structure-prediction systems released in 2024 that take design inspiration from AlphaFold 3 rather than AlphaFold-Multimer, but they are routinely benchmarked against AlphaFold-Multimer v2.3 as the prior state of the art for protein-only complexes.[2]
Applications
AlphaFold-Multimer rapidly became standard infrastructure for problems where pairwise or higher-order protein-protein interactions matter. Reported uses include:
- Mapping protein-protein interaction networks. Researchers have used AlphaFold-Multimer with high-throughput pairing to enumerate plausible interactions across an organism's proteome, ranking candidates by ipTM and PAE at the interface.[10]
- Structural characterisation of intrinsically disordered regions at interfaces, where AlphaFold-Multimer has been shown to recover known binding modes despite the absence of an experimental template.[14]
- Antibody and nanobody modelling, with the well-documented caveat that antibody-antigen complexes remain among the hardest targets for the system.[15]
- AI drug discovery pipelines, where protein-protein interface models are used to prioritise targets and to provide starting points for further refinement and free-energy calculations.[2]
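For the interaction-screening use case in the first item above, candidates are typically ordered by interface confidence. The sketch below is one possible way to do that: the dictionary keys (`name`, `iptm`, `pae`, `chain_id`) and the tie-breaking rule are hypothetical and do not reflect any particular output file format.

```python
def mean_interface_pae(pae, chain_id):
    """Average PAE over residue pairs that sit on different chains.

    pae: square matrix (list of lists) of predicted aligned errors.
    chain_id: chain label per residue, in the same order as the matrix.
    """
    n = len(pae)
    values = [pae[i][j] for i in range(n) for j in range(n)
              if chain_id[i] != chain_id[j]]
    return sum(values) / len(values)


def rank_candidates(candidates):
    """Sort candidate pairs by ipTM (descending), then interface PAE (ascending).

    candidates: list of dicts with keys "name", "iptm", "pae", "chain_id".
    Returns a list of (name, iptm, mean_interface_pae) tuples.
    """
    scored = [
        (c["name"], c["iptm"], mean_interface_pae(c["pae"], c["chain_id"]))
        for c in candidates
    ]
    return sorted(scored, key=lambda item: (-item[1], item[2]))
```

Given the calibration caveats reported for ipTM, a ranking like this is better treated as a way to prioritise candidates for follow-up than as evidence of interaction.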
Limitations
Several systematic weaknesses have been reported in independent benchmarks:
- Antibody-antigen interfaces. External evaluations on recent antibody-antigen structures report success rates around 11 percent for early AlphaFold-Multimer versions and roughly 20 to 30 percent for v2.3, rising to about 50 percent with aggressive sampling. The probable cause is the lack of useful inter-chain coevolution signal in the paired MSA, because antibody V regions arise through somatic hypermutation rather than vertical descent.[15]
- Conformational flexibility. AlphaFold-Multimer returns single static conformations; CASP15-CAPRI noted that flexible or multi-state complexes remained difficult.[10]
- Stoichiometry must be supplied. AlphaFold-Multimer requires the user to specify which sequences and how many copies of each are present; the model does not infer stoichiometry from sequence alone.[1]
- Confidence calibration is imperfect. Several follow-up studies, including AFProfile, identified "hidden failures" in which ipTM is high but the structure is wrong, urging caution when using ipTM alone as a screening filter.[13]
- Hallucinated interfaces. Particularly on hard cases lacking coevolution signal, the model can confidently predict spurious contacts, motivating downstream integration with biophysical or experimental validation.[15]
Related work
- AlphaFold: the underlying monomer system whose architecture AlphaFold-Multimer extends.
- AlphaFold 3: the diffusion-based 2024 successor that supersedes AlphaFold-Multimer on most reported metrics.[2]
- RoseTTAFold: contemporaneous three-track architecture with its own complex-prediction variant (RoseTTAFold2 and RoseTTAFold All-Atom).
- ESMFold: language-model-based, MSA-free monomer predictor; less competitive for complexes because it does not exploit paired-MSA signal.
- OpenFold: open PyTorch reimplementation of AlphaFold and AlphaFold-Multimer.
- ColabFold: MMseqs2-accelerated front-end that bundles the multimer weights.
- Boltz and Chai-1: 2024 open implementations of AlphaFold-3-style biomolecular structure prediction, benchmarked against AlphaFold-Multimer.
- RFdiffusion: diffusion model for protein design that is often paired with AlphaFold-Multimer for in silico validation of designed binders.
See also
- AlphaFold
- AlphaFold 3
- RoseTTAFold
- ESMFold
- OpenFold
- ColabFold
- Boltz
- Chai-1
- RFdiffusion
- Protein folding
- Demis Hassabis
- Google DeepMind
- Isomorphic Labs
- AI Drug Discovery
References
1. Richard Evans, Michael O'Neill, Alexander Pritzel, Natasha Antropova, Andrew Senior, Tim Green, Augustin Zidek, Russ Bates, Sam Blackwell, Jason Yim, Olaf Ronneberger, Sebastian Bodenstein, Michal Zielinski, Alex Bridgland, Anna Potapenko, Andrew Cowie, Kathryn Tunyasuvunakool, Rishub Jain, Ellen Clancy, Pushmeet Kohli, John Jumper, Demis Hassabis, "Protein complex prediction with AlphaFold-Multimer", bioRxiv (preprint), 2021-10-04. https://www.biorxiv.org/content/10.1101/2021.10.04.463034v1. Accessed 2026-05-21.
2. Josh Abramson et al., "Accurate structure prediction of biomolecular interactions with AlphaFold 3", Nature, 2024-05-08. https://www.nature.com/articles/s41586-024-07487-w. Accessed 2026-05-21.
3. Richard Evans et al., "Protein complex prediction with AlphaFold-Multimer (v2)", bioRxiv (updated preprint), 2022-03-10. https://www.biorxiv.org/content/10.1101/2021.10.04.463034v2. Accessed 2026-05-21.
4. Google DeepMind, "AlphaFold v2.1.0 release notes", GitHub, 2022-01-19. https://github.com/google-deepmind/alphafold/releases/tag/v2.1.0. Accessed 2026-05-21.
5. Google DeepMind, "Technical note: AlphaFold v2.3.0", GitHub (docs/technical_note_v2.3.0.md), 2022-12-13. https://github.com/google-deepmind/alphafold/blob/main/docs/technical_note_v2.3.0.md. Accessed 2026-05-21.
6. Google DeepMind, "AlphaFold (open source code for AlphaFold 2 and AlphaFold-Multimer)", GitHub repository README. https://github.com/google-deepmind/alphafold. Accessed 2026-05-21.
7. John Jumper et al., "Highly accurate protein structure prediction with AlphaFold", Nature 596, 583-589, 2021-07-15. https://www.nature.com/articles/s41586-021-03819-2. Accessed 2026-05-21.
8. Google DeepMind, "Releases - google-deepmind/alphafold (v2.2.0, v2.3.0, v2.3.2)", GitHub. https://github.com/google-deepmind/alphafold/releases. Accessed 2026-05-21.
9. Milot Mirdita, Konstantin Schutze, Yoshitaka Moriwaki, Lim Heo, Sergey Ovchinnikov, Martin Steinegger, "ColabFold: making protein folding accessible to all", Nature Methods 19, 679-682, 2022-05-30. https://www.nature.com/articles/s41592-022-01488-1. Accessed 2026-05-21.
10. Marc F. Lensink et al., "Impact of AlphaFold on structure prediction of protein complexes: The CASP15-CAPRI experiment", Proteins: Structure, Function, and Bioinformatics, 2023-12. https://pmc.ncbi.nlm.nih.gov/articles/PMC10841881/. Accessed 2026-05-21.
11. "MSA Processing and Pairing in AlphaFold-Multimer", DeepWiki entry for sokrypton/alphafold. https://deepwiki.com/sokrypton/alphafold/5.2-msa-processing-and-pairing. Accessed 2026-05-21.
12. European Bioinformatics Institute, "Confidence scores in AlphaFold-Multimer", EBI Training (AlphaFold course). https://www.ebi.ac.uk/training/online/courses/alphafold/inputs-and-outputs/evaluating-alphafolds-predicted-structures-using-confidence-scores/confidence-scores-in-alphafold-multimer/. Accessed 2026-05-21.
13. Patrick Bryant, Frank Noe, "Improved protein complex prediction with AlphaFold-multimer by denoising the MSA profile", PLoS Computational Biology, 2024-07-25. https://pmc.ncbi.nlm.nih.gov/articles/PMC11302914/. Accessed 2026-05-21.
14. "AlphaFold-Multimer accurately captures interactions and dynamics of intrinsically disordered protein regions", PNAS, 2024. https://pmc.ncbi.nlm.nih.gov/articles/PMC11536093/. Accessed 2026-05-21.
15. Rui Yin et al., "Evaluation of AlphaFold antibody-antigen modeling with implications for improving predictive accuracy", Protein Science, 2024. https://pmc.ncbi.nlm.nih.gov/articles/PMC10751731/. Accessed 2026-05-21.