AlphaFold-Multimer
Last reviewed
May 21, 2026
Sources
No citations yet
Review status
Needs citations
Revision
v1 · 2,671 words
AlphaFold-Multimer is a deep learning system for predicting the three-dimensional structures of protein complexes, released by Google DeepMind in October 2021 as an extension of AlphaFold 2. It modifies the AlphaFold 2 architecture and training procedure to handle multi-chain inputs of known stoichiometry, introducing chain-aware feature construction, paired multiple-sequence alignments (MSAs), and a chain-permutation-invariant loss. On a curated heterodimer benchmark, AlphaFold-Multimer raised the rate of acceptable interface predictions (DockQ greater than 0.23) from roughly 23 percent for AlphaFold 2 with a residue-index-jump trick to about 67 percent.[^1] The system became the de facto open-source tool for protein-protein complex prediction between late 2021 and the release of AlphaFold 3 in 2024, which superseded it on most benchmarks.[^2]
| Attribute | Value |
|---|---|
| Developer | Google DeepMind[^1] |
| First preprint | bioRxiv 2021-10-04, doi:10.1101/2021.10.04.463034[^1] |
| Updated preprint | 2022-03-10 (v2)[^3] |
| First public weights | AlphaFold v2.1.0, January 2022[^4] |
| Current weights series | Multimer v3 (in AlphaFold v2.3.0, December 2022)[^5] |
| Code repository | github.com/google-deepmind/alphafold[^6] |
| Code license | Apache 2.0[^6] |
| Parameter license | CC BY 4.0[^6] |
| Superseded by | AlphaFold 3 (Abramson et al., Nature 2024)[^2] |
AlphaFold 2, presented at the 14th Critical Assessment of Structure Prediction (CASP14) in late 2020 and published in Nature in July 2021, set a new standard for single-chain protein structure prediction. The system used a novel Evoformer block operating jointly on an MSA representation and a pair representation, followed by an SE(3)-equivariant structure module that produced full atomic coordinates.[^7] Although AlphaFold 2 was trained on single-chain crops, several groups noted that concatenating two chains with a residue-index gap (the "linker hack" or "chain-merged" trick) sometimes recovered reasonable complex models, prompting a series of community adaptations.[^1]
DeepMind formalised multi-chain prediction with the preprint "Protein complex prediction with AlphaFold-Multimer," posted to bioRxiv on 4 October 2021 (doi:10.1101/2021.10.04.463034) by Richard Evans, Michael O'Neill, Alexander Pritzel, Natasha Antropova, Andrew Senior, Tim Green, Augustin Zidek, Russ Bates, Sam Blackwell, Jason Yim, Olaf Ronneberger, Sebastian Bodenstein, Michal Zielinski, Alex Bridgland, Anna Potapenko, Andrew Cowie, Kathryn Tunyasuvunakool, Rishub Jain, Ellen Clancy, Pushmeet Kohli, John Jumper and Demis Hassabis.[^1] An updated version (v2) was posted on 10 March 2022 alongside the v2.2 weights release.[^3]
Code and the first set of multimer weights shipped in the google-deepmind/alphafold repository with the v2.1.0 release in January 2022.[^4][^6] DeepMind subsequently published two weight refreshes:
num_recycle=0.[^8]
A patch release v2.3.2 in April 2024 added quality-of-life flags but did not change the multimer weights.[^8]
Through 2022 and 2023 AlphaFold-Multimer was integrated into community tooling, notably ColabFold, which packaged the multimer weights with MMseqs2-based MSAs.[^9] It also dominated the CASP15-CAPRI Round 54 assembly challenge in 2022: almost every top group built on the multimer weights, and high-quality models were produced for about 40 percent of targets, versus 8 percent two years earlier.[^10] In May 2024 DeepMind and Isomorphic Labs published AlphaFold 3 in Nature, a diffusion-based successor that predicts joint structures of proteins, nucleic acids, ligands, ions and modified residues. AlphaFold 3 improves on AlphaFold-Multimer v2.3 in several areas, including antibody-antigen interfaces.[^2]
AlphaFold-Multimer keeps the high-level AlphaFold 2 stack (input embedder, Evoformer trunk, structure module, recycling) and adapts it to handle a fixed number of chains with known stoichiometry. The key technical changes documented in the preprint are concentrated in three areas: feature construction, the structure module loss, and training data sampling.[^1]
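The input to AlphaFold 2 is dominated by an MSA representation and a pair representation indexed by residue. For a complex, AlphaFold-Multimer constructs each chain's MSA independently using Jackhmmer against UniRef90 and MGnify and HHBlits against BFD and Uniclust30, and additionally runs a Jackhmmer search against UniProt to obtain a per-chain alignment whose rows carry organism identifiers.[^1] These per-chain alignments are then assembled into the multimer MSA in two parts: a paired block, in which rows from different chains are matched by organism so that each row holds sequences from the same species, and an unpaired block, in which each chain's remaining sequences occupy that chain's columns while the other chains' columns are filled with gaps.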
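The assembly of per-chain alignments into one multimer MSA can be illustrated with a small sketch. It is not the DeepMind pipeline: the function name `assemble_multimer_msa`, the toy sequences, and the rule of pairing hits rank by rank within an organism are assumptions made for illustration, and the input hits are assumed to be sorted best first.

```python
from collections import defaultdict

GAP = "-"

def assemble_multimer_msa(chain_msas):
    """Combine per-chain MSAs into paired and unpaired multimer rows.

    chain_msas holds one list per chain of (organism_id, aligned_sequence)
    tuples, assumed to be sorted best hit first (an assumption of this sketch).
    """
    lengths = [len(msa[0][1]) for msa in chain_msas]

    # Group every chain's hits by organism, keeping their input order.
    by_organism = []
    for msa in chain_msas:
        groups = defaultdict(list)
        for organism, seq in msa:
            groups[organism].append(seq)
        by_organism.append(groups)

    # Only organisms found for every chain can contribute paired rows.
    shared = set.intersection(*(set(groups) for groups in by_organism))

    paired = []
    for organism in list(by_organism[0]):
        if organism not in shared:
            continue
        depth = min(len(groups[organism]) for groups in by_organism)
        for rank in range(depth):
            paired.append("".join(groups[organism][rank] for groups in by_organism))
        # Drop the hits consumed by pairing so they are not reused below.
        for groups in by_organism:
            groups[organism] = groups[organism][depth:]

    # Leftover hits fill their own chain's columns; other chains get gaps.
    unpaired = []
    for i, groups in enumerate(by_organism):
        left = GAP * sum(lengths[:i])
        right = GAP * sum(lengths[i + 1:])
        for seqs in groups.values():
            unpaired.extend(left + seq + right for seq in seqs)
    return paired, unpaired

# Hypothetical toy alignments for a two-chain complex.
chain_a = [("human", "MKS"), ("mouse", "MRT"), ("yeast", "MQT")]
chain_b = [("human", "GGAL"), ("mouse", "GSAV"), ("fly", "GGTV")]
paired_rows, unpaired_rows = assemble_multimer_msa([chain_a, chain_b])
```

In this toy case the human and mouse hits form paired rows, while the yeast and fly hits end up in the gap-padded unpaired block.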
The pair representation is initialised from a residue index that is reset at the start of each chain, replacing the linker-hack convention with a proper "chain break" feature. Templates and atom positions are likewise concatenated chain by chain with chain identifiers in the input features.[^1][^11]
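A minimal sketch of per-residue chain features follows, assuming that identical sequences are treated as copies of one entity. The function `chain_features` and the feature names `asym_id`, `entity_id` and `sym_id` are used here for illustration and should be checked against the repository before being relied on.

```python
def chain_features(chain_sequences):
    """Build per-residue index features for a list of chain sequences."""
    entity_of = {}    # sequence -> entity number shared by identical chains
    copies_seen = {}  # entity number -> copies emitted so far
    features = {"residue_index": [], "asym_id": [], "entity_id": [], "sym_id": []}

    for chain_number, seq in enumerate(chain_sequences, start=1):
        entity = entity_of.setdefault(seq, len(entity_of) + 1)
        copies_seen[entity] = copies_seen.get(entity, 0) + 1
        for position in range(len(seq)):
            features["residue_index"].append(position)      # restarts per chain
            features["asym_id"].append(chain_number)        # unique per chain
            features["entity_id"].append(entity)            # unique per sequence
            features["sym_id"].append(copies_seen[entity])  # copy number
    return features

# Hypothetical A2B complex: two copies of "MKT" and one copy of "GG".
feats = chain_features(["MKT", "GG", "MKT"])
```

Because the residue index restarts at every chain, the chain identifiers are what tell the network that residue 0 of one chain and residue 0 of another are different positions.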
For homomers (and, more generally, any complex containing multiple copies of the same sequence) the assignment of predicted coordinates to ground-truth chains is not unique: the ground-truth structure is invariant under any permutation of identical chains. Naively penalising coordinate error against a fixed labelling would therefore punish the model for predicting the correct structure with its identical chains listed in a different order.
AlphaFold-Multimer addresses this with a chain-permutation-invariant structure loss. During training, for every loss term that depends on chain identity (FAPE, distogram, etc.), the code enumerates valid permutations of identical chains, computes the loss under each, and takes the minimum.[^1] This symmetrisation over symmetry-equivalent labellings prevents gradient signal from being dominated by an arbitrary chain ordering and is, according to the authors, important for stable homomer training.[^1]
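The minimum-over-permutations idea can be shown with a brute-force sketch. It makes several simplifying assumptions: the loss is a per-chain mean squared deviation in a shared frame rather than FAPE, the helper names are hypothetical, and exhaustive enumeration grows factorially with the number of identical chains, so it only suits small examples.

```python
from itertools import permutations, product

def mean_sq_dev(pred, truth):
    """Mean squared deviation between two equally sized coordinate lists."""
    total = sum((p - t) ** 2 for a, b in zip(pred, truth) for p, t in zip(a, b))
    return total / len(pred)

def valid_assignments(entity_ids):
    """Yield chain relabellings that only swap chains of the same entity."""
    groups = {}
    for index, entity in enumerate(entity_ids):
        groups.setdefault(entity, []).append(index)
    members = list(groups.values())
    for combo in product(*(permutations(group) for group in members)):
        assignment = list(range(len(entity_ids)))
        for group, perm in zip(members, combo):
            for source, target in zip(group, perm):
                assignment[source] = target
        yield tuple(assignment)

def permutation_invariant_loss(pred_chains, true_chains, entity_ids, loss_fn):
    """Smallest total loss over all symmetry-equivalent chain labellings."""
    return min(
        sum(loss_fn(pred_chains[i], true_chains[j]) for i, j in enumerate(assignment))
        for assignment in valid_assignments(entity_ids)
    )

# Hypothetical A2B complex with one atom per chain; the prediction is
# correct, but the two identical chains come out in swapped order.
pred = [[(0.0, 0.0, 0.0)], [(10.0, 0.0, 0.0)], [(5.0, 5.0, 0.0)]]
truth = [[(10.0, 0.0, 0.0)], [(0.0, 0.0, 0.0)], [(5.0, 5.0, 0.0)]]
entities = [1, 1, 2]

fixed_loss = sum(mean_sq_dev(p, t) for p, t in zip(pred, truth))
invariant_loss = permutation_invariant_loss(pred, truth, entities, mean_sq_dev)
```

With a fixed labelling the swapped chains are penalised heavily, whereas the permutation-invariant loss is zero because one of the two valid relabellings matches the ground truth exactly.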
The model is trained on multi-chain assemblies from the Protein Data Bank with a resolution cutoff at 9 angstrom for inputs and 3 angstrom for the training labels, clustered by sequence identity to avoid leakage. Self-distillation uses AlphaFold-predicted single-chain structures from the AlphaFold DB as additional supervision for monomers.[^1] At training time the system samples cropped sub-assemblies (the "interface crop") to ensure that interface contacts are well represented in batches, and it uses recycling exactly as in AlphaFold 2.[^1]
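One way to bias crops toward interfaces is sketched below. It is an illustrative assumption rather than the published procedure: the distance cutoff, the nearest-neighbour selection rule, and the function names are all hypothetical, and each residue is represented by a single point.

```python
import math
import random

def interface_residues(coords, chain_ids, cutoff):
    """Indices of residues lying within `cutoff` of a residue on another chain."""
    hits = []
    for i, (xi, ci) in enumerate(zip(coords, chain_ids)):
        if any(cj != ci and math.dist(xi, xj) < cutoff
               for xj, cj in zip(coords, chain_ids)):
            hits.append(i)
    return hits

def spatial_crop(coords, chain_ids, crop_size, rng, cutoff=8.0):
    """Pick a random interface residue and keep its nearest neighbours."""
    candidates = interface_residues(coords, chain_ids, cutoff)
    if not candidates:  # no interface found: fall back to any residue
        candidates = list(range(len(coords)))
    centre = rng.choice(candidates)
    by_distance = sorted(range(len(coords)),
                         key=lambda i: math.dist(coords[centre], coords[i]))
    return sorted(by_distance[:crop_size])

# Hypothetical two-chain toy complex, one point per residue.
coords = [(0, 0, 0), (4, 0, 0), (8, 0, 0), (12, 0, 0), (16, 0, 0),
          (16, 6, 0), (20, 6, 0), (24, 6, 0), (28, 6, 0)]
chain_ids = ["A"] * 5 + ["B"] * 4

interface = interface_residues(coords, chain_ids, cutoff=8.0)
crop = spatial_crop(coords, chain_ids, crop_size=4, rng=random.Random(0))
```

Centring the crop on an interface residue means the selected window tends to contain residues from more than one chain, which is the property the training sampler is described as needing.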
AlphaFold-Multimer keeps the pLDDT and PAE outputs of AlphaFold 2 and adds an interface predicted TM-score (ipTM) that summarises confidence specifically in the relative positions of the chains forming the interface. The model ranking score reported in the codebase is a weighted combination, model confidence = 0.8 × ipTM + 0.2 × pTM, which biases ranking toward interface accuracy. The European Bioinformatics Institute notes that ipTM values above 0.8 typically indicate confident, high-quality predictions, that values below 0.6 usually correspond to failed predictions, and that the 0.6 to 0.8 range is a grey zone.[^12]
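The ranking formula and the ipTM bands quoted above can be written out directly. The function names, band labels, and example scores below are hypothetical; only the 0.8 and 0.2 weights and the 0.6 and 0.8 thresholds come from the text.

```python
def ranking_confidence(iptm, ptm):
    """Weighted score used to rank multimer models: 0.8 * ipTM + 0.2 * pTM."""
    return 0.8 * iptm + 0.2 * ptm

def iptm_band(iptm):
    """Rough interpretation bands for ipTM as summarised in the text."""
    if iptm > 0.8:
        return "confident"
    if iptm < 0.6:
        return "likely failed"
    return "grey zone"

def rank_models(scores):
    """Sort model names by ranking confidence, best first."""
    return sorted(scores, key=lambda name: ranking_confidence(*scores[name]),
                  reverse=True)

# Hypothetical (ipTM, pTM) pairs for three predictions of one complex.
scores = {
    "model_1": (0.85, 0.80),
    "model_2": (0.55, 0.90),
    "model_3": (0.70, 0.75),
}
ranking = rank_models(scores)
bands = {name: iptm_band(iptm) for name, (iptm, _) in scores.items()}
```

Note that `model_2` has the highest pTM but ranks last, because the weighting favours interface confidence over overall fold confidence.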
The preprint reports a curated heterodimer benchmark of 4,433 PDB heterodimers split with a 30 April 2018 training cutoff and resolution and clustering filters. Performance is measured with DockQ, where DockQ above 0.23 counts as "acceptable" (CAPRI definition), above 0.49 as medium and above 0.8 as high.[^1] On this benchmark AlphaFold-Multimer reaches DockQ above 0.23 on about 67 percent of pairs and DockQ above 0.8 on about 23 percent, versus roughly 23 percent and 12 percent for AlphaFold 2 used with the residue-index-jump trick on the same inputs.[^1] The v2 update of March 2022 reports the corresponding success rates rising to about 70 percent at DockQ above 0.23 and 26 percent at DockQ above 0.8.[^3]
| Setting | DockQ greater than 0.23 | DockQ greater than 0.8 | Source |
|---|---|---|---|
| AlphaFold 2 (chain-merged residue jump) | ~23% | ~12% | Evans et al., v1[^1] |
| AlphaFold-Multimer v1 (Oct 2021) | ~67% | ~23% | Evans et al., v1[^1] |
| AlphaFold-Multimer v2 (Mar 2022) | ~70% | ~26% | Evans et al., v2[^3] |
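The success rates in the table are fractions of benchmark pairs whose DockQ exceeds a threshold. A short sketch of that bookkeeping follows, using the thresholds quoted in the text; the function names and the example scores are hypothetical and are not taken from the benchmark.

```python
def dockq_class(score):
    """Map a DockQ score to the quality classes described in the text."""
    if score > 0.8:
        return "high"
    if score > 0.49:
        return "medium"
    if score > 0.23:
        return "acceptable"
    return "incorrect"

def success_rate(scores, threshold):
    """Fraction of predictions whose DockQ exceeds `threshold`."""
    return sum(score > threshold for score in scores) / len(scores)

# Hypothetical DockQ scores for five predicted heterodimers.
example_scores = [0.05, 0.30, 0.55, 0.85, 0.10]

classes = [dockq_class(score) for score in example_scores]
acceptable_rate = success_rate(example_scores, 0.23)
high_rate = success_rate(example_scores, 0.8)
```

The rates are cumulative: a prediction counted as high quality also counts toward the rate at the 0.23 threshold.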
For homomers, the preprint reports that AlphaFold-Multimer also significantly improves on single-chain AlphaFold combined with a flexible linker, with success measured by a TM-score over the full assembly. The chain-permutation-invariant loss is described as essential for stable training on these symmetric targets.[^1]
The v1 paper retroactively evaluated AlphaFold-Multimer on the CASP14 multimer category and reported a clear lead over the next-best CASP14 group, although no DeepMind entry had been formally submitted in that category at the time of the competition.[^1] In 2022 the system was the de facto baseline for CASP15-CAPRI Round 54, the fifth joint CASP-CAPRI protein assembly challenge: of 37 targets (14 homodimers, 3 homo-trimers, 13 heterodimers including 3 antibody-antigen complexes, and 7 large assemblies), high-quality models were produced for about 40 percent of targets, with top groups extending AlphaFold-Multimer via expanded sampling, MSA enrichment or external scoring.[^10] Antibody-antigen and conformationally flexible targets remained the hardest categories.[^10]
The google-deepmind/alphafold repository ships five multimer model checkpoints per generation. The current default is Multimer v3, which is what scripts/download_all_data.sh retrieves; earlier v1 and v2 weights remain available through alternative URLs but are marked deprecated.[^5][^6]
| Weights | Released with | Date | Notable changes |
|---|---|---|---|
| Multimer v1 | AlphaFold v2.1.0 | January 2022 | Initial release accompanying the v1 preprint[^4] |
| Multimer v2 | AlphaFold v2.2.0 | 10 March 2022 | Fewer clashes, multiple-seed support[^8] |
| Multimer v3 | AlphaFold v2.3.0 | 13 December 2022 | New 2021-09-30 training cutoff, crops 640, up to 20 chains, MSA up to 2,048[^5] |
ColabFold, by Mirdita, Ovchinnikov, Steinegger and colleagues (published in Nature Methods in May 2022), made AlphaFold and AlphaFold-Multimer broadly accessible by replacing the slow Jackhmmer-based MSA pipeline with an MMseqs2 search against ColabFoldDB. ColabFold exposes two complex-prediction modes: an "AlphaFold2_ptm with residue index jump" mode for ad-hoc multimer prediction, and an "AlphaFold2-multimer" mode that loads the DeepMind multimer weights and uses a species-paired MSA assembled from the MMseqs2 results.[^9] ColabFold's authors report that this combination matches AlphaFold-Multimer's accuracy on the ClusPro dataset while running orders of magnitude faster end-to-end.[^9]
AlphaFold-Multimer rapidly became standard infrastructure for problems where pairwise or higher-order protein-protein interactions matter. Reported uses include:
Several systematic weaknesses have been reported in independent benchmarks: