Chai-2
Last reviewed: May 31, 2026
Sources: 9 citations
Review status: Source-backed
Revision: v4 · 2,372 words
Chai-2 is a generative artificial intelligence model from Chai Discovery for designing antibodies and small protein binders from scratch. Chai Discovery announced it on June 30, 2025, and posted a technical report titled "Zero-shot antibody design in a 24-well plate" to bioRxiv a few days later, on July 6, 2025. It is a successor to the company's earlier Chai-1 model. Where Chai-1 predicts the three-dimensional shape of molecules that already exist, Chai-2 tries to invent new molecules that bind to a chosen target. Chai Discovery reported that the model produced working binders for a wide range of targets at experimental success rates far above earlier computational methods, and that the designs were checked in a physical laboratory rather than only in simulation [1][2].
The model sits at the meeting point of two fields that grew up separately. One is computational protein structure prediction, the area that AlphaFold made famous. The other is drug discovery, specifically the slow and expensive work of finding an antibody or other molecule that grabs onto a disease-related protein. Chai-2 is an attempt to turn the first into a practical tool for the second [1].
The shift from Chai-1 to Chai-2 is a shift in the question being asked. Structure prediction takes a known sequence of amino acids and works out how the chain folds in space. It answers "what does this molecule look like." Design runs in the other direction. It starts from a target, a protein you want to bind, and tries to produce a brand new molecule whose shape and chemistry let it stick to a specific spot on that target. It answers "what molecule should I make" [1].
That reversal is harder than it sounds. The number of possible protein sequences is astronomically large, and almost none of them fold into anything useful or bind to anything in particular. Earlier folding models gave researchers a way to check a candidate after the fact, but they did not generate good candidates on their own. Chai-2 is built to generate them directly. Chai Discovery describes it as a multimodal generative model, meaning it works with sequence and three-dimensional structure together rather than treating them as separate problems [1][2].
The term that comes up repeatedly in the company's description is zero-shot. Chai-2 is meant to design a binder for a new target without having seen any existing binder for that target and without target-specific experimental data to learn from. The model draws on what it learned about protein folding and molecular interaction in general, then applies that knowledge to a target it was not specifically trained on [1].
The public technical report does not open up the full architecture, but it lays out the workflow clearly enough to follow [1].
The process starts with generation. Given a target protein and, optionally, a specific region of that protein to aim at, Chai-2 proposes new binder designs as full molecules, sequence and atomic structure at once. For antibodies this includes the variable regions that do the actual binding. The company reports that the model can design two common antibody formats, single-domain antibodies known as VHHs or nanobodies, and scFv fragments, as well as small de novo proteins that are not antibodies at all [1].
The second stage is filtering in software. The model and related structure-prediction tools fold and score each proposed design, estimating how confidently it would adopt the intended shape and bind the intended spot. This is where the lineage from Chai-1 matters, since reliable folding and scoring let the system rank a large pool of generated ideas and keep only the most promising ones [1][2].
The third stage is the wet lab. Here the approach is unusual. Rather than synthesizing thousands or millions of candidates and screening them in bulk, Chai Discovery says it tested 20 or fewer designs per target, laid out in an ordinary 24-well plate, then measured whether each one actually bound. A design counts as a hit if it shows detectable binding to the target in the assay, and the company reports running the full loop from design to lab readout in under two weeks [1][2]. Testing so few candidates per target is only worthwhile if a good fraction of them work, which is the whole claim the model is trying to support.
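The three stages can be pictured as a generate, rank, and select loop. The sketch below is illustrative only and assumes nothing about Chai-2's internals, which the report does not disclose. `generate_designs` and `score_design` are hypothetical stand-ins that return random sequences and random scores, and the pool size of 500 is an arbitrary choice. Only the two constants come from the reported workflow.

```python
import random
from dataclasses import dataclass

MAX_DESIGNS_PER_TARGET = 20  # reported: 20 or fewer designs tested per target
PLATE_WELLS = 24             # reported: a single 24-well plate
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


@dataclass
class Design:
    sequence: str
    score: float = 0.0  # placeholder in-silico confidence, higher is better


def generate_designs(target: str, n: int, rng: random.Random) -> list[Design]:
    """Hypothetical stand-in for generation: random 12-residue sequences.

    A real design model would condition on the target; this placeholder ignores it.
    """
    return [
        Design("".join(rng.choice(AMINO_ACIDS) for _ in range(12)))
        for _ in range(n)
    ]


def score_design(design: Design, rng: random.Random) -> float:
    """Hypothetical stand-in for fold-and-score: returns a random number."""
    return rng.random()


def select_for_plate(target: str, pool_size: int = 500, seed: int = 0) -> list[Design]:
    """Generate a pool, score every design, and keep the top few for the lab."""
    rng = random.Random(seed)
    pool = generate_designs(target, pool_size, rng)
    for design in pool:
        design.score = score_design(design, rng)
    ranked = sorted(pool, key=lambda d: d.score, reverse=True)
    return ranked[:MAX_DESIGNS_PER_TARGET]


plate = select_for_plate("hypothetical_target")
print(f"{len(plate)} designs selected for a {PLATE_WELLS}-well plate")
```

The point of the sketch is the shape of the funnel: a large generated pool is cut down in software so that only a plate's worth of designs ever reaches the bench.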
The headline figure from Chai Discovery is an average de novo antibody hit rate of about 16 percent, reported as 15.5 percent across a benchmark of 52 distinct targets, none of which had a known antibody or nanobody binder in the structural antibody database the team used. The company also reports finding at least one binder for 26 of those 52 targets, exactly half, in a single round of testing [1][2]. The numbers below are as the company states them.
| Reported metric | Value as stated by Chai Discovery |
|---|---|
| Average antibody hit rate | About 16 percent, reported as 15.5 percent across the benchmark |
| Hit rate by format | About 20 percent for VHH single-domain, about 14 percent for scFv |
| Targets in the benchmark | 52 distinct targets with no known binder in the database used |
| Targets with at least one binder | 26 of 52, exactly half, in one round |
| Designs tested per target | 20 or fewer |
| Lab format | Single 24-well plate, low throughput |
| Antibody formats designed | VHH single-domain and scFv |
| Improvement over prior de novo methods | More than 100-fold |
| Reported binder affinities | Into the nanomolar and picomolar range for some designs |
| Miniprotein binders | About 68 percent hit rate across 5 targets |
The 100-fold framing is worth unpacking, since it is the comparison that makes the result notable. Earlier zero-shot computational methods for designing antibodies typically landed below a 0.1 percent hit rate, so almost nothing they produced bound at all. Moving the typical success rate into the double digits is the change Chai Discovery is pointing to [1][2]. The company also reports that the model can be steered toward a chosen spot on the target rather than just binding somewhere, and that it produced hits for targets that have been hard for in silico design in the past, including TNF-alpha, a protein tied to autoimmune disease [1][3].
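The arithmetic behind the fold comparison is short enough to check directly. The sketch below divides the reported hit rates by the 0.1 percent figure quoted for earlier methods. Because earlier methods are described as landing below that figure, each ratio is a lower bound. The variable names are illustrative.

```python
# Hit rates as reported by Chai Discovery, expressed as fractions.
REPORTED_HIT_RATES = {
    "all antibodies": 0.155,
    "VHH": 0.20,
    "scFv": 0.14,
}

# Earlier zero-shot methods are described as landing below 0.1 percent,
# so this value is an upper bound on the baseline.
BASELINE_UPPER_BOUND = 0.001

fold_improvement = {
    name: rate / BASELINE_UPPER_BOUND
    for name, rate in REPORTED_HIT_RATES.items()
}

for name, fold in fold_improvement.items():
    print(f"{name}: at least {fold:.0f}-fold over the baseline")
```

Every ratio comes out above 100, which is consistent with the "more than 100-fold" wording in the table.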
Chai-2 is not limited to antibodies. The same system designs small de novo proteins, sometimes called miniprotein binders, and here the company reports an even higher hit rate, about 68 percent across 5 targets, with some binders reaching picomolar affinity [1][2]. That the approach works for both antibodies and non-antibody binders is part of why Chai Discovery describes it as a general design model rather than an antibody-only tool [1].
Finding an antibody against a new target is one of the rate-limiting steps in modern biologics. The traditional routes, immunizing animals or screening enormous libraries of random candidates, are slow, can take months, and do not always cover the exact part of a protein a drug program cares about. A method that proposes a small handful of designs and lands real binders for many targets would compress part of that timeline and let teams aim at specific sites by choice rather than by luck [1][3].
There is a second, quieter point. The number of candidates tested per target is tiny by industry standards. If a 24-well plate is enough to find a binder, the cost and infrastructure needed to start an antibody campaign drop a lot, which could matter most for targets that have been considered too hard or too small a market to justify a full screening effort [1].
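A rough calculation shows why the hit rate decides whether a single plate is enough. The sketch below assumes every design binds independently with the same probability. That is a simplification introduced here for illustration, not a claim from the report, and the function name is hypothetical.

```python
def p_at_least_one_hit(hit_rate: float, n_designs: int) -> float:
    """Chance that at least one of n independent designs binds."""
    return 1.0 - (1.0 - hit_rate) ** n_designs


N_DESIGNS = 20  # reported upper limit on designs tested per target

# 0.1 percent is the ceiling quoted for earlier zero-shot methods;
# 15.5 percent is the reported Chai-2 average.
for label, rate in [("earlier methods", 0.001), ("Chai-2 average", 0.155)]:
    p = p_at_least_one_hit(rate, N_DESIGNS)
    expected_hits = rate * N_DESIGNS
    print(f"{label}: P(at least one hit) = {p:.3f}, expected hits = {expected_hits:.2f}")
```

Under this simplified model, a plate of 20 almost never yields a binder at a 0.1 percent hit rate and almost always does at 15.5 percent. The reported figure of 26 of 52 targets is lower than the model predicts, which suggests that hit rates varied considerably from target to target rather than being uniform.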
That said, binding is the beginning of a drug, not the end of one. A molecule that sticks to its target still has to be developable, meaning it can be manufactured, stays stable, does not clump, and behaves in the body. It has to do something useful, such as blocking or activating the target, and it has to clear safety and efficacy testing. Chai Discovery's reported results are about binding measured in an assay, and the company itself frames a binder as an early step that still needs optimization and validation [1]. The honest reading is that Chai-2 attacks one expensive bottleneck rather than the whole pipeline.
Chai-2 is more closed than several of its peers. Chai-1 had been offered fairly openly, with a web interface and code available for non-commercial research. Chai-2, by contrast, is described as available to partners through Chai Discovery's platform rather than as open weights or a free public download [1][2]. This is a different posture from the academic protein-design world, where code and weights are often released, and it reflects the commercial value the company places on a working design engine [6].
Chai Discovery is a startup founded in 2024 by Joshua Meier, who was chief AI officer at the biotech firm Absci and earlier worked on protein language models at Meta AI, along with Jack Dent, formerly an engineer at Stripe, and AI researchers Matthew McPartlon and Jacques Boitreaud. The company raised a 30 million dollar seed round in 2024 at a roughly 150 million dollar valuation, with investors including OpenAI and Thrive Capital, and went on to raise larger rounds in 2025 [4][5].
Chai-2 belongs to a small group of systems that use learned models of protein structure to design new molecules, and it helps to place it against the better-known names.
| System | Group | Main focus | Notes |
|---|---|---|---|
| Chai-2 | Chai Discovery | De novo antibodies and miniprotein binders | Generative design plus folding and screening, controlled access |
| Chai-1 | Chai Discovery | Structure prediction | The folding model Chai-2 builds on |
| AlphaProteo | Google DeepMind | De novo protein binders | Reported binder hit rates of roughly 9 to 88 percent across targets, focused on miniproteins rather than antibodies |
| RFdiffusion | Baker lab, University of Washington | De novo protein design via diffusion | Widely used and openly released, often paired with separate sequence design |
| AlphaFold | Google DeepMind | Structure prediction | Set the modern baseline for folding accuracy |
The closest comparison on capability is AlphaProteo from DeepMind, which also designs protein binders and reported wet-lab hit rates ranging from about 9 percent to 88 percent depending on the target [7]. The important distinction is that AlphaProteo concentrated on miniprotein binders, while Chai-2's central claim is about antibodies, which are generally treated as the harder design problem because of their larger and more flexible binding loops. On the methods side, RFdiffusion from the Baker lab is the best-known open approach to generating protein structures, and it differs both in technique, since it uses a diffusion process, and in spirit, since its code and weights are public [8]. Across all of these, the common ancestor is the wave of structure-prediction work that AlphaFold and tools like it set off, which gave the field accurate enough folding to make generative design practical [9].
It is fair to call Chai-2 part of the broader move toward foundation-model biology, where a single large model trained on protein data is adapted to many tasks. What separates it from a pure folding model is the design framing.
Several cautions apply, and most of them come down to the gap between a number in a report and a molecule in the clinic.
First, the reported figures come from the company that built the model and had not, as of mid-2025, been reproduced widely by independent groups. A roughly 16 percent hit rate across 52 targets is a strong claim, and the natural next step is external replication on targets the model has not seen. Outside readers have also asked how novel the chosen targets really were and whether the selected epitopes leaned toward easier sites, questions that open data and external testing would settle [1][3].
Second, a hit means detectable binding, not a finished drug. The reported affinities span a range, and many early designs would likely need affinity maturation and engineering before they are useful. Developability and function are separate hurdles that binding alone does not clear, a point the company makes itself [1].
Third, the closed access model limits scrutiny. When weights and code are not public, outside researchers cannot probe failure modes, test edge cases, or confirm that the headline numbers hold on their own targets. This is a reasonable commercial choice, but it puts more weight on the company's own reporting [1].
The steady thread through all of this is that experimental validation does the real work. A generative model can propose endless plausible-looking molecules, and the only thing that settles whether a design binds is making it and measuring it. Chai-2's central argument is not just that it generates designs but that a large share of the few it tested actually held up in the lab [1]. Whether that result generalizes to new targets, new groups, and the long road from binder to therapy is the question the field will keep asking.