EvolutionaryScale
Last reviewed
May 16, 2026
Sources
15 citations
Review status
Source-backed
Revision
v1 · 3,807 words
EvolutionaryScale was an American artificial intelligence company that built frontier generative models for biology, most notably the ESM3 protein language model. Founded in mid-2023 by a group of researchers who had worked together at the Fundamental AI Research lab at Meta AI, the company was organized as a public benefit corporation and emerged from stealth in June 2024 with one of the largest seed financings in the history of AI biotechnology. EvolutionaryScale shipped the ESM3 family, the smaller ESM Cambrian family, and a developer platform called Forge that exposed the models through an API and cloud microservices. In November 2025 the team and its technology were absorbed into the Chan Zuckerberg Biohub, ending EvolutionaryScale's roughly two and a half years as an independent startup while keeping the research program intact.
The company was widely seen as the closest commercial peer to Google DeepMind and Isomorphic Labs in the small group of frontier labs applying very large neural networks to biology. Where DeepMind's AlphaFold 3 pursued accurate structure and interaction prediction, EvolutionaryScale concentrated on generative protein design and on building a scalable protein language model family that biologists could prompt the same way software engineers prompt a code assistant. ESM3 was published in the journal Science in January 2025 and trained with more compute than any prior biological model, which gave EvolutionaryScale outsized influence on the broader field of AI drug discovery despite its short lifespan.
The scientific lineage of EvolutionaryScale runs back to the Evolutionary Scale Modeling, or ESM, project that Alexander Rives started inside Meta's Fundamental AI Research lab in 2019. Rives had completed his PhD at New York University, where his thesis work argued that transformer language models pretrained on protein sequences could implicitly learn the rules of folding, function, and evolution. After joining FAIR he led a small team that turned that thesis into a published research program. ESM1, released in 2019, was the first large transformer trained on tens of millions of protein sequences. ESM2 followed in 2022 with up to 15 billion parameters, and the team paired it with ESMFold, an end-to-end protein structure predictor that produced an atlas of more than 600 million predicted structures from metagenomic data.
In August 2023 Meta restructured its research portfolio as part of a broader efficiency push under Mark Zuckerberg, and the AI protein team was wound down. Roughly a dozen researchers, most of them members of the original ESM group, decided to keep the project alive outside Meta. Rives took the role of chief executive and chief scientist of the new entity, which was incorporated as a Delaware public benefit corporation under the name EvolutionaryScale and based in New York. Several reporters described the founding cohort as part of the larger "Meta AI mafia" that left FAIR during the same period to start companies.
The small founding team included Tom Sercu and Sal Candido, both former research engineers and managers on the ESM project at Meta. Other early hires came from FAIR's New York office and from academic labs that had collaborated with the ESM group, giving EvolutionaryScale an unusual concentration of people who had already shipped multiple generations of large protein models together. The company kept its headquarters in New York City and operated with a flat structure organized around a single science team, a smaller engineering team, and a product group that owned the Forge platform.
EvolutionaryScale spent its first nine months in stealth, building infrastructure and training the model that would become ESM3. The company emerged publicly on 25 June 2024 with the simultaneous announcement of ESM3 and a seed financing of more than 142 million United States dollars. The round was led by the technology investors Nat Friedman and Daniel Gross, who often co-invest as the partnership NFDG, together with Lux Capital. Amazon and NVentures, the venture capital arm of NVIDIA, joined as strategic investors with a clear interest in placing EvolutionaryScale's models on their respective cloud and accelerator platforms. A long list of follow-on investors and angels participated, including Khosla Ventures, Scale Venture Partners, Lead Edge Capital, Picus Capital, Tola Capital, Kepler Operators Fund, C2 Investment, Village Global, Brandon Deer, Sahil Lavingia, and Sam Altman.
The transaction was unusual for a seed round in several ways. The total amount was several times the typical seed for a deep technology company, the cap table mixed financial venture funds with two major cloud and hardware vendors, and the company had no revenue and no clinical assets at the time of close. Investors justified the size with the cost of training large biological foundation models on H100-class GPU clusters and with the expectation that EvolutionaryScale would need to build an enterprise platform, sales motion, and lab science capability in parallel.
| Date | Round | Amount | Lead investors | Notable participants |
|---|---|---|---|---|
| August 2023 | Founding capital | Undisclosed | Founders | Initial angel checks while in stealth |
| 25 June 2024 | Seed | More than 142 million USD | Nat Friedman, Daniel Gross, Lux Capital | Amazon, NVentures (NVIDIA), Khosla Ventures, Scale Venture Partners, Lead Edge Capital, Picus Capital, Tola Capital, Sam Altman, Sahil Lavingia |
| November 2025 | Acquisition by Chan Zuckerberg Biohub | Undisclosed | Chan Zuckerberg Initiative | Biohub Network |
The founding team carried over the working relationships and the engineering stack of the ESM group at Meta. Rives served as the public face of the company and the lead author on its scientific publications. Sercu and Candido handled engineering leadership and operations. A handful of senior researchers from the original FAIR team rounded out the science leadership.
| Name | Role at EvolutionaryScale | Prior background |
|---|---|---|
| Alexander Rives | Co-founder, chief executive officer, chief scientist | Founder of the ESM project at Meta FAIR, PhD from New York University |
| Tom Sercu | Co-founder, head of engineering | Research engineering manager at Meta FAIR, earlier at IBM Research on speech recognition |
| Sal Candido | Co-founder, chief technology officer | Engineering leader at Meta FAIR, earlier engineering roles at Google |
| Halil Akin | Founding researcher | Research scientist at Meta FAIR on the ESM team |
| Roshan Rao | Founding researcher | Research scientist at Meta FAIR, PhD from the University of California, Berkeley on protein language models |
The team grew to roughly fifty people by 2025, divided across research, engineering, and a small biology and partnerships group. The company never opened a wet lab of its own, instead relying on contract research organizations and academic collaborators for in vitro and in vivo validation of generated proteins.
The scientific case for EvolutionaryScale rested on the bet that protein language models, which are decoder or encoder transformers trained on sequences of amino acids, would scale in the same way that text language models had scaled. ESM1 and ESM2 had already shown that the perplexity of a masked language model trained on protein sequences correlated with downstream performance on structure prediction, mutational effect prediction, and remote homology detection. ESM3 was designed to push the scaling argument by an order of magnitude and to add structural and functional modalities to the input and output of the model.
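The perplexity mentioned above can be made concrete with a small sketch. It assumes the per-position probabilities have already been obtained by masking each residue in turn and reading off the probability the model gave to the true amino acid; the function name and the numbers are illustrative and do not come from the ESM codebase.

```python
import math
from typing import Sequence


def pseudo_perplexity(true_residue_probs: Sequence[float]) -> float:
    """Exponentiated mean negative log-likelihood of the true residues.

    Each entry is the probability a masked language model assigned to the
    correct amino acid at one masked position (hypothetical inputs).
    """
    if not true_residue_probs:
        raise ValueError("need at least one position")
    mean_nll = -sum(math.log(p) for p in true_residue_probs) / len(true_residue_probs)
    return math.exp(mean_nll)


# A model that guesses uniformly over the 20 standard amino acids.
uniform = pseudo_perplexity([1 / 20] * 10)

# A model that is fairly confident about the true residue (made-up values).
confident = pseudo_perplexity([0.9, 0.8, 0.95, 0.7])
```

A uniform guesser over 20 amino acids scores a perplexity of 20, so lower values indicate that the model has learned something about which residues are plausible at each position.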
The core technical innovation of ESM3 was multimodal masked generative modeling over three tracks. The first track encoded the amino acid sequence of a protein. The second track encoded a tokenized representation of three-dimensional structure, using a vector-quantized variational autoencoder to compress atomic coordinates into a discrete vocabulary. The third track encoded function annotations such as Gene Ontology terms, InterPro family labels, and other categorical signals. Each track could be masked or revealed independently at inference time, which allowed a single network to perform structure prediction, sequence design conditioned on structure, function-conditioned generation, and inpainting of arbitrary subsets of residues. The team described ESM3 as a frontier multimodal generative model that reasoned over sequence, structure, and function.
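The way one network covers several tasks by masking tracks independently can be sketched with a toy prompt builder. This is only an illustration under simplifying assumptions: the token names, the `build_prompt` helper, and the one-token-per-residue layout for every track are hypothetical and are not the interface of the esm library or of Forge.

```python
from typing import Dict, List, Optional, Set

MASK = "<mask>"
TRACKS = ("sequence", "structure", "function")


def build_prompt(
    protein: Dict[str, List[str]],
    reveal: Set[str],
    hide_positions: Optional[Set[int]] = None,
) -> Dict[str, List[str]]:
    """Mask every track not in `reveal`, plus selected positions in revealed tracks."""
    hide_positions = hide_positions or set()
    prompt: Dict[str, List[str]] = {}
    for track in TRACKS:
        tokens = protein[track]
        if track in reveal:
            prompt[track] = [
                MASK if i in hide_positions else tok for i, tok in enumerate(tokens)
            ]
        else:
            prompt[track] = [MASK] * len(tokens)
    return prompt


# Toy four-residue protein with made-up structure and function tokens.
toy = {
    "sequence": list("MKTA"),
    "structure": ["s17", "s03", "s88", "s41"],
    "function": ["f_none", "f_bind", "f_bind", "f_none"],
}

# Structure prediction: sequence is given, the other tracks are to be filled in.
folding_prompt = build_prompt(toy, reveal={"sequence"})
# Sequence design conditioned on structure: only structure tokens are given.
design_prompt = build_prompt(toy, reveal={"structure"})
# Inpainting: everything is given except residues 1 and 2.
inpaint_prompt = build_prompt(toy, reveal=set(TRACKS), hide_positions={1, 2})
```

In each case a generative model would be asked to replace the `<mask>` tokens, so the choice of which tracks and positions to hide is what selects the task.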
The largest ESM3 model contained 98 billion parameters and was trained on roughly 2.78 billion proteins drawn from UniRef, MGnify, and the Joint Genome Institute, supplemented with hundreds of millions of synthetic data points produced by predicted structures and function annotations. The total compute budget exceeded 1.07 × 10^24 floating point operations, which EvolutionaryScale described as the most compute ever applied to training a biological model. Training was performed on the Andromeda cluster, which combined NVIDIA H100 Tensor Core GPUs with NVIDIA Quantum-2 InfiniBand networking. The company reported that ESM3 consumed roughly 25 times more floating point operations and 60 times more training data than ESM2.
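The ratios quoted above imply rough figures for ESM2, which the following arithmetic sketch works out. It assumes the multipliers are plain ratios and that "training data" is counted in proteins, which the text does not state; the results are back-of-envelope values implied by the quoted numbers and are not reported figures.

```python
# Figures quoted in the text for the largest ESM3 model.
esm3_flops = 1.07e24      # total training compute, floating point operations
esm3_proteins = 2.78e9    # proteins in the training set

# Reported multipliers relative to ESM2 (treated here as plain ratios).
flops_ratio = 25
data_ratio = 60

# Implied ESM2 values under the assumptions stated above.
implied_esm2_flops = esm3_flops / flops_ratio
implied_esm2_proteins = esm3_proteins / data_ratio

print(f"implied ESM2 compute:  {implied_esm2_flops:.2e} FLOPs")
print(f"implied ESM2 proteins: {implied_esm2_proteins:.2e}")
```

Under these assumptions the implied ESM2 budget is on the order of 10^22 floating point operations over tens of millions of proteins.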
To show that ESM3 could generate proteins with desired properties rather than only predict them, EvolutionaryScale ran an experiment in which the model was asked to design new green fluorescent proteins that differed substantially in sequence from any natural GFP. The team prompted the model with a chain-of-thought-style protocol, generated 96 candidate sequences, expressed them in E. coli, and screened them for fluorescence in collaboration with academic partners. Several candidates fluoresced, and one of them, named esmGFP, differed by 96 mutations from its nearest natural relative, sharing only 58 percent of its amino acid sequence with the closest known fluorescent protein. The team used a phylogenetic argument to estimate that the level of divergence between esmGFP and natural GFPs was equivalent to about 500 million years of natural evolution, which became the title and central headline of the subsequent paper.
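The 58 percent figure is a sequence identity, and the sketch below shows one common way to compute it from a pairwise alignment. The toy sequences are invented, and the choice of denominator (positions where neither sequence has a gap) is an assumption, since definitions of percent identity vary. The last lines check that 96 mismatches at 58 percent identity are consistent with an aligned length of about 229 residues, assuming every mutation is counted as one mismatched aligned position.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of gap-free aligned positions where the residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no gap-free aligned positions")
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


# Toy example: two of eight aligned residues differ.
toy_identity = percent_identity("MKTAYIAK", "MKSAYLAK")

# Consistency check on the quoted figures (assumption: mutations == mismatches).
mutations = 96
identity_fraction = 0.58
implied_aligned_length = round(mutations / (1 - identity_fraction))
```

The toy pair scores 75 percent identity, and the quoted esmGFP numbers imply an alignment of roughly 229 positions.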
The ESM3 work was first posted as a preprint on bioRxiv on 1 July 2024, the week after the company emerged from stealth. After more than six months of peer review the paper was published in the journal Science on 16 January 2025 under the title "Simulating 500 million years of evolution with a language model." The Science publication gave EvolutionaryScale a level of scientific legitimacy that few startups in the field had achieved at a comparable stage, and the paper was widely cited within months by groups working on protein engineering, antibody design, and biosecurity.
EvolutionaryScale released two main model families during its independent life. The ESM3 family covered controllable generation across sequence, structure, and function. The ESM Cambrian family, announced in December 2024, focused on representation learning rather than generation and was intended as a successor to ESM2 for embedding and downstream prediction tasks.
| Model | Release | Parameters | Modalities | Headline use case | Distribution |
|---|---|---|---|---|---|
| ESM1 | 2019 (Meta FAIR, pre-EvolutionaryScale) | Up to 670 million | Sequence | First transformer protein language model | Open weights |
| ESM2 | 2022 (Meta FAIR, pre-EvolutionaryScale) | Up to 15 billion | Sequence | Pretrained representations and ESMFold | Open weights |
| ESM3 small (open) | June 2024 | 1.4 billion | Sequence, structure, function | Open research and academic use | Open weights under a non-commercial license |
| ESM3 medium | June 2024 | About 7 billion | Sequence, structure, function | Commercial inference via Forge | API on Forge and partner clouds |
| ESM3 large | June 2024 | 98 billion | Sequence, structure, function | Frontier generative design | API on Forge and partner clouds |
| ESM Cambrian 300M | December 2024 | 300 million | Sequence representation | Lightweight embeddings | Open weights |
| ESM Cambrian 600M | December 2024 | 600 million | Sequence representation | Mid-range embedding tasks | Open weights |
| ESM Cambrian 6B | December 2024 | 6 billion | Sequence representation | Best-in-class protein embeddings | Forge for academics, AWS SageMaker for commercial users |
The open versions of the models were released through the company's GitHub organization and published to PyPI as the package named esm, where the version numbers reflected the new release cadence rather than the legacy ESM2 numbering. EvolutionaryScale used a tiered licensing approach in which small models were freely available for research, larger models were free for non-commercial research but required a commercial agreement for use in drug discovery or other revenue-generating settings, and the largest frontier model was offered only through paid cloud channels.
Forge was EvolutionaryScale's product. It bundled three things: a hosted API that exposed the full ESM3 and ESM Cambrian families behind a uniform interface, a set of first-party web applications aimed at biologists who did not want to write code, and a library of Python tooling that was distributed alongside the open models. Forge handled tokenization of structure and function inputs, large-batch inference, fine-tuning of selected model heads, and integration with downstream tools for stability prediction and antibody analysis.
The company also distributed the larger ESM3 models through two external cloud channels. Amazon Web Services hosted the model family on Amazon Bedrock, Amazon SageMaker, and AWS HealthOmics, which EvolutionaryScale claimed reached nine of the top ten global pharmaceutical companies through existing AWS relationships. NVIDIA packaged ESM3 as one of its NVIDIA NIM microservices and made an open version available on the NVIDIA BioNeMo platform for non-commercial research. These two cloud channels were a direct consequence of the strategic investments by Amazon and NVentures during the seed round.
| Partner | Channel | Description |
|---|---|---|
| Amazon Web Services | Amazon Bedrock, Amazon SageMaker, AWS HealthOmics | Hosted ESM3 family for enterprise pharma users, with biological data services |
| NVIDIA | NIM microservices, BioNeMo platform | Runtime-optimized ESM3 for NVIDIA accelerators, open version of ESM3 on BioNeMo for academic users |
| Hugging Face | Model Hub | Open ESM3 and ESM Cambrian weights for community download |
| GitHub | evolutionaryscale organization | Source code and reference implementations of the esm Python library |
| Academic partners | bioRxiv preprints and Science paper | Joint experimental validation of generated proteins and benchmarks for downstream tasks |
The company never disclosed a named multi-year partnership with a specific large pharma company, in contrast with Isomorphic Labs, which signed publicly disclosed deals with Eli Lilly and Novartis. EvolutionaryScale's commercial model leaned on per-usage Forge subscriptions, AWS marketplace consumption, and case-by-case licensing for drug discovery applications, with the option of revenue sharing on resulting therapeutics.
EvolutionaryScale operated in a small group of frontier labs that combined unusual levels of compute with a focus on biology. The most direct comparison was Google DeepMind and Isomorphic Labs, whose AlphaFold 3 model targeted the structural and interaction prediction problem and whose pipeline pointed toward in-house drug development. EvolutionaryScale targeted protein design and embeddings, with a business model anchored on selling access to the models rather than building its own pipeline of clinical candidates. Several earlier startups, including Cradle in Amsterdam, Cyrus Biotechnology in Seattle, Generate Biomedicines in Cambridge, Massachusetts, and Inceptive in Palo Alto, were closer to EvolutionaryScale in spirit but worked with smaller models and more targeted use cases.
| Lab | Year founded | Main models or platform | Business model | Notable backers |
|---|---|---|---|---|
| EvolutionaryScale | 2023 | ESM3, ESM Cambrian, Forge | API access, cloud microservices, licensing | Nat Friedman, Daniel Gross, Lux Capital, Amazon, NVentures |
| Isomorphic Labs | 2021 | AlphaFold 3, IsoDDE drug design engine | Internal pipeline plus pharma partnerships | Alphabet, Thrive Capital, GV |
| Generate Biomedicines | 2018 | Chroma generative protein platform | Internal pipeline plus partnerships | Flagship Pioneering, Amgen, NVIDIA |
| Cradle | 2021 | Diffusion-based protein design platform | Subscription platform for biotech labs | Index Ventures, Kindred Capital |
| Cyrus Biotechnology | 2014 | Rosetta-based protein design suite | Software subscriptions and services | Trinity Ventures, Springrock |
| Inceptive | 2021 | Biological foundation models for RNA medicines | Internal pipeline and biotech partnerships | Andreessen Horowitz, NVentures |
| Profluent Bio | 2022 | OpenCRISPR gene editing protein design | API access and licensing | Spark Capital, Insight Partners |
Within this group EvolutionaryScale stood out for the scale of its single largest model, the breadth of its modalities, and its decision to release open weights for its smaller models and publish a peer-reviewed paper. Isomorphic Labs had a comparable scientific profile but did not release its frontier models openly, while Generate Biomedicines and Cradle operated more like traditional drug discovery firms with proprietary pipelines.
EvolutionaryScale was incorporated as a Delaware public benefit corporation, which obligated its directors to balance financial return with the public benefit purpose stated in its charter. The company described its mission as developing artificial intelligence to advance the understanding of biology for the benefit of human health and society. Rives spoke publicly about biosecurity considerations several times in 2024 and 2025 and noted that the company had implemented an internal review process for sensitive prompts, had restricted access to certain capabilities of the largest model behind the Forge API, and had consulted with United States government agencies on responsible release procedures.
The public benefit framing was unusual in the protein design space, where most peers were structured as conventional Delaware C corporations. Anthropic, the AI safety lab, had used a similar public benefit structure, and several investors highlighted the parallel when describing EvolutionaryScale's governance.
On 6 November 2025 the Chan Zuckerberg Initiative announced that it would direct the bulk of its philanthropic resources to the Chan Zuckerberg Biohub Network and that the EvolutionaryScale team would join Biohub through a strategic transaction. Financial terms were not disclosed. Rives took the new role of head of science at the Biohub Network, where he was given responsibility for scientific strategy across experimental biology, data, and artificial intelligence. The roughly fifty employees of EvolutionaryScale joined Biohub alongside him, and the company's models and infrastructure were absorbed into Biohub's research programs.
The transaction was widely interpreted as an acqui-hire that ended EvolutionaryScale's life as a standalone commercial entity. Endpoints News and other industry outlets described it as the apparent end of the startup, although press releases from both sides used the language of a continuing scientific mission rather than a wind down. The deal coincided with a broader announcement by Mark Zuckerberg and Priscilla Chan that Biohub would become the primary vehicle for their philanthropy, with a focus on using artificial intelligence and biology to cure or prevent all disease by the end of the century. Several commentators noted the irony that the team Meta had let go in 2023 had now been reunited with capital from the founder of Meta, this time through Chan and Zuckerberg's philanthropic vehicle rather than Meta itself.
The Forge API continued to operate through 2025 and into 2026 as the team integrated with Biohub. Open weights for the ESM3 small model and the ESM Cambrian 300M, 600M, and 6B variants remained available through GitHub and Hugging Face. Commercial pharmaceutical customers who had been using the larger ESM3 models through AWS or NVIDIA channels were directed to continue under existing agreements while Biohub considered the long-term structure of the platform. The combined organization announced an intention to develop the next generation of biological foundation models as part of Biohub's nonprofit research mission, with a stated emphasis on open science and free academic access.
ESM3 received broad praise from machine learning and computational biology researchers when it appeared. Reviews highlighted the scaling result, the multimodal masked generation framework, and the wet lab validation of esmGFP as significant advances. The Science publication in January 2025 was the first peer-reviewed publication of a frontier generative biological foundation model at this scale, and it became a frequent reference in subsequent papers on protein design with diffusion models, flow matching, and other generative techniques.
The company also drew attention for its commercial model. Several analysts compared Forge to the application programming interfaces sold by general-purpose language model companies and argued that EvolutionaryScale was building the model layer of a future biological cloud, in which biotech and pharma developers would consume biological intelligence the same way software developers consumed text generation. The acquisition by Biohub complicated this picture, since a nonprofit owner might choose to favor open scientific publication and free academic access over enterprise contracts, but the underlying argument that protein foundation models would be a major category of artificial intelligence persisted.