# Arc Institute

> Source: https://aiwiki.ai/wiki/arc_institute
> Updated: 2026-06-08
> Categories: AI Companies, Open Source AI
> From AI Wiki (https://aiwiki.ai), a free encyclopedia of artificial intelligence. Quote with attribution.

The **Arc Institute** is an independent, nonprofit biomedical research organization headquartered in Palo Alto, California. Founded in 2021, it pursues long-horizon basic science aimed at understanding and curing complex human diseases, operating in partnership with [Stanford University](/wiki/stanford_university), the [University of California, Berkeley](/wiki/uc_berkeley), and the University of California, San Francisco (UCSF) [1][2]. The institute is notable within [artificial intelligence](/wiki/artificial_intelligence) for its work at the intersection of machine learning and biology, particularly the [Evo](/wiki/evo) and [Evo 2](/wiki/evo_2) DNA foundation models and the State virtual cell model. It was co-founded by bioengineer [Patrick Hsu](/wiki/patrick_hsu), biochemist Silvana Konermann, and [Stripe](/wiki/stripe) chief executive [Patrick Collison](/wiki/patrick_collison), and launched with roughly $650 million in committed philanthropic funding [3][4].

## Overview

Arc describes its mission as accelerating discovery, uncovering the root causes of complex diseases such as cancer, neurodegeneration, and immune dysfunction, and closing the gap between scientific breakthroughs and real-world impact [1]. Rather than operating as a conventional university department or a traditional government-funded laboratory, Arc is structured as a standalone nonprofit research institute that collaborates closely with its three Bay Area university partners, allowing affiliated faculty to maintain academic appointments while conducting research at Arc [2][5].

The institute organizes its work around core laboratories led by principal investigators, technology centers that provide shared experimental and computational infrastructure, and institute-wide initiatives. By 2025 it employed more than 300 staff, internally referred to as "Arconauts" [1]. Its research spans genome engineering, neuroscience, immunology, and a growing body of computational and AI-driven work in genomics and cell biology.

## Founding and the institute model

Arc Institute was launched publicly on December 15, 2021, in collaboration with Stanford, UC Berkeley, and UCSF [4][5]. Its creation grew in part out of the founders' experience with Fast Grants, a rapid science-funding program operated during the COVID-19 pandemic, which highlighted inefficiencies in the conventional grant system [3].

The institute's defining feature is its "core funding" model. Instead of requiring scientists to write and renew federal grants on short cycles, Arc provides multi-year institutional support intended to free researchers to pursue ambitious, high-risk, high-reward projects without continual fundraising pressure [2][3]. Funding is provided through several tiers, summarized below.

| Program | Support | Duration |
| --- | --- | --- |
| Core Investigators | Full institutional funding for a lab (up to roughly 20 people) | About 8 years, renewable |
| Innovation Investigators | Approximately $1 million | 5 years |
| Ignite Awards | Approximately $100,000 | 1 year |

Core Investigators run their own laboratories at Arc, while the Innovation Investigator and Ignite programs extend support to faculty at the partner universities [1][2]. This model is part of a broader movement of philanthropically funded "focused research organizations" and independent institutes that aim to complement, rather than replace, traditional academic and government science.

## Leadership

The institute was co-founded by three people who continue to shape its scientific and organizational direction:

- **Patrick Hsu**, a bioengineer known for work on CRISPR and RNA-guided genome engineering, is a Core Investigator and holds a faculty appointment at UC Berkeley [1][6]. His laboratory at Arc has produced both experimental genome-editing advances and the institute's flagship AI-for-biology models.
- **Silvana Konermann**, a biochemist and Stanford faculty member, is the institute's executive director [1][7]. She is recognized for earlier work on CRISPR-based gene regulation and leads Arc's day-to-day scientific operations.
- **Patrick Collison**, co-founder and CEO of the payments company Stripe, chairs Arc's board [3][4]. He has been a prominent advocate for reforming how scientific research is funded and organized.

Arc's board and executive team have expanded as the institute has grown. Reported board and leadership additions include technology figures such as Nat Friedman and Reid Hoffman, and the institute appointed a chief technology officer in 2024 and named Megan van Overbeek as chief scientific officer in 2026 [3].

## AI for biology

Arc has become one of the more visible nonprofit players in [AI for science](/wiki/ai_for_science), with a particular focus on biological foundation models that learn directly from genomic and cellular data.

### Evo

In 2024, researchers at Arc and Stanford, including Brian Hie and Patrick Hsu, released **Evo**, a [genomic foundation model](/wiki/genomic_foundation_model) trained at single-nucleotide resolution [8][9]. Evo is a 7-billion-parameter model with a context length of about 131,000 tokens. It is built on the StripedHyena architecture, a signal-processing-based design intended to be more efficient than the standard [Transformer](/wiki/transformer) at long sequence lengths [8][9].
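To make "single-nucleotide resolution" concrete, the following minimal sketch maps each base to one token and splits a long sequence into context-sized windows. The vocabulary, the function names, and the exact window size of 131,072 are assumptions chosen for illustration; they are not taken from Evo's actual tokenizer or code.

```python
# Illustrative only: a hypothetical vocabulary, not Evo's actual tokenizer.
VOCAB = {"A": 0, "C": 1, "G": 2, "T": 3, "N": 4}

# Assumed exact value standing in for "about 131,000 tokens".
CONTEXT_LENGTH = 131_072


def tokenize(sequence: str) -> list[int]:
    """Map each nucleotide to one token ID; unknown symbols become N."""
    return [VOCAB.get(base, VOCAB["N"]) for base in sequence.upper()]


def split_into_windows(tokens: list[int], window: int = CONTEXT_LENGTH) -> list[list[int]]:
    """Cut a token sequence into consecutive windows that each fit the context."""
    return [tokens[i:i + window] for i in range(0, len(tokens), window)]


tokens = tokenize("ATGCGTNACGT")
print(len(tokens))  # one token per base, so 11 tokens for 11 bases

# A stand-in for a 300,000-base sequence, split into context-sized pieces.
windows = split_into_windows([0] * 300_000)
print([len(w) for w in windows])  # [131072, 131072, 37856]
```

With one token per base, a context of this size spans roughly 131 kilobases of DNA, and anything longer has to be processed in pieces.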

Evo was trained on roughly 2.7 million prokaryotic and bacteriophage genomes, released as an open dataset called OpenGenome containing about 300 billion tokens [8]. The model demonstrated zero-shot prediction of molecular function across DNA, RNA, and protein modalities, with performance competitive with or exceeding that of specialized models. It also generated experimentally validated CRISPR-Cas complexes and transposable systems, an early demonstration of protein-RNA and protein-DNA co-design with a language model [9]. The work was published in the journal *Science* in November 2024 and was recognized in *The New York Times* "Good Tech Awards" for that year [3][9].

### Evo 2

In February 2025, Arc and [NVIDIA](/wiki/nvidia), together with collaborators from Stanford, UC Berkeley, UCSF, the University of Washington, and others, released **Evo 2**, a substantially larger DNA foundation model [10][11]. Evo 2 was released openly on February 19, 2025, with code on GitHub and model weights made publicly available, and was integrated into NVIDIA's BioNeMo platform [10][11].

Evo 2 was trained on approximately 9.3 trillion nucleotides drawn from more than 100,000 species and over 128,000 whole genomes spanning all domains of life, including bacteria, archaea, phages, plants, animals, and humans, roughly 30 times the training data of the original Evo [10][12]. The largest version contains 40 billion parameters (a smaller 7-billion-parameter version was also released), uses the StripedHyena 2 architecture, and can process context windows of up to about 1 million nucleotides [10][12]. Training was carried out on more than 2,000 NVIDIA H100 GPUs via NVIDIA DGX Cloud [11].
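The scale-up figures quoted above can be checked with a few lines of arithmetic. The sketch below uses only the approximate numbers reported in this article. It assumes one token per nucleotide, so that Evo's token count and Evo 2's nucleotide count are directly comparable.

```python
# Approximate figures as reported in this article.
EVO_TRAINING_TOKENS = 300e9     # OpenGenome: about 300 billion tokens
EVO2_TRAINING_TOKENS = 9.3e12   # about 9.3 trillion nucleotides
EVO_CONTEXT = 131_000           # about 131,000 tokens
EVO2_CONTEXT = 1_000_000        # up to about 1 million nucleotides
EVO_PARAMS = 7e9                # 7 billion parameters
EVO2_PARAMS = 40e9              # largest Evo 2 version: 40 billion parameters

data_ratio = EVO2_TRAINING_TOKENS / EVO_TRAINING_TOKENS
context_ratio = EVO2_CONTEXT / EVO_CONTEXT
param_ratio = EVO2_PARAMS / EVO_PARAMS

print(f"training data: {data_ratio:.0f}x")    # 31x, i.e. "roughly 30 times"
print(f"context:       {context_ratio:.1f}x")  # 7.6x
print(f"parameters:    {param_ratio:.1f}x")    # 5.7x
```

On these numbers, the training data grew considerably faster than either the context window or the parameter count.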

Evo 2 can both predict and design genetic sequences. Reported capabilities include generating genome-scale sequences and identifying disease-associated mutations: the model achieved over 90% accuracy in classifying BRCA1 variants as benign or potentially pathogenic [10][11]. In a widely noted demonstration, researchers used Evo 2 to design bacteriophage genomes; of 285 tested designs for a small, 11-gene organism, 16 (about 5.6%) successfully propagated and inhibited the growth of target bacterial strains, a result described as among the first functional AI-designed genomes of their kind [12]. The accompanying research paper was published in *Nature* on March 4, 2026 [12]. By that point, Arc reported that Evo 2 had been downloaded more than 88,000 times on GitHub and that its 7-billion- and 40-billion-parameter variants had together received millions of API requests through Hugging Face [12].
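This article does not describe how Evo 2 scores BRCA1 variants. One common way to use a sequence language model for mutation effects is to compare the likelihood it assigns to a reference sequence and to the mutated sequence. The sketch below illustrates that general idea only, with a toy first-order Markov model standing in for a real DNA language model. The class, function names, and sequences are all hypothetical.

```python
import math


class ToyMarkovModel:
    """First-order Markov model over nucleotides (a stand-in, not Evo 2)."""

    def __init__(self, training_sequences, pseudocount=1.0):
        self.alphabet = "ACGT"
        self.pseudocount = pseudocount
        self.counts = {}  # counts[prev][nxt] = number of observed transitions
        for seq in training_sequences:
            for prev, nxt in zip(seq, seq[1:]):
                row = self.counts.setdefault(prev, {})
                row[nxt] = row.get(nxt, 0.0) + 1.0

    def log_likelihood(self, seq):
        """Sum of smoothed log transition probabilities along the sequence."""
        total = 0.0
        for prev, nxt in zip(seq, seq[1:]):
            row = self.counts.get(prev, {})
            denom = sum(row.values()) + self.pseudocount * len(self.alphabet)
            total += math.log((row.get(nxt, 0.0) + self.pseudocount) / denom)
        return total


def variant_score(model, reference, position, alt_base):
    """Log-likelihood of the variant minus that of the reference."""
    variant = reference[:position] + alt_base + reference[position + 1:]
    return model.log_likelihood(variant) - model.log_likelihood(reference)


model = ToyMarkovModel(["ACGTACGTACGT"])
score = variant_score(model, "ACGTACGT", position=2, alt_base="A")
print(round(score, 3))  # -2.773: the toy model finds the variant less likely
```

A strongly negative score means the model considers the mutated sequence less plausible than the reference. Turning such scores into benign or pathogenic labels requires a threshold or a classifier, which this sketch does not attempt.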

### State and the Virtual Cell Initiative

Beyond genomic sequence modeling, Arc has invested in "virtual cell" models that aim to predict how cells respond to drugs, signaling molecules, and genetic perturbations. In 2025 the institute released **State**, described as its first virtual cell model and one of the largest single-cell perturbation models to date [13]. State predicts shifts in gene expression given a starting transcriptome and a specified perturbation, and was trained on observational data from roughly 170 million cells together with perturbational data from more than 100 million cells across dozens of cell lines, drawing on the Arc Virtual Cell Atlas [13]. To spur progress in the field, Arc also organized a Virtual Cell Challenge in 2025, which drew thousands of registered participants from more than 100 countries [13].
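The input/output contract described above (a starting transcriptome plus a perturbation in, a predicted expression profile out) can be illustrated with a deliberately simple mean-shift baseline. This is not State's architecture, which is not described here. The function names, the perturbation label, and the three-gene expression values are all made up for illustration.

```python
from statistics import mean


def fit_mean_shifts(control_cells, perturbed_cells_by_label):
    """Learn one average per-gene expression shift for each perturbation."""
    n_genes = len(control_cells[0])
    control_mean = [mean(cell[g] for cell in control_cells) for g in range(n_genes)]
    shifts = {}
    for label, cells in perturbed_cells_by_label.items():
        perturbed_mean = [mean(cell[g] for cell in cells) for g in range(n_genes)]
        shifts[label] = [p - c for p, c in zip(perturbed_mean, control_mean)]
    return shifts


def predict_expression(starting_transcriptome, perturbation, shifts):
    """Add the learned shift to the starting profile, clipping at zero."""
    return [max(0.0, x + d) for x, d in zip(starting_transcriptome, shifts[perturbation])]


# Hypothetical data: each cell is a list of expression values for three genes.
control = [[10.0, 5.0, 0.0], [12.0, 5.0, 2.0]]
perturbed = {"knockout_gene_1": [[1.0, 5.0, 1.0], [3.0, 7.0, 1.0]]}

shifts = fit_mean_shifts(control, perturbed)
prediction = predict_expression([8.0, 4.0, 3.0], "knockout_gene_1", shifts)
print(prediction)  # [0.0, 5.0, 3.0]
```

A baseline like this applies the same shift to every cell regardless of its starting state, and it cannot handle a perturbation absent from the training data.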

## Funding

Arc was launched with approximately $650 million in committed philanthropic funding, an unusually large endowment-style commitment for a new independent research institute [3][4]. Reported founding donors include Ethereum creator [Vitalik Buterin](/wiki/vitalik_buterin), Stripe co-founders Patrick and John Collison, Dustin Moskovitz (co-founder of Asana and Facebook) and Cari Tuna, angel investor Ron Conway, and other technology entrepreneurs and investors [3]. The institute operates on an annual budget reported at roughly $80 million as of the mid-2020s [3].

Arc's funding base is philanthropic rather than commercial, and it does not raise venture capital, though it has formed research and technology partnerships with companies. Notable collaborations include its work with NVIDIA on Evo 2, announced in early 2025, and a 2025 partnership with 10x Genomics and Ultima Genomics to build large-scale single-cell datasets for virtual cell models [3][13].

## Significance

The Arc Institute is frequently cited as a leading example of a new kind of scientific organization: a well-funded, independent nonprofit that combines long-term institutional support for basic biology with substantial investment in machine learning. Its core funding model is part of a broader experiment in research funding intended to give scientists greater freedom to pursue ambitious, long-horizon projects [2][3].

In AI specifically, Arc has helped establish biological foundation models as a major research direction. Evo and Evo 2 are among the most prominent open DNA foundation models, and Arc's collaboration with NVIDIA placed it at the center of efforts to scale genomic AI [10][11]. Together with its virtual cell work, this positions the institute as one of the most influential nonprofit organizations operating at the intersection of artificial intelligence and biology, alongside efforts such as the [Chan Zuckerberg Initiative](/wiki/chan_zuckerberg_initiative) and academic foundation-model groups. The institute is distinct from unrelated entities that share the "Arc" name, such as the [ARC-AGI](/wiki/arc_agi) benchmark and the Arc web browser.

## References

1. Arc Institute. "About." https://arcinstitute.org/about
2. Stanford Report. "Stanford collaboration with Arc Institute aims to expand academic, research and funding opportunities." December 2021. https://news.stanford.edu/stories/2021/12/stanford-collaboration-arc-institute-aims-expand-academic-research-funding-opportunities
3. Wikipedia. "Arc Institute." https://en.wikipedia.org/wiki/Arc_Institute
4. BusinessWire. "The Arc Institute Launches to Accelerate Scientific Breakthroughs in Complex Diseases in Collaboration with Stanford University, UCSF, and UC Berkeley." December 15, 2021. https://www.businesswire.com/news/home/20211215005308/en/
5. UC Berkeley News. "UC Berkeley partners with new Arc Institute to tackle complex diseases." December 15, 2021. https://news.berkeley.edu/2021/12/15/uc-berkeley-partners-with-new-arc-institute-to-tackle-complex-diseases/
6. Wikipedia. "Patrick Hsu." https://en.wikipedia.org/wiki/Patrick_Hsu
7. Wikipedia. "Silvana Konermann." https://en.wikipedia.org/wiki/Silvana_Konermann
8. Arc Institute. "Evo: DNA foundation modeling from molecular to genome scale." 2024. https://arcinstitute.org/news/evo
9. Nguyen, E. et al. "Sequence modeling and design from molecular to genome scale with Evo." *Science*, November 2024. https://www.science.org/doi/10.1126/science.ado9336
10. Arc Institute. "AI can now model and design the genetic code for all domains of life with Evo 2." February 19, 2025. https://arcinstitute.org/news/evo2
11. NVIDIA. "Massive Foundation Model for Biomolecular Sciences Now Available via NVIDIA BioNeMo." February 2025. https://blogs.nvidia.com/blog/evo-2-biomolecular-ai/
12. Arc Institute. "Evo 2: One Year Later." 2026. https://arcinstitute.org/news/evo-2-one-year-later
13. Arc Institute. "Arc Institute's first virtual cell model: State." 2025. https://arcinstitute.org/news/virtual-cell-model-state

