Epoch AI
Last reviewed
Jun 8, 2026
Sources
26 citations
Review status
Source-backed
Revision
v1 · 2,362 words
Epoch AI is a nonprofit research organization that studies the trajectory of artificial intelligence through rigorous quantitative analysis of compute, data, algorithms, and economics. Founded in 2022 and originally called simply "Epoch," the organization became widely cited as a neutral, data-first authority on how fast AI capabilities are advancing and what is driving that progress. Its database of notable machine learning models is the canonical public source for figures such as the "training compute doubling time," and its charts of long-run compute trends are reproduced throughout the field. Epoch AI also created the FrontierMath benchmark of research-level mathematics problems, runs an AI Benchmarking Hub, and publishes the Epoch Capabilities Index (ECI), a composite measure of frontier model ability. [1][2][3]
This article concerns the research organization Epoch AI and should not be confused with the epoch concept in machine learning, which refers to a single pass through a training dataset.
Epoch AI describes itself as a "data-first research nonprofit" whose goal is to improve the shared understanding of the drivers, progress, and impacts of artificial intelligence from a neutral, evidence-based perspective. It structures its work around three questions: what drives AI progress (training compute, data, hardware, and scaling feasibility), how capability progress can be measured (benchmarks and indices), and what the downstream impacts of AI are (economics, adoption, and automation). [1][3]
Much of Epoch AI's influence comes from making its underlying datasets and methods public. It maintains open data explorers covering notable AI models, machine learning hardware, GPU clusters and data centers, and leading AI companies, alongside reports, a newsletter, and benchmark results. Its data and analysis have been used by Stanford's AI Index, and the organization has had past and ongoing partnerships with bodies including the UK government's science and AI departments. [3][9]
Epoch grew out of a volunteer effort begun around 2021, when Jaime Sevilla put out a call for collaborators to study historical trends in machine learning. The group's early analysis of training compute drew a strongly positive response, and Sevilla sought philanthropic funding to formalize the project. Epoch was founded in 2022, with Rethink Priorities acting as its initial fiscal sponsor. The founding team included Sevilla, Tamay Besiroglu, Lennart Heim, Pablo Villalobos, Eduardo Infante-Roldan, Marius Hobbhahn, and Anson Ho. The work was carried out in close collaboration with organizations such as Rethink Priorities and the grantmaker Open Philanthropy, which became a major funder. [4][5][6]
Over the following years Epoch became a standalone organization, rebranding as Epoch AI and registering as an independent 501(c)(3) nonprofit by early 2025. According to its 2025 impact report, the organization had 21 full-time staff, raised about $10.3 million in 2025 (a roughly 40 percent increase over the prior year), and spent about $5 million. Its board of directors included Tom Davidson, Ajeya Cotra, Jaime Sevilla, and Maria de la Lama. [7]
Epoch AI's best-known work is its empirical study of AI compute and scaling laws. The organization built a database of notable machine learning models, where a model is counted as "notable" if it set a state-of-the-art result on a recognized benchmark, was highly cited (on the order of 1,000 or more citations), had clear historical relevance, or saw significant real-world use. By 2024 the database covered hundreds of models with associated training compute estimates, and it has continued to expand since. [2][10]
The dataset was originally assembled for the 2022 report "Compute Trends Across Three Eras of Machine Learning," which identified distinct historical regimes in how the compute used to train AI systems has grown. Epoch's analysis found that since around 2010 the training compute of machine learning models has grown by roughly a factor of 4 to 5 per year, and the organization's later estimates for frontier models put the rate in the same range. These figures, and the accompanying charts, became the standard reference points for discussions of AI scaling. [2][8][11]
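An annual growth factor and a "training compute doubling time" are two ways of stating the same exponential trend. The sketch below converts one into the other, assuming smooth exponential growth; the function names are illustrative and are not taken from Epoch AI's code or data tools.

```python
import math


def doubling_time_months(growth_per_year: float) -> float:
    """Months needed for training compute to double, given a yearly growth factor.

    Assumes smooth exponential growth: compute(t) = compute(0) * growth_per_year ** t,
    with t measured in years.
    """
    if growth_per_year <= 1:
        raise ValueError("growth factor must be greater than 1")
    return 12 * math.log(2) / math.log(growth_per_year)


def cumulative_growth(years: float, growth_per_year: float) -> float:
    """Total multiplicative increase in compute after the given number of years."""
    return growth_per_year ** years


for factor in (4.0, 5.0):
    months = doubling_time_months(factor)
    print(f"{factor:.0f}x per year -> doubling every {months:.1f} months")
# 4x per year -> doubling every 6.0 months
# 5x per year -> doubling every 5.2 months
```

Under the same assumption, a sustained 4x annual rate compounds to roughly a million-fold increase over ten years, which is `cumulative_growth(10, 4.0)`.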
Beyond raw compute, Epoch AI has produced a series of influential reports on the inputs to AI progress, summarized below.
| Topic | Representative work | Key finding |
|---|---|---|
| Training compute trends | "Compute Trends Across Three Eras of Machine Learning" (2022) | Training compute has grown about 4 to 5x per year since roughly 2010. [2][11] |
| Data limits | "Will We Run Out of Data?" (first released 2022, updated 2024) | The stock of public, human-generated text could be fully used for training at some point between roughly 2026 and 2032. [12][13] |
| Algorithmic progress | Studies of algorithmic efficiency in language models and image recognition | A large share of measured progress comes from improvements in algorithms and data efficiency, not compute alone. [14] |
| Hardware and clusters | Trends in machine learning hardware and AI supercomputers | Tracks GPU performance, cluster sizes, and the growth of large training systems. [9] |
| Economics of scaling | GATE integrated assessment model (2025) | Models how compute investment and automation could drive rapid, even explosive, economic growth. [15] |
The 2024 update to "Will We Run Out of Data?" attracted broad attention for its projection that frontier models could exhaust the supply of high-quality, human-generated public text data at some point between roughly 2026 and 2032, a finding widely covered in the press and tied to the rising interest in synthetic data. The report's authors stressed that the projection carried high uncertainty and that data efficiency gains and synthetic data could push the limit back. [12][13]
In 2025 Epoch AI released GATE (Growth and AI Transition Endogenous model), an integrated assessment model developed by Besiroglu, Heim, and Sevilla that links a compute-based model of AI development, an automation framework, and a semi-endogenous economic growth model. GATE is used to explore scenarios in which heavy reinvestment of output into AI hardware and research could produce very rapid growth, while quantifying the uncertainty around how much of the economy can be automated. [15]
FrontierMath is a mathematics benchmark that Epoch AI announced on November 8, 2024. It consists of several hundred original, unpublished problems intended to be far harder than earlier math benchmarks such as GSM8K and MATH, on which leading models had already reached near-perfect scores. The problems were created in collaboration with more than 60 mathematicians from universities across more than a dozen countries, and were designed to have answers that are automatically verifiable while still requiring deep, expert reasoning to produce. [16][17]
Problems span a wide range of difficulty. In the benchmark's main set, tiers run from advanced undergraduate material up through research-level mathematics, with the hardest tier (Tier 4) consisting of research-level problems; Epoch has separately maintained a collection of genuinely open research problems. At launch, frontier models performed extremely poorly: the FrontierMath paper reported that state-of-the-art systems, including OpenAI's o1-preview, GPT-4o, Anthropic's Claude 3.5 Sonnet, Google's Gemini 1.5 Pro, and xAI's Grok 2, each solved under 2 percent of the problems. The benchmark drew commentary from leading mathematicians, including Fields Medalists Terence Tao, Timothy Gowers, and Richard Borcherds, several of whom remarked on the difficulty of the questions. [16][17][18]
FrontierMath became the subject of controversy over how its funding was disclosed. On December 20, 2024, around the time OpenAI announced its o3 model and cited a strong FrontierMath score, Epoch AI revealed that OpenAI had funded the creation of the benchmark. Critics objected that this relationship had not been disclosed earlier, especially given that Epoch was otherwise known as an independent organization funded largely by Open Philanthropy. [19][20]
The dispute intensified after a contractor who had worked on the benchmark, posting on the forum LessWrong under the name "Meemi," wrote that many contributors had not been told of OpenAI's involvement, stating that "the communication about this has been non-transparent" and that "Epoch AI should have disclosed OpenAI funding, and contractors should have transparent information about the potential of their work being used for capabilities." A Stanford mathematics PhD student, Carina Hong, separately reported that several contributing mathematicians said they had been unaware that OpenAI would have access to the problems and would not have contributed had they known. [19][20]
Epoch AI's associate director Tamay Besiroglu acknowledged that the organization had made a mistake on transparency, saying it had been restricted from disclosing the partnership until around the o3 launch and "in hindsight we should have negotiated harder" for the ability to inform contributors sooner. He stated that OpenAI had access to the FrontierMath problems but had a verbal agreement not to train on them, and that Epoch maintained a separate, unseen holdout set to allow independent verification of results. Epoch's lead mathematician, Elliot Glazer, said that in his personal view OpenAI's reported score was legitimate, meaning the company had not trained on the dataset, but that Epoch could not fully vouch for the figure until its own independent evaluation was complete. Epoch AI's 2025 impact report later described a research-level Tier 4 set of 50 problems as having been commissioned by OpenAI. [7][19][20]
Early in 2025 Epoch AI relaunched its AI Benchmarking Hub, which collects evaluation results reported by model developers and third parties alongside benchmarks that Epoch runs itself. The Hub became one of the organization's most visited pages, and it underpins Epoch's flagship capability metric. [3][7][21]
That metric is the Epoch Capabilities Index (ECI), a composite score that combines results from dozens of distinct benchmarks into a single "general capability" scale, allowing models to be compared even across periods long enough for any one benchmark to saturate. Epoch likens the ECI to an IQ-style measure: rather than tracking performance on a single skill, it aims to capture a broad underlying capability. The index is built on item response theory, the statistical framework used in standardized testing, and functions as a relative measure similar to an Elo rating. Because models are tested on overlapping sets of benchmarks, the pattern of results allows the method to estimate jointly how capable each model is and how difficult each benchmark is. By late 2025 the ECI drew on more than a thousand evaluations covering roughly 147 models and 39 underlying benchmarks. Epoch describes the ECI as an independent product over which it has full rights, while noting that the work built on methodology from a Google DeepMind paper, "A Rosetta Stone for AI Benchmarks." [21][22][23]
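The joint estimation idea can be illustrated with a toy item response model. The sketch below is not Epoch AI's implementation. It assumes the simplest one-parameter logistic form, in which a model's expected score on a benchmark is `sigmoid(ability - difficulty)`, and fits both sets of parameters by gradient descent. The model names, benchmark names, and scores are invented for illustration only.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def fit_irt(scores, steps=5000, lr=0.1):
    """Fit score ~ sigmoid(ability[model] - difficulty[benchmark]).

    scores maps (model, benchmark) -> fraction solved, strictly between 0 and 1.
    Pairs that were never evaluated are simply absent from the dict.
    """
    models = sorted({m for m, _ in scores})
    benchmarks = sorted({b for _, b in scores})
    ability = {m: 0.0 for m in models}
    difficulty = {b: 0.0 for b in benchmarks}

    for _ in range(steps):
        grad_a = {m: 0.0 for m in models}
        grad_d = {b: 0.0 for b in benchmarks}
        for (m, b), observed in scores.items():
            # Gradient of the cross-entropy loss with respect to the logit.
            err = sigmoid(ability[m] - difficulty[b]) - observed
            grad_a[m] += err
            grad_d[b] -= err
        for m in models:
            ability[m] -= lr * grad_a[m]
        for b in benchmarks:
            difficulty[b] -= lr * grad_d[b]
        # Only differences matter, so pin the mean difficulty to zero.
        shift = sum(difficulty.values()) / len(difficulty)
        for b in benchmarks:
            difficulty[b] -= shift
        for m in models:
            ability[m] -= shift
    return ability, difficulty


# Invented scores: no model is tested on every benchmark, but coverage overlaps.
toy_scores = {
    ("model_a", "bench_easy"): 0.60, ("model_a", "bench_medium"): 0.20,
    ("model_b", "bench_easy"): 0.90, ("model_b", "bench_medium"): 0.55,
    ("model_b", "bench_hard"): 0.10,
    ("model_c", "bench_medium"): 0.90, ("model_c", "bench_hard"): 0.45,
}

ability, difficulty = fit_irt(toy_scores)
for name in sorted(ability, key=ability.get):
    print(f"{name}: ability {ability[name]:+.2f}")
for name in sorted(difficulty, key=difficulty.get):
    print(f"{name}: difficulty {difficulty[name]:+.2f}")

# Predicted score for a pair that was never evaluated directly.
predicted = sigmoid(ability["model_c"] - difficulty["bench_easy"])
print(f"model_c on bench_easy (unobserved): {predicted:.2f}")
```

In this toy setup `model_a` and `model_c` share only one benchmark, yet both land on the same scale because `model_b` overlaps with each of them. This is the sense in which such an index is a relative measure: the fitted numbers carry meaning only as differences between models or between benchmarks.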
Jaime Sevilla, a Spanish researcher with a background in mathematics and computer science, founded Epoch and serves as its director. He has become a prominent voice on AI forecasting and the trajectory of transformative AI, and was profiled by TIME in 2024 in connection with Epoch's trend analysis. Tamay Besiroglu was a co-founder and the organization's associate director, leading much of its work on compute and economics. [4][24]
In April 2025, Besiroglu left Epoch AI to co-found Mechanize, a startup whose stated aim is "the full automation of the economy," beginning with white-collar work. Two other Epoch-affiliated researchers, Ege Erdil and Matthew Barnett, joined him, and Mechanize reported raising a seed round of about $7.3 million. The move generated controversy and public criticism because Mechanize's goal of automating human labor was seen as standing in tension with the safety-oriented and cautious framing associated with Epoch and parts of the AI-risk community. Lennart Heim, another co-founder, later led the compute team at the RAND Corporation's center on AI and emerging technology. [25][26]
Epoch AI is generally regarded across the field as an authoritative and relatively neutral source on quantitative AI trends, and its datasets are widely reused by researchers, journalists, and policymakers. Its analyses are frequently cited in coverage of AI scaling, compute, and the data supply. At the same time, the FrontierMath funding episode prompted broader debate about the independence of benchmark organizations and the importance of disclosing industry funding and data access. Epoch's roots in the effective altruism and AI-safety funding ecosystem, and the later departure of several staff to build automation technology, have also featured in commentary about the organization's positioning. [1][19][25]