OpenAI for Science
Last reviewed: Jun 7, 2026
Sources: 20 citations
Review status: Source-backed
Revision: v1 · 2,085 words
OpenAI for Science was a research initiative and internal team at OpenAI, announced on September 3, 2025, whose stated mission was to build "the next great scientific instrument: an AI-powered platform that accelerates scientific discovery." The effort was led by Kevin Weil, who stepped down as OpenAI's chief product officer to run it as vice president of AI for science. It sat within the broader push to apply frontier large language models such as GPT-5 to research problems in mathematics, physics, biology, and other fields, and it produced the Prism research workspace and the GPT-Rosalind life-sciences model. The team was short-lived: OpenAI began decentralizing it into other research groups in April 2026, when Weil announced his departure. The initiative drew attention both for concrete, externally verified results and for episodes of overstatement, most notably an October 2025 claim about solving Erdős problems that mathematicians said misrepresented what the model had done.
OpenAI for Science was OpenAI's attempt to formalize what it called "AI for science": pairing its reasoning models with the tools, workflows, and expert collaborations needed to make working scientists more productive. Rather than positioning the model as an oracle that produces discoveries on its own, the public framing settled on the idea of AI as a research collaborator that helps a domain expert get unstuck, find overlooked references, sketch proofs, or propose hypotheses. Kevin Weil summarized the team's ambition with a widely quoted line: "I think 2026 will be for science what 2025 was for software engineering," a comparison to the prior year's surge in AI coding tools.
The initiative was a small, dedicated group with a specific charter around scientific discovery, distinct from OpenAI's general research organization. It is closely associated with two deliverables, the Prism workspace and the GPT-Rosalind model, plus a series of published case studies in which OpenAI argued its models had contributed to real research.
Weil announced the team on September 3, 2025, writing that he was "starting something new inside OpenAI" called OpenAI for Science. He moved off the consumer product organization, which passed to Fidji Simo, the former Instacart chief executive who had joined OpenAI as CEO of applications in mid-2025; OpenAI's ChatGPT product leadership, including Nick Turley, reported to Simo after the change. Weil's pitch leaned on his own background: he completed roughly two-thirds of a PhD in particle physics at Stanford before leaving for product roles at Twitter and Instagram, so the move was framed as a return to science.
Weil said he planned to hire a "small team" of academics who were "world-class" in their fields, "completely AI-pilled," and "great science communicators." Researchers publicly associated with the science work include Alexander Wei, the OpenAI researcher who in July 2025 announced that an experimental reasoning model had reached gold-medal-level performance on the International Mathematical Olympiad, and Sébastien Bubeck, a prominent former Microsoft researcher who joined OpenAI in 2024 and who promoted several of the early mathematics results. The team's formal standing inside OpenAI lasted only into 2026 before it was broken up.
The core goal was to compress the time scientists spend on tasks that a capable model can assist with: literature search across decades of papers, checking and tightening proofs, proposing experiments, and analyzing data. OpenAI repeatedly cited capability benchmarks as evidence of readiness; it reported that GPT-5.2 scored 92% on GPQA, a set of graduate-level questions in biology, physics, and chemistry, up from 39% for GPT-4. Weil argued that a model that has effectively "read substantially every paper written in the last 30 years" can surface connections a single researcher would miss.
A recurring theme was epistemic caution. After early overstatements, OpenAI described work on "epistemological humility," meaning training models to present findings as suggestions rather than definitive answers, and on using a model to fact-check its own output through critic-style workflows. Weil also said internally that the longer-term aim was a more autonomous research agent later in the decade, while characterizing the near-term mission as acceleration rather than independent breakthroughs.
OpenAI published a set of case studies, "Early science acceleration experiments with GPT-5," in which it described specific contributions by the model and attributed them to named researchers. Several were corroborated by the scientists themselves.
In mathematics, mathematician Terence Tao reported that GPT-5.2 Pro solved an Erdős problem "more or less autonomously." The most prominent clean result came on May 20, 2026, when OpenAI said an internal reasoning model had disproved a planar unit-distance conjecture that Paul Erdős first posed in 1946, finding what OpenAI described as "an entirely new family of constructions" that outperforms the long-assumed square-grid arrangement. Crucially, OpenAI released the work alongside companion remarks from external mathematicians, including Noga Alon, Melanie Wood, and Thomas Bloom, who verified the proof; Bloom commented that "AI is helping us to more fully explore the cathedral of mathematics we have built over the centuries."
The initiative was a vehicle for OpenAI's frontier models rather than a separate model itself. The case studies ran on GPT-5 and its successors, and the products built by the team plugged those models into scientist-facing tools.
Prism, launched on January 27, 2026, is an AI-assisted workspace for writing scientific papers. OpenAI described it as deeply integrated with GPT-5.2: it assesses claims, revises prose, searches prior research, supports LaTeX, and can turn whiteboard sketches into diagrams using the model's vision capability. It was offered free to anyone with a ChatGPT account and framed as augmentation rather than an autonomous research conductor. OpenAI said ChatGPT was already receiving roughly 8.4 million weekly messages on advanced hard-science topics, though it noted it could not say how many came from professional researchers.
GPT-Rosalind, announced on April 16, 2026 and named after Rosalind Franklin, was a frontier reasoning model aimed at life-sciences research: drug discovery, genomics, and protein reasoning. OpenAI said it could plan multi-step workflows and call external tools such as AlphaFold 3, reported a score of 0.751 Pass@1 on the BixBench bioinformatics benchmark, and gated it behind a trusted-access program with a safety review. Early collaborators named by OpenAI included Amgen, Moderna, Novo Nordisk, Thermo Fisher Scientific, and the Allen Institute. These efforts connect OpenAI for Science to OpenAI's wider generative AI product line and to its GPT-5.2 generation of models.
The clearest cautionary episode came in October 2025. Weil posted, then deleted, a claim that "GPT-5 found solutions to 10 (!) previously unsolved Erdős problems and made progress on 11 others." Thomas Bloom, who maintains the Erdős Problems website, called the post "a dramatic misrepresentation," explaining that listing a problem as "open" only meant he was personally unaware of a paper solving it; GPT-5 had located existing references, at least one written in German, rather than producing new solutions. Google DeepMind chief executive Demis Hassabis called the episode "embarrassing," and Meta chief AI scientist Yann LeCun mocked it. Bubeck conceded that "only solutions in the literature were found," while arguing that literature search is itself hard. The incident became a reference point for AI-hype skepticism, and OpenAI's later messaging, including the May 2026 Erdős result with pre-arranged external verification, was visibly more measured.
Independent assessments stayed mixed. Andy Cooper of the University of Liverpool said, "We have not found, yet, that LLMs are fundamentally changing the way that science is done," and physicist Jonathan Oppenheim flagged a peer-reviewed paper in which a GPT-5 idea tested the wrong thing, comparing it to confusing two different diagnostic tests. The strongest endorsements, such as that of Scott Aaronson, whose paper with Witteveen on black-box amplification in QMA relied on a key step supplied by GPT-5, came wrapped in caveats about the model's need for expert supervision.
The initiative's organizational fate underscored the uncertainty. On April 17, 2026, Weil announced his last day at OpenAI, saying OpenAI for Science was "being decentralized into other research teams"; Sora creator Bill Peebles left the same day amid a broader OpenAI consolidation around enterprise products and away from what the company called "side quests." OpenAI for Science thus ended its run as a standalone team less than a year after it began, leaving behind Prism, GPT-Rosalind, a contested but partly verified record of scientific contributions, and an ongoing debate about how much frontier models can genuinely accelerate discovery.
| Item | Detail | Source |
|---|---|---|
| Announced | September 3, 2025 | PYMNTS; Kevin Weil (X) |
| Leader | Kevin Weil, VP of AI for Science (former CPO) | MIT Tech Review; PYMNTS |
| Mission | "The next great scientific instrument: an AI-powered platform that accelerates scientific discovery" | Kevin Weil (X) |
| Notable team members | Alexander Wei (IMO gold), Sébastien Bubeck (ex-Microsoft) | Techmeme; The Information |
| Prism workspace | Launched January 27, 2026; GPT-5.2; free with ChatGPT account; LaTeX support | TechCrunch |
| GPT-Rosalind | Life-sciences reasoning model, announced April 16, 2026; 0.751 BixBench Pass@1 | OpenAI; Fierce Biotech; Euronews |
| Verified math result | Disproof of a 1946 Erdős planar unit-distance conjecture, May 20, 2026; verified by Alon, Wood, Bloom | TechCrunch; technology.org |
| Quantum result | GPT-5 supplied a key step in Aaronson and Witteveen, "Limits to black-box amplification in QMA," arXiv October 2, 2025 | Quantum Insider; OfficeChai |
| October 2025 controversy | Weil's deleted claim of 10 "unsolved" Erdős problems; Bloom: "a dramatic misrepresentation"; Hassabis: "embarrassing" | TechCrunch; TechBuzz |
| Wind-down | Decentralized into other teams; Weil and Bill Peebles departed April 17, 2026 | TechCrunch; CNBC |