Dan Hendrycks
Last reviewed
May 2, 2026
Sources
22 citations
Review status
Source-backed
Revision
v1 · 2,855 words
Dan Hendrycks (born 1994 or 1995) is an American machine learning researcher and the executive director of the Center for AI Safety, a San Francisco nonprofit he co-founded in 2022. He is best known as the lead author of the MMLU benchmark (2020), as a co-author of the GELU activation function (2016), and as the organizer of the May 2023 Statement on AI Risk, a single-sentence declaration on extinction risk signed by hundreds of AI researchers and executives. Hendrycks also advises xAI and Scale AI, accepting only token pay (one dollar a year at xAI, twelve dollars a year at Scale AI) and no equity in order to limit financial conflicts. His policy work includes helping shape California's SB 1047 frontier AI safety bill in 2024, and his benchmarks (MMLU, MATH, WMDP, Humanity's Last Exam) are among the most widely used evaluations in AI safety and capabilities research.
| Field | Details |
|---|---|
| Born | 1994 or 1995, Marshfield, Missouri |
| Alma mater | University of Chicago (BS, 2018); UC Berkeley (PhD, 2022) |
| Doctoral advisors | Jacob Steinhardt; Dawn Song |
| Known for | GELU activation, MMLU, MATH, WMDP, Humanity's Last Exam, Statement on AI Risk |
| Employer | Center for AI Safety (executive director) |
| Other roles | xAI (safety advisor, 2023 to present); Scale AI (advisor, 2024 to present); Gray Swan AI (unpaid advisor) |
| Fields | Machine learning, AI safety, robustness, machine ethics |
Hendrycks grew up in Marshfield, Missouri, in an evangelical Christian family. He was valedictorian of Marshfield High School in 2014. He earned a BS from the University of Chicago in 2018, then moved to UC Berkeley for graduate study in computer science, where his advisors were Jacob Steinhardt and Dawn Song. He completed his PhD in 2022, focusing on machine learning safety, robustness, and out-of-distribution generalization. His graduate work was supported by the NSF Graduate Research Fellowship and the Open Philanthropy AI Fellowship.
Hendrycks has said that the 80,000 Hours career-advice program, which is associated with the effective altruism movement, helped steer him toward AI safety, although he has declined to describe himself as an EA advocate.
Most of Hendrycks's foundational technical work was done as a Berkeley graduate student, in collaboration with co-authors including Steinhardt, Song, Collin Burns, Mantas Mazeika, Andy Zou, and Steven Basart. The early focus was on detecting failures in deep networks: the 2016 paper A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks (with Kevin Gimpel, ICLR 2017) introduced the maximum-softmax-probability baseline, which became a standard reference point for later out-of-distribution detection work. Subsequent papers on common-corruption robustness (ImageNet-C and ImageNet-P, ICLR 2019) and natural adversarial examples (ImageNet-A and ImageNet-O, CVPR 2021) gave the field standardized robustness benchmarks. By the end of his PhD he had also published the MMLU, MATH, and ETHICS benchmarks discussed below.
In 2022, Hendrycks co-founded the Center for AI Safety (CAIS) with Oliver Zhang, who is the managing director. CAIS is a 501(c)(3) nonprofit headquartered in San Francisco; its stated mission is to reduce societal-scale risks from AI through technical research, field-building, and policy advocacy. Under Hendrycks's leadership, CAIS has run the Intro to ML Safety course, operated a compute cluster used by external safety researchers, sponsored a philosophy fellowship on conceptual AI safety questions, and produced benchmarks including WMDP, HarmBench, and Humanity's Last Exam.
When Elon Musk founded xAI in July 2023, Hendrycks signed on as its safety advisor. According to contemporaneous reporting, the connection began when Hendrycks emailed xAI co-founder Igor Babuschkin to ask how the new lab planned to handle safety. To limit any appearance of financial conflict, Hendrycks accepted a token one-dollar annual salary and took no equity in xAI.
In November 2024, Hendrycks joined Scale AI as an advisor on a similar token-pay structure (twelve dollars per year, no equity). Scale AI had also been one of CAIS's collaborators on Humanity's Last Exam earlier that year.
Hendrycks was an early backer of Gray Swan AI, an AI security and red-teaming startup that publicly launched in July 2024. Gray Swan was founded by Carnegie Mellon faculty Matt Fredrikson and Zico Kolter together with PhD student Andy Zou, all of whom had collaborated with Hendrycks on research such as the Circuit Breakers alignment technique. Hendrycks initially held an equity stake in the company, but in July 2024, after critics raised conflict-of-interest questions tied to his SB 1047 advocacy, he announced on X that he was divesting his entire stake and continuing only as an unpaid advisor. He is not listed by Gray Swan as a co-founder.
The paper Gaussian Error Linear Units (GELUs), written by Hendrycks while he was an undergraduate, with Kevin Gimpel at the Toyota Technological Institute at Chicago, introduced an activation function defined as GELU(x) = x · Φ(x), where Φ is the standard Gaussian cumulative distribution function. Unlike ReLU, which gates inputs by their sign, GELU scales each input by the probability that a standard Gaussian variable falls below it, so the transition around zero is smooth rather than abrupt. Experiments in the paper showed gains over ReLU and ELU on vision, language, and speech tasks. GELU was later adopted as the default activation in BERT, the GPT family, vision transformers, and most subsequent transformer-based models, making it one of the most widely used nonlinearities in modern deep learning. The paper is posted on arXiv as 1606.08415.
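The definition above is short enough to write out directly. The following is a minimal sketch, assuming only Python's standard library and using the identity that the standard Gaussian CDF equals 0.5 · (1 + erf(x / √2)); the function names `standard_normal_cdf`, `gelu`, and `relu` are illustrative choices and are not taken from the paper's code.

```python
import math


def standard_normal_cdf(x: float) -> float:
    """Phi(x): probability that a standard Gaussian variable is <= x."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu(x: float) -> float:
    """GELU(x) = x * Phi(x), the exact (erf-based) form."""
    return x * standard_normal_cdf(x)


def relu(x: float) -> float:
    """ReLU for comparison: passes positive inputs, zeroes the rest."""
    return max(0.0, x)


if __name__ == "__main__":
    # Compare the two activations on a few sample inputs.
    for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
        print(f"x={x:+.1f}  relu={relu(x):.4f}  gelu={gelu(x):.4f}")
```

Where ReLU maps every negative input to zero, this form lets small negative inputs through at reduced magnitude: `gelu(-1.0)` is roughly -0.1587, and the output approaches the input itself as x grows large.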
A companion paper with Gimpel, A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks (arXiv 1610.02136, ICLR 2017), proposed using the maximum predicted softmax probability as a simple baseline for flagging unfamiliar inputs. The method became the standard reference baseline that nearly every later OOD detection method (ODIN, Mahalanobis, energy-based scores, and others) compares against.
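The baseline can be sketched in a few lines. This is an illustrative version under stated assumptions, not the authors' code: it uses only the standard library, the helper names are hypothetical, and the fixed `threshold` of 0.5 is an arbitrary value chosen for the demonstration (in practice such scores are usually compared with threshold-free metrics such as AUROC).

```python
import math
from typing import Sequence


def softmax(logits: Sequence[float]) -> list[float]:
    """Numerically stable softmax over a single example's class logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def max_softmax_probability(logits: Sequence[float]) -> float:
    """Confidence score: the largest predicted class probability."""
    return max(softmax(logits))


def flag_unfamiliar(logits: Sequence[float], threshold: float = 0.5) -> bool:
    """Flag an input as possibly misclassified or out-of-distribution.

    `threshold` is a hypothetical cutoff used only for this example.
    """
    return max_softmax_probability(logits) < threshold


if __name__ == "__main__":
    confident = [10.0, 0.0, 0.0]      # one class clearly dominates
    uncertain = [1.0, 1.0, 1.0, 1.0]  # probability spread evenly
    for name, logits in (("confident", confident), ("uncertain", uncertain)):
        score = max_softmax_probability(logits)
        print(f"{name}: score={score:.4f} flagged={flag_unfamiliar(logits)}")
```

The appeal of the method is that it needs nothing beyond the classifier's existing outputs, which is why it works as a baseline that more elaborate detectors can be measured against.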
With Thomas Dietterich, Hendrycks introduced ImageNet-C and ImageNet-P in Benchmarking Neural Network Robustness to Common Corruptions and Perturbations (ICLR 2019). ImageNet-C applies 15 types of synthetic corruption (blur, noise, weather, and digital artifacts) at five severity levels to ImageNet images; it is now standard for measuring image classifier robustness. The follow-up Natural Adversarial Examples paper (CVPR 2021) released ImageNet-A, a curated set of 7,500 unmodified natural images on which standard ImageNet classifiers collapse to near-random accuracy.
The paper Measuring Massive Multitask Language Understanding (arXiv 2009.03300), released on September 7, 2020 with Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, and Steinhardt, introduced MMLU. MMLU contains 15,908 multiple-choice questions across 57 subjects ranging from elementary mathematics and US history to professional law and microbiology. At release, GPT-3 175B scored 43.9 percent (chance is 25 percent) and human experts scored about 89.8 percent. MMLU became the default capability evaluation for large language models for several years and was downloaded more than 100 million times by mid-2024 before frontier models began saturating it above 88 percent.
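The 25 percent chance figure follows from each question having four answer options. As a rough illustration of how a score on such questions relates to that floor, here is a minimal sketch; the function name and the toy answer key and predictions are hypothetical and are not drawn from MMLU.

```python
from typing import Sequence

NUM_OPTIONS = 4                     # answer choices A, B, C, D
CHANCE_ACCURACY = 1 / NUM_OPTIONS   # expected score from uniform guessing


def multiple_choice_accuracy(
    predictions: Sequence[str], answers: Sequence[str]
) -> float:
    """Fraction of questions where the predicted letter matches the key."""
    if not answers:
        raise ValueError("answers must not be empty")
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)


# Toy data, invented for this example only.
answers = ["A", "C", "B", "D", "A", "B", "C", "D"]
predictions = ["A", "C", "D", "D", "B", "B", "C", "A"]

accuracy = multiple_choice_accuracy(predictions, answers)

if __name__ == "__main__":
    print(f"accuracy={accuracy:.3f}  chance={CHANCE_ACCURACY:.3f}")
```

Read against this floor, the 43.9 percent reported for GPT-3 175B was well above guessing but far below the estimated expert level.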
Measuring Mathematical Problem Solving With the MATH Dataset (arXiv 2103.03874, NeurIPS 2021), with Burns, Saurav Kadavath, Akul Arora, Basart, Eric Tang, Song, and Steinhardt, released a corpus of 12,500 high-school competition problems drawn from contests such as the AMC 10, AMC 12, and AIME. Each problem ships with a worked step-by-step solution. At release, the strongest models scored between 3 and 6.9 percent. The dataset later became standard for evaluating chain-of-thought and reasoning capabilities and is commonly referred to simply as the MATH benchmark.
Aligning AI With Shared Human Values (arXiv 2008.02275, ICLR 2021), with Burns, Basart, Andrew Critch, Jerry Li, Song, and Steinhardt, introduced the ETHICS dataset, which tests a model's ability to make moral judgments across justice, well-being, duties, virtues, and commonsense morality. ETHICS was one of the first widely used benchmarks for machine ethics in language models.
The Weapons of Mass Destruction Proxy benchmark, WMDP, was released in March 2024 (arXiv 2403.03218) by CAIS together with Scale AI and a consortium of more than twenty academic institutions. WMDP contains 3,668 multiple-choice questions that proxy hazardous knowledge in biosecurity, cybersecurity, and chemical security. The questions were filtered to remove anything operationally dangerous; they probe adjacent knowledge that correlates with weapons-relevant expertise. The same paper introduced RMU (Representation Misdirection for Unlearning), a method for stripping hazardous knowledge from a model while preserving general capability. WMDP is now a common evaluation in frontier model risk assessments and informed the dual-use questions in California's SB 1047.
Humanity's Last Exam (HLE), a joint project of CAIS and Scale AI released in early 2025 (arXiv 2501.14249), is a closed-ended academic benchmark of 2,500 expert-written questions across mathematics, physics, biology, medicine, humanities, computer science, chemistry, and engineering, with about 14 percent multimodal and 24 percent multiple-choice. Hendrycks has said the project began after a conversation with Elon Musk in which Musk argued that MMLU had become too easy. Questions were crowdsourced from subject-matter experts worldwide, screened by frontier LLMs as a difficulty filter, and reviewed in two human-expert rounds; a 500,000-dollar prize pool funded the strongest submissions. At release the best model scored under 10 percent. The benchmark was published in Nature.
In June 2023, Hendrycks and CAIS colleagues posted An Overview of Catastrophic AI Risks (arXiv 2306.12001), a long-form survey that organizes risks into four buckets: malicious use by humans, AI race dynamics that pressure labs to cut corners, organizational accidents in AI labs, and rogue AIs that pursue misaligned goals. The paper has become one of the most cited single references for the catastrophic-risk framing inside the AI safety literature. In 2024, Hendrycks published the textbook Introduction to AI Safety, Ethics, and Society (Routledge), based on the curriculum from his Intro to ML Safety course.
Other conceptual papers include X-Risk Analysis for AI Research (2022) and Natural Selection Favors AIs over Humans (March 2023), which argues that competitive selection pressures could push AI systems toward goals misaligned with human interests.
On May 30, 2023, CAIS released a one-sentence statement coordinated by Hendrycks:
"Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war."
The minimalist format was reportedly suggested by University of Cambridge researcher David Krueger as a way to attract a broad coalition without the disagreements that had bogged down earlier open letters. The statement carried more than 350 initial signatures and grew over the following weeks. Signatories included Geoffrey Hinton and Yoshua Bengio (the two most-cited AI researchers and Turing laureates), Sam Altman (OpenAI), Demis Hassabis (Google DeepMind), Dario Amodei (Anthropic), Ilya Sutskever, Stuart Russell, Bill Gates, Max Tegmark, Peter Singer, Ray Kurzweil, US Representative Ted Lieu, and Harvard professor emeritus Laurence Tribe. The statement marked the first time that the leaders of the three frontier US AI labs publicly endorsed the framing of AI as an existential risk, and it was widely covered by The New York Times, the BBC, and other outlets.
In 2024, Hendrycks and CAIS were closely involved in shaping California Senate Bill 1047, the Safe and Secure Innovation for Frontier Artificial Intelligence Models Act, authored by State Senator Scott Wiener. CAIS is generally credited as the bill's principal sponsor. SB 1047 would have required developers of frontier models (defined as those trained on more than 10^26 floating-point operations at a cost of over 100 million dollars) to perform safety testing, implement a kill switch, and report critical incidents. The bill passed both chambers of the California legislature with bipartisan votes; Hendrycks called the Assembly vote a "landmark moment for AI safety." Governor Gavin Newsom vetoed the bill in September 2024, arguing that it focused on model size rather than deployment context.
Hendrycks's role attracted scrutiny because of his then-equity stake in Gray Swan AI, a company that built tools for assessing AI risk. He divested that stake in July 2024 and continued as an unpaid Gray Swan advisor. SB 1047 was supported in writing by Hinton and Bengio.
Hendrycks co-authored NIST recommendations on AI risk management in early 2022 and has testified and briefed lawmakers in the United States, the United Kingdom, and at multilateral AI safety summits. CAIS has also organized policy workshops and published policy briefs on frontier model governance.
Hendrycks was named to the TIME 100 Most Influential People in AI in 2023, in the Thinkers category. He is an AI2050 Senior Fellow at Schmidt Sciences. His Google Scholar profile lists tens of thousands of citations, with the GELU and MMLU papers each among the most cited recent works in machine learning.
| Year | Title | Venue / arXiv | Notes |
|---|---|---|---|
| 2016 | Gaussian Error Linear Units (GELUs) | arXiv:1606.08415 | With Kevin Gimpel; introduced GELU, now used in BERT, GPT, ViT |
| 2016 | A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks | arXiv:1610.02136; ICLR 2017 | With Gimpel; standard OOD baseline |
| 2019 | Benchmarking Neural Network Robustness to Common Corruptions and Perturbations | ICLR 2019 | Introduced ImageNet-C, ImageNet-P |
| 2020 | Measuring Massive Multitask Language Understanding | arXiv:2009.03300; ICLR 2021 | Introduced MMLU; 57 subjects, 15,908 questions |
| 2020 | Aligning AI With Shared Human Values | arXiv:2008.02275; ICLR 2021 | Introduced ETHICS benchmark |
| 2021 | Natural Adversarial Examples | CVPR 2021 | Introduced ImageNet-A and ImageNet-O |
| 2021 | Measuring Mathematical Problem Solving With the MATH Dataset | arXiv:2103.03874; NeurIPS 2021 | Introduced MATH benchmark |
| 2022 | X-Risk Analysis for AI Research | arXiv:2206.05862 | Conceptual framework for safety research |
| 2023 | An Overview of Catastrophic AI Risks | arXiv:2306.12001 | Four-bucket risk taxonomy |
| 2023 | Natural Selection Favors AIs over Humans | arXiv:2303.16200 | Evolutionary argument for AI risk |
| 2024 | The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning | arXiv:2403.03218 | Introduced WMDP and RMU unlearning |
| 2024 | Improving Alignment and Robustness with Circuit Breakers | arXiv:2406.04313 | Representation-control defense, with Zou and Kolter |
| 2024 | Introduction to AI Safety, Ethics, and Society | Routledge (book) | Textbook based on his ML safety course |
| 2025 | Humanity's Last Exam | arXiv:2501.14249; Nature 2025 | With Scale AI; introduced HLE |
| Organization | Role | Years |
|---|---|---|
| University of Chicago | BS student | 2014 to 2018 |
| UC Berkeley | PhD student (advisors Steinhardt and Song) | 2018 to 2022 |
| Center for AI Safety | Co-founder, executive director | 2022 to present |
| xAI | Safety advisor (one-dollar salary, no equity) | 2023 to present |
| Schmidt Sciences AI2050 | Senior Fellow | 2024 to present |
| Gray Swan AI | Advisor (no equity after July 2024 divestment) | 2024 to present |
| Scale AI | Advisor (twelve-dollar salary, no equity) | November 2024 to present |