Dan Hendrycks

Updated Jun 22, 2026

Dan Hendrycks (born 1994 or 1995) is an American machine learning researcher and executive director of the Center for AI Safety, the San Francisco nonprofit he co-founded in 2022.^[1]^[2] He is the lead author of the MMLU and MATH benchmarks, a co-creator of Humanity's Last Exam, and co-author of the GELU activation function used in BERT, the GPT family, and many other transformers.^[1]^[3]^[11] He also organized the May 2023 Statement on AI Risk, a single-sentence declaration that mitigating extinction risk from AI should be a global priority, signed by hundreds of AI researchers and executives.^[1]^[12] Hendrycks advises xAI on safety for a symbolic one-dollar annual salary and advises Scale AI for twelve dollars per year, taking no equity at either company to limit financial conflicts.^[19] His policy work includes helping shape California's SB 1047 frontier AI safety bill in 2024, and his evaluations (MMLU, MATH, WMDP, Humanity's Last Exam) are among the most widely used benchmarks in AI safety and capabilities research.^[1]^[17]


Born	1994 or 1995, Marshfield, Missouri
Alma mater	University of Chicago (BS, 2018); UC Berkeley (PhD, 2022)
Doctoral advisors	Jacob Steinhardt; Dawn Song
Known for	GELU activation, MMLU, MATH, WMDP, Humanity's Last Exam, Statement on AI Risk
Employer	Center for AI Safety (executive director)
Other roles	xAI (safety advisor, 2023 to present); Scale AI (advisor, 2024 to present); Gray Swan AI (unpaid advisor)
Fields	Machine learning, AI safety, robustness, machine ethics

Who is Dan Hendrycks?

Dan Hendrycks is one of the most prolific benchmark designers in machine learning and a leading public voice on catastrophic AI risk. His Google Scholar profile lists tens of thousands of citations, with the GELU and MMLU papers each among the most cited recent works in the field.^[3] He has both built tools that measure how capable AI systems are (MMLU, MATH, ImageNet-C, Humanity's Last Exam) and helped popularize the framing of advanced AI as an existential risk, most visibly through the 2023 Statement on AI Risk.^[1]^[12]

Early life and education

Hendrycks grew up in Marshfield, Missouri, in a Christian evangelical family.^[1] He was valedictorian of Marshfield High School in 2014.^[1] He earned a BS from the University of Chicago in 2018, then moved to UC Berkeley for graduate study in computer science, where his advisors were Jacob Steinhardt and Dawn Song.^[1]^[3] He completed his PhD in 2022, focusing on machine learning safety, robustness, and out-of-distribution generalization.^[1]^[3] His graduate work was supported by the NSF Graduate Research Fellowship and the Open Philanthropy AI Fellowship.^[3]

Hendrycks has said that the 80,000 Hours career-advice program, which is associated with the effective altruism movement, helped steer him toward AI safety, although he has declined to describe himself as an EA advocate.^[1]

Career

Berkeley research (2018 to 2022)

Most of Hendrycks's foundational technical work was done as a Berkeley graduate student, in collaboration with co-authors including Steinhardt, Song, Collin Burns, Mantas Mazeika, Andy Zou, and Steven Basart.^[3] His earliest work, which predates his move to Berkeley, focused on detecting failures in deep networks: the 2016 paper A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks (with Kevin Gimpel, ICLR 2017) introduced the maximum-softmax-probability baseline that became the standard reference point for the out-of-distribution detection literature.^[5] Subsequent papers on common-corruption robustness (ImageNet-C and ImageNet-P, ICLR 2019) and natural adversarial examples (ImageNet-A and ImageNet-O, CVPR 2021) gave the field standardized robustness benchmarks.^[3] By the end of his PhD he had also published the MMLU, ETHICS, and MATH benchmarks discussed below.^[6]^[7]^[8]

Why did Hendrycks found the Center for AI Safety?

In 2022, Hendrycks co-founded the Center for AI Safety (CAIS) with Oliver Zhang, who is the managing director.^[2]^[14] CAIS is a 501(c)(3) nonprofit headquartered in San Francisco; its stated mission is to reduce societal-scale risks from AI through technical research, field-building, and policy advocacy.^[2]^[14] Under Hendrycks's leadership, CAIS has run the Intro to ML Safety course, operated a compute cluster used by external safety researchers, sponsored a philosophy fellowship on conceptual AI safety questions, and produced benchmarks including WMDP, HarmBench, and Humanity's Last Exam.^[2]^[14]

xAI safety advisor (2023 to present)

When Elon Musk founded xAI in July 2023, Hendrycks signed on as its safety advisor.^[1]^[19] According to contemporaneous reporting, the connection began when Hendrycks emailed xAI co-founder Igor Babuschkin to ask how the new lab planned to handle safety.^[19] To limit any appearance of financial conflict, Hendrycks accepted a token one-dollar annual salary and took no equity in xAI.^[19]

Scale AI advisor (2024 to present)

In November 2024, Hendrycks joined Scale AI as an advisor on a similar token-pay structure (twelve dollars per year, no equity).^[19] Announcing the role, Hendrycks joked that "This advisory role pays me 12x my xAI role, namely $12 a year instead of $1."^[19] Scale AI had also been one of CAIS's collaborators on Humanity's Last Exam earlier that year.^[11]^[19]

Gray Swan AI advisor (2024 to present)

Hendrycks was an early backer of Gray Swan AI, an AI security and red-teaming startup that publicly launched in July 2024.^[21] Gray Swan was founded by Carnegie Mellon faculty Matt Fredrikson and Zico Kolter together with PhD student Andy Zou, all of whom had collaborated with Hendrycks on research such as the Circuit Breakers alignment technique.^[20]^[21] Hendrycks initially held an equity stake in the company, but in July 2024, after critics raised conflict-of-interest questions tied to his SB 1047 advocacy, he announced on X that he was divesting his entire stake and continuing only as an unpaid advisor.^[20] He is not listed by Gray Swan as a co-founder.^[20]

Research contributions

What is the GELU activation function? (2016)

The paper Gaussian Error Linear Units (GELUs), written by Hendrycks as an undergraduate with Kevin Gimpel of the Toyota Technological Institute at Chicago, introduced an activation function defined as GELU(x) = x · Φ(x), where Φ is the standard Gaussian cumulative distribution function.^[4] Unlike ReLU, which gates inputs by sign, GELU scales each input by the probability that a standard Gaussian variable falls below it, so the gate opens smoothly as the input grows.^[4] Experiments in the paper showed gains over ReLU and ELU on vision, language, and speech tasks.^[4] GELU was later adopted as the default activation in BERT, the GPT family, vision transformers, and many subsequent transformer-based models, making it one of the most widely used nonlinearities in modern deep learning.^[4] The paper is posted on arXiv as 1606.08415.^[4]
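The definition above (the input multiplied by the standard Gaussian CDF evaluated at that input) can be sketched in a few lines of Python. This is a minimal illustration that writes the CDF in terms of the error function; the function names and the sample inputs are chosen here for demonstration and are not taken from the paper's code.

```python
import math


def standard_normal_cdf(x: float) -> float:
    """Phi(x): probability that a standard Gaussian variable is <= x."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu(x: float) -> float:
    """Exact GELU: the input scaled by Phi(x)."""
    return x * standard_normal_cdf(x)


def relu(x: float) -> float:
    """ReLU for comparison: a hard gate on the sign of the input."""
    return max(0.0, x)


# Illustrative inputs only: compare the hard gate with the smooth one.
for value in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(f"x={value:+.1f}  relu={relu(value):.4f}  gelu={gelu(value):.4f}")
```

For large positive inputs the two functions nearly coincide, while small negative inputs pass through GELU with a small negative value instead of being zeroed.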

Out-of-distribution detection (2016 to 2017)

A companion paper with Gimpel, A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks (arXiv 1610.02136, ICLR 2017), proposed using the maximum predicted softmax probability as a simple baseline for flagging unfamiliar inputs.^[5] The method became the standard reference baseline that nearly every later OOD detection method (ODIN, Mahalanobis, energy-based scores, and others) compares against.^[5]
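The baseline described above can be sketched as follows: convert a classifier's logits to softmax probabilities, take the largest one as a confidence score, and flag inputs whose score is low. This is a minimal sketch rather than the authors' code; the function names and the 0.5 threshold are hypothetical, and in practice a threshold would be tuned on held-out data.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one example's logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def max_softmax_probability(logits: Sequence[float]) -> float:
    """Confidence score: the largest predicted class probability."""
    return max(softmax(logits))


def looks_unfamiliar(logits: Sequence[float], threshold: float = 0.5) -> bool:
    """Flag an input when its confidence falls below the threshold.

    The default threshold is an arbitrary illustrative value.
    """
    return max_softmax_probability(logits) < threshold


# Hypothetical logits: a confident prediction and a flat, uncertain one.
confident_logits = [10.0, 0.0, 0.0, 0.0]
uncertain_logits = [0.0, 0.0, 0.0, 0.0]
print(looks_unfamiliar(confident_logits), looks_unfamiliar(uncertain_logits))
```

Later detection methods are typically judged by how much they improve on this score at separating familiar from unfamiliar inputs.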

Robustness benchmarks (2019 to 2021)

With Thomas Dietterich, Hendrycks introduced ImageNet-C and ImageNet-P in Benchmarking Neural Network Robustness to Common Corruptions and Perturbations (ICLR 2019).^[3] ImageNet-C applies 15 types of synthetic corruption (blur, noise, weather, and digital artifacts) at five severity levels to ImageNet images; it is now standard for measuring image classifier robustness.^[3] The follow-up Natural Adversarial Examples paper (CVPR 2021) released ImageNet-A, a curated set of 7,500 unmodified natural images on which standard ImageNet classifiers collapse to near-random accuracy.^[3]

What is the MMLU benchmark? (2020)

The paper Measuring Massive Multitask Language Understanding (arXiv 2009.03300), released on September 7, 2020 with Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, and Steinhardt, introduced MMLU.^[6] MMLU contains 15,908 multiple-choice questions across 57 subjects ranging from elementary mathematics and US history to professional law and microbiology.^[6]^[16] At release, GPT-3 175B scored 43.9 percent (chance is 25 percent) and human experts scored about 89.8 percent.^[6] MMLU became the default capability evaluation for large language models for several years and was downloaded more than 100 million times by mid-2024 before frontier models began saturating it above 88 percent.^[16]
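The scores quoted above are plain multiple-choice accuracies, and the 25 percent chance level implies four answer options per question. A minimal scoring sketch under that assumption is shown here; the function names and the toy answer lists are hypothetical and are not drawn from the MMLU dataset or its evaluation code.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], answers: Sequence[str]) -> float:
    """Fraction of questions where the predicted option matches the key."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("at least one question is required")
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)


def margin_over_chance(acc: float, n_choices: int = 4) -> float:
    """How far an accuracy sits above uniform random guessing."""
    return acc - 1.0 / n_choices


# Toy data for illustration only (not real benchmark items).
toy_answers = ["A", "C", "B", "D"]
toy_predictions = ["A", "C", "D", "D"]
print(f"toy accuracy: {accuracy(toy_predictions, toy_answers):.2%}")

# GPT-3's reported 43.9 percent, expressed relative to the chance level.
print(f"margin over chance: {margin_over_chance(0.439):.1%}")
```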

MATH benchmark (2021)

Measuring Mathematical Problem Solving With the MATH Dataset (arXiv 2103.03874, NeurIPS 2021), with Burns, Saurav Kadavath, Akul Arora, Basart, Eric Tang, Song, and Steinhardt, released a corpus of 12,500 high-school competition problems drawn from contests such as the AMC 10, AMC 12, and AIME.^[8] Each problem ships with a worked step-by-step solution.^[8] At release, the models evaluated scored between 3 and 6.9 percent.^[8] The dataset later became standard for evaluating chain-of-thought and reasoning capabilities and is commonly referred to simply as the MATH benchmark.^[8]

ETHICS benchmark (2020 to 2021)

Aligning AI With Shared Human Values (arXiv 2008.02275, ICLR 2021), with Burns, Basart, Andrew Critch, Jerry Li, Song, and Steinhardt, introduced the ETHICS dataset, which tests a model's ability to make moral judgments spanning justice, well-being, duties, virtues, and commonsense morality.^[7] ETHICS was one of the first widely used benchmarks for machine ethics in language models.^[7]

WMDP benchmark and unlearning (2024)

The Weapons of Mass Destruction Proxy benchmark, WMDP, was released in March 2024 (arXiv 2403.03218) by CAIS together with Scale AI and a consortium of more than twenty academic institutions.^[10] WMDP contains 3,668 multiple-choice questions that proxy hazardous knowledge in biosecurity, cybersecurity, and chemical security.^[10] The questions were filtered to remove anything operationally dangerous; they probe adjacent knowledge that correlates with weapons-relevant expertise.^[10] The same paper introduced RMU (Representation Misdirection for Unlearning), a method for stripping hazardous knowledge from a model while preserving general capability.^[10] WMDP is now a common evaluation in frontier model risk assessments and informed the dual-use questions in California's SB 1047.^[10]^[17]

What is Humanity's Last Exam? (2025)

Humanity's Last Exam (HLE), a joint project of CAIS and Scale AI released in early 2025 (arXiv 2501.14249), is a closed-ended academic benchmark of 2,500 expert-written questions across mathematics, physics, biology, medicine, humanities, computer science, chemistry, and engineering, with about 14 percent multimodal and 24 percent multiple-choice.^[11]^[15] It drew on nearly 1,000 contributors from more than 500 institutions across 50 countries, most of them active researchers or professors.^[11]^[15] Hendrycks has said the project began after a conversation with Elon Musk in which Musk argued that MMLU had become too easy.^[15] Questions were crowdsourced from subject-matter experts, screened against frontier LLMs as a difficulty filter, and reviewed in two human-expert rounds; a 500,000-dollar prize pool rewarded the strongest submissions.^[11]^[15] At release the best model scored under 10 percent, and as of 2026 frontier systems had not exceeded roughly 40 percent, illustrating how much harder HLE is than the benchmarks it replaced.^[11]^[23] The benchmark was published in Nature.^[11]

Catastrophic AI risk writings

In June 2023, Hendrycks and CAIS colleagues posted An Overview of Catastrophic AI Risks (arXiv 2306.12001), a long-form survey that organizes risks into four buckets: malicious use by humans, AI race dynamics that pressure labs to cut corners, organizational accidents in AI labs, and rogue AIs that pursue misaligned goals.^[9] The paper has become one of the most cited single references for the catastrophic-risk framing inside the AI safety literature.^[9] In 2024, Hendrycks published the textbook Introduction to AI Safety, Ethics, and Society (Routledge), based on the curriculum from his Intro to ML Safety course.^[1]

Other conceptual papers include X-Risk Analysis for AI Research (2022) and Natural Selection Favors AIs over Humans (March 2023), which argues that competitive selection pressures could push AI systems toward goals misaligned with human interests.^[1]

What is the Statement on AI Risk? (May 2023)

On May 30, 2023, CAIS released a one-sentence statement coordinated by Hendrycks:^[12]^[13]

"Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war."

The minimalist format was reportedly suggested by University of Cambridge researcher David Krueger as a way to attract a broad coalition without the disagreements that had bogged down earlier open letters.^[12] The statement carried more than 350 initial signatures and grew over the following weeks.^[12]^[13] Signatories included Geoffrey Hinton and Yoshua Bengio (the two most-cited AI researchers and Turing laureates), Sam Altman (OpenAI), Demis Hassabis (Google DeepMind), Dario Amodei (Anthropic), Ilya Sutskever, Stuart Russell, Bill Gates, Max Tegmark, Peter Singer, Ray Kurzweil, US Representative Ted Lieu, and Harvard professor emeritus Laurence Tribe.^[12]^[13] The statement marked the first time that the leaders of the three frontier US AI labs publicly endorsed the framing of AI as an existential risk, and it was widely covered by The New York Times, the BBC, and other outlets.^[12]

Policy work

SB 1047 in California (2024)

In 2024, Hendrycks and CAIS were closely involved in shaping California Senate Bill 1047, the Safe and Secure Innovation for Frontier Artificial Intelligence Models Act, authored by State Senator Scott Wiener.^[17] CAIS is generally credited as the bill's principal sponsor.^[17] SB 1047 would have required developers of frontier models (defined as those trained using more than 10^26 integer or floating-point operations at a cost of over 100 million dollars) to perform safety testing, implement a kill switch, and report critical incidents.^[17] The bill passed both chambers of the California legislature with bipartisan votes; Hendrycks called the assembly vote a "landmark moment for AI safety."^[17] Governor Gavin Newsom vetoed the bill in September 2024, arguing that it focused on model size rather than deployment context.^[17]

Hendrycks's role attracted scrutiny because of his then-equity stake in Gray Swan AI, a company that built tools for assessing AI risk.^[20] He divested that stake in July 2024 and continued as an unpaid Gray Swan advisor.^[20] SB 1047 was supported in writing by Hinton and Bengio.^[17]

Federal and international advocacy

Hendrycks co-authored NIST recommendations on AI risk management in early 2022 and has testified before and briefed lawmakers in the United States and the United Kingdom, as well as at multilateral AI safety summits.^[1]^[3] CAIS has also organized policy workshops and published policy briefs on frontier model governance.^[2]

Recognition

Hendrycks was named to the TIME 100 Most Influential People in AI in 2023, in the Thinkers category.^[18] He is an AI2050 Senior Fellow at Schmidt Sciences.^[22]

Selected publications

Year	Title	Venue / arXiv	Notes
2016	Gaussian Error Linear Units (GELUs)	arXiv:1606.08415	With Kevin Gimpel; introduced GELU, now used in BERT, GPT, ViT
2016	A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks	arXiv:1610.02136; ICLR 2017	With Gimpel; standard OOD baseline
2019	Benchmarking Neural Network Robustness to Common Corruptions and Perturbations	ICLR 2019	Introduced ImageNet-C, ImageNet-P
2020	Measuring Massive Multitask Language Understanding	arXiv:2009.03300; ICLR 2021	Introduced MMLU; 57 subjects, 15,908 questions
2020	Aligning AI With Shared Human Values	arXiv:2008.02275; ICLR 2021	Introduced ETHICS benchmark
2021	Natural Adversarial Examples	CVPR 2021	Introduced ImageNet-A and ImageNet-O
2021	Measuring Mathematical Problem Solving With the MATH Dataset	arXiv:2103.03874; NeurIPS 2021	Introduced MATH benchmark
2022	X-Risk Analysis for AI Research	arXiv:2206.05862	Conceptual framework for safety research
2023	An Overview of Catastrophic AI Risks	arXiv:2306.12001	Four-bucket risk taxonomy
2023	Natural Selection Favors AIs over Humans	arXiv:2303.16200	Evolutionary argument for AI risk
2024	The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning	arXiv:2403.03218	Introduced WMDP and RMU unlearning
2024	Improving Alignment and Robustness with Circuit Breakers	arXiv:2406.04313	Representation-control defense, with Zou and Kolter
2024	Introduction to AI Safety, Ethics, and Society	Routledge (book)	Textbook based on his ML safety course
2025	Humanity's Last Exam	arXiv:2501.14249; Nature 2025	With Scale AI; introduced HLE

Positions held

Organization	Role	Years
University of Chicago	BS student	2014 to 2018
UC Berkeley	PhD student (advisors Steinhardt and Song)	2018 to 2022
Center for AI Safety	Co-founder, executive director	2022 to present
xAI	Safety advisor (one-dollar salary, no equity)	2023 to present
Schmidt Sciences AI2050	Senior Fellow	2024 to present
Gray Swan AI	Advisor (no equity after July 2024 divestment)	2024 to present
Scale AI	Advisor (twelve-dollar salary, no equity)	November 2024 to present

References

1. Wikipedia, "Dan Hendrycks," https://en.wikipedia.org/wiki/Dan_Hendrycks
2. Center for AI Safety, "About Us," https://safe.ai/about
3. Dan Hendrycks, personal site at UC Berkeley EECS, https://people.eecs.berkeley.edu/~hendrycks/
4. Hendrycks, D., and Gimpel, K., "Gaussian Error Linear Units (GELUs)," arXiv:1606.08415, 2016, https://arxiv.org/abs/1606.08415
5. Hendrycks, D., and Gimpel, K., "A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks," arXiv:1610.02136, ICLR 2017, https://arxiv.org/abs/1610.02136
6. Hendrycks, D., et al., "Measuring Massive Multitask Language Understanding," arXiv:2009.03300, 2020, https://arxiv.org/abs/2009.03300
7. Hendrycks, D., et al., "Aligning AI With Shared Human Values," arXiv:2008.02275, ICLR 2021, https://arxiv.org/abs/2008.02275
8. Hendrycks, D., et al., "Measuring Mathematical Problem Solving With the MATH Dataset," arXiv:2103.03874, NeurIPS 2021, https://arxiv.org/abs/2103.03874
9. Hendrycks, D., et al., "An Overview of Catastrophic AI Risks," arXiv:2306.12001, 2023, https://arxiv.org/abs/2306.12001
10. Li, N., Pan, A., Gopal, A., et al., "The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning," arXiv:2403.03218, 2024, https://arxiv.org/abs/2403.03218
11. Phan, L., et al., "Humanity's Last Exam," arXiv:2501.14249, 2025, https://arxiv.org/abs/2501.14249
12. Wikipedia, "Statement on AI Risk," https://en.wikipedia.org/wiki/Statement_on_AI_Risk
13. Center for AI Safety, "Statement on AI Risk," https://aistatement.com/
14. Wikipedia, "Center for AI Safety," https://en.wikipedia.org/wiki/Center_for_AI_Safety
15. Wikipedia, "Humanity's Last Exam," https://en.wikipedia.org/wiki/Humanity%27s_Last_Exam
16. Wikipedia, "MMLU," https://en.wikipedia.org/wiki/MMLU
17. Wikipedia, "Safe and Secure Innovation for Frontier Artificial Intelligence Models Act" (SB 1047), https://en.wikipedia.org/wiki/Safe_and_Secure_Innovation_for_Frontier_Artificial_Intelligence_Models_Act
18. *TIME*, "Dan Hendrycks: The 100 Most Influential People in AI 2023," https://time.com/collection/time100-ai/6309050/dan-hendrycks/
19. *Fortune*, "Dan Hendrycks, Elon Musk's AI safety advisor, adds role at Scale AI," November 13, 2024, https://fortune.com/2024/11/13/scale-ai-dan-hendrycks-elon-musk-xai-safety-trump-ties/
20. Gray Swan AI, "Statement on SB-1047 and Founders," https://www.grayswan.ai/blog/sb1047
21. Gray Swan AI, "Gray Swan AI Launch," July 16, 2024, https://www.grayswan.ai/blog/gray-swan-launch
22. Schmidt Sciences AI2050, "Dan Hendrycks Fellow profile," https://ai2050.schmidtsciences.org/fellow/dan-hendrycks/
23. Scale AI, "Scale AI and CAIS Unveil Results of Humanity's Last Exam," https://scale.com/blog/humanitys-last-exam-results
