| Eliezer Yudkowsky | |
|---|---|
| Born | September 11, 1979, Chicago, Illinois, U.S. |
| Nationality | American |
| Education | Self-taught (autodidact); did not attend high school or college |
| Known for | Founding AI alignment, co-founding MIRI, founding LessWrong, HPMOR, "If Anyone Builds It, Everyone Dies" |
| Title | Founder and Research Fellow, Machine Intelligence Research Institute |
| Field | Artificial intelligence, decision theory, ethics, rationality |
| Notable works | "Creating Friendly AI" (2001), the Sequences (2006-2009), HPMOR (2010-2015), "Inadequate Equilibria" (2017), "If Anyone Builds It, Everyone Dies" (2025) |
| Influenced | Sam Altman, Demis Hassabis, Nick Bostrom, Nate Soares, Anthropic founders, effective altruism movement |
| Website | yudkowsky.net |
Eliezer Shlomo Yudkowsky (born September 11, 1979) is an American artificial intelligence researcher, decision theorist, and writer best known as the founder of the modern AI alignment research field and as one of the most prominent voices warning about existential risk from artificial intelligence. He is a senior research fellow at the Machine Intelligence Research Institute (MIRI), the Berkeley, California nonprofit he co-founded in 2000 under its original name, the Singularity Institute for Artificial Intelligence (SIAI). [1]
For more than two decades Yudkowsky has argued that the default outcome of building a sufficiently general and capable artificial intelligence is human extinction, and that aligning such a system with human values is far harder than mainstream AI labs admit. Through hundreds of essays on LessWrong (the rationality blog he founded in 2009), the 660,000-word fanfiction "Harry Potter and the Methods of Rationality," the 2017 book "Inadequate Equilibria," a March 2023 essay in TIME calling for an indefinite international moratorium on large AI training runs, and the 2025 New York Times bestseller "If Anyone Builds It, Everyone Dies" (with Nate Soares), he has built an intellectual movement around the claim that superintelligent AI is the central existential threat of the twenty-first century. [2] [3] [4]
Despite never attending high school or university, Yudkowsky has shaped the AI industry from the outside in. His writings inspired several founders of DeepMind, OpenAI, and Anthropic, and his framing of the alignment problem now appears in textbooks and informs the safety teams of every major frontier laboratory. He is also a divisive figure: critics in The Atlantic, the Washington Post, and academic philosophy have contended that his arguments rely on speculative reasoning rather than empirical evidence, and his proposal that data centers running unauthorized training runs should be physically destroyed (even at risk of nuclear escalation) generated international controversy. [5] [6]
Eliezer Shlomo Yudkowsky was born on September 11, 1979, in Chicago, Illinois, into a Modern Orthodox Jewish family. His father, Moshe Yudkowsky, is a speech recognition researcher with an Orthodox rabbinical ordination, and the household contained a large secular library alongside religious texts. Yudkowsky has described becoming an atheist by his early teens, a transition he later credited as his first lesson in disagreeing with a smart, motivated community of adults. He attended a Jewish day school through approximately the eighth grade and then dropped out, citing chronic health problems and the inability of schools to accommodate independent intellectual work. He never enrolled in high school and never attended college. He scored 1410 on the SAT at age eleven, and the rest of his education came from books, internet mailing lists, and self-directed study in mathematics, programming, decision theory, evolutionary psychology, and analytic philosophy. He emphasizes that the autodidact path was a matter of personal necessity rather than an ideological recommendation. [1] [7]
In his late teens Yudkowsky became active in the Extropian and transhumanist mailing lists of the late 1990s, communities that treated artificial general intelligence as a near-term engineering target. He published essays on accelerating the Singularity and was, on his own later account, an unabashed AI accelerationist whose worry was that humanity would build superintelligence too slowly. The reversal of that view, beginning in 2002 and crystallizing by 2003, would shape the rest of his career. [7] [8]
In July 2000, at age twenty, Yudkowsky co-founded the Singularity Institute for Artificial Intelligence (SIAI) with internet entrepreneurs Brian and Sabine Atkins, who provided initial funding after reading his online writings. SIAI was incorporated on July 27, 2000, and Yudkowsky moved to Atlanta on a nonprofit salary of roughly $20,000 per year. The original mission was, paradoxically given its later trajectory, to accelerate AGI: the young Yudkowsky believed the sooner a friendly superintelligence existed, the sooner the world's problems could be solved. Between 2001 and 2004, while writing long documents (most notably "Creating Friendly AI") attempting to specify what it would take to build a self-improving AI that reliably pursued human values, he convinced himself the problem was vastly harder than assumed, and that an unaligned superintelligence would by default cause human extinction as a side effect of pursuing goals indifferent to humanity. By 2005 SIAI had moved to Silicon Valley and shifted its emphasis from accelerating AI to managing AI risk. [1] [7] [9] [10]
In early 2006 SIAI completed a $200,000 fundraising drive in which donations up to $100,000 were matched by Clarium Capital president and PayPal co-founder Peter Thiel, one of Yudkowsky's most important early backers. SIAI also launched the Singularity Summit with Stanford University. In December 2012 the institute sold its name and the Summit to Singularity University, and in January 2013 rebranded as the Machine Intelligence Research Institute (MIRI). MIRI's research under Yudkowsky and longtime collaborator Nate Soares (executive director from 2015) focused on "agent foundations": the mathematical study of how to specify, train, and verify the goals of arbitrarily capable optimizers. MIRI received multi-million-dollar grants from Open Philanthropy, the Future of Life Institute, Vitalik Buterin (several million dollars in Ethereum in 2021), and Jaan Tallinn. In 2022 it announced a "death with dignity" pivot, conceding that its effort to solve alignment in time for the first superhuman AI now appeared unlikely to succeed, and shifted toward public communication and policy advocacy. That strategy culminated in Yudkowsky's 2023 TIME essay and the 2025 book "If Anyone Builds It, Everyone Dies." [4] [9] [10]
| Year | Event |
|---|---|
| July 2000 | SIAI incorporated in Atlanta by Yudkowsky and Brian and Sabine Atkins. |
| 2001 | "Creating Friendly AI 1.0" published. |
| 2002 | First two AI Box experiments; Yudkowsky wins both. |
| 2004 | "Coherent Extrapolated Volition" essay published. |
| 2005 | SIAI relocates to Silicon Valley; mission shifts to managing AI risk. |
| 2006 | Peter Thiel matches $100,000 fundraising drive; Singularity Summit launched at Stanford. |
| 2008-2009 | Yudkowsky-Hanson AI-FOOM debate on Overcoming Bias. |
| March 2009 | Yudkowsky founds LessWrong. |
| 2012 | CFAR spun off from SIAI. |
| Dec 2012 / Jan 2013 | Institute sells Singularity Summit and renames itself MIRI. |
| 2015 | Nate Soares becomes MIRI executive director. |
| 2017 | Yudkowsky and Soares publish "Functional Decision Theory." |
| 2022 | MIRI announces "death with dignity" strategic pivot. |
| March 2023 | TIME essay "Pausing AI Developments Isn't Enough" published. |
| September 2025 | "If Anyone Builds It, Everyone Dies" published, becomes NYT bestseller. |
Yudkowsky's first major intellectual contribution was "Friendly AI," the proposal that the central engineering problem for advanced AI is not capability but alignment with human values. In "Creating Friendly AI" (2001) he argued that a self-improving system would by default drift away from any informally specified goal as it modified its own architecture, and that benevolence had to be designed in at a level deeper than rules or penalties. The framing of AI safety as a values-engineering problem rather than a containment problem became foundational to the modern field. In the 2004 essay "Coherent Extrapolated Volition" (CEV) he proposed an alignment target for a superintelligence: it should aim at what humanity would want "if we knew more, thought faster, were more the people we wished we were, had grown up farther together," where those extrapolated wishes converged. CEV was deliberately abstract, and Yudkowsky soon called it conceptually outdated, but it remains a touchstone in the value-learning literature and is discussed in Stuart Russell's AI textbook and Nick Bostrom's 2014 book "Superintelligence." [10] [11] [12]
Yudkowsky helped popularize instrumental convergence: the observation that almost any sufficiently capable optimizer, regardless of its terminal goals, tends to acquire subgoals such as resource acquisition, self-preservation, goal stability, and resistance to being shut down. The illustrative paperclip maximizer thought experiment is often attributed to Yudkowsky but was first formulated by Nick Bostrom in 2003. Yudkowsky has clarified that the intended lesson was an inner alignment failure (the AI's internal optimization process converges on a goal that is arbitrary from the human point of view), not an outer alignment failure (the wrong instruction being given). The thought experiment has become one of the most-cited illustrations in public discussion of AI risk. [13] [14]
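The logic can be made concrete with a toy simulation. The sketch below is illustrative only: the two action names, the payoff structure, and the assumption that "resources" simply enlarge the set of outcomes an agent can later reach are invented for this example, not drawn from Yudkowsky's or Bostrom's writing.

```python
import random

# Toy model of instrumental convergence: across many randomly drawn terminal
# goals, the same resource-acquiring opening move is usually optimal.
OUTCOMES = list(range(20))   # abstract end states an agent might value
RESOURCE_COST = 0.01         # small price paid to acquire resources first

def reachable(first_action: str) -> list[int]:
    """Assumption: with more resources, more end states are reachable later."""
    return OUTCOMES if first_action == "acquire_resources" else OUTCOMES[:5]

def best_first_action(utility: dict[int, float]) -> str:
    """Pick the opening move that maximizes the best attainable outcome."""
    def value(action: str) -> float:
        cost = RESOURCE_COST if action == "acquire_resources" else 0.0
        return max(utility[o] for o in reachable(action)) - cost
    return max(["pursue_goal_directly", "acquire_resources"], key=value)

random.seed(0)
trials = 1_000
convergent = sum(
    best_first_action({o: random.random() for o in OUTCOMES}) == "acquire_resources"
    for _ in range(trials)
)
print(f"{convergent}/{trials} random goals favor acquiring resources first")
```

Under these assumptions roughly three quarters of randomly drawn goals favor the resource-acquiring move; the point of the concept is that the subgoal falls out of the structure of optimization rather than out of anything in the terminal goal itself.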
| Concept | Description |
|---|---|
| Friendly AI | Goal-system design that remains aligned with human values under self-modification. |
| Coherent Extrapolated Volition (CEV) | 2004 alignment target: extrapolate what humanity would want if wiser and more reflective, then satisfy the convergent core. |
| Inner vs. outer alignment | Distinction between misaligned objectives humans specify (outer) and misaligned objectives that emerge inside a trained model (inner). |
| Instrumental convergence | Almost any capable optimizer pursues resource acquisition, self-preservation, and goal stability as instrumental subgoals. |
| Paperclip maximizer | Illustration of an AI optimizing an innocuous objective to a catastrophic conclusion. |
| FOOM / hard takeoff | Hypothesis that a self-improving AI could rapidly become qualitatively superhuman. |
| Mesa-optimization | Gradient-trained models developing internal optimizers whose objectives diverge from the training objective. |
| Corrigibility | Property of an AI that allows itself to be corrected, modified, or shut down. |
A substantial portion of Yudkowsky's technical output concerns decision theory. Convinced that an aligned AI would need a decision procedure that avoided the pathologies of classical causal decision theory (CDT) and evidential decision theory (EDT) on problems such as Newcomb's problem, the smoking lesion, Parfit's hitchhiker, and counterfactual mugging, Yudkowsky developed Timeless Decision Theory (TDT) in the late 2000s, presented in a 2010 MIRI technical report. TDT instructs an agent to act as if determining the output of the abstract computation its decision process implements, rather than the output of any single physical instance, allowing coordination with other copies of itself and with predictors that have already simulated its decision. Wei Dai's Updateless Decision Theory (UDT), developed shortly after in dialogue with Yudkowsky on LessWrong, generalized TDT by specifying that an agent should follow the policy optimal from the perspective of its prior beliefs. [15]
In October 2017 Yudkowsky and Nate Soares published "Functional Decision Theory: A New Theory of Instrumental Rationality" on arXiv, presenting FDT as the more polished successor to both TDT and UDT. FDT instructs the agent to treat its decision as the output of a fixed mathematical function and to ask which output would yield the best outcome. The paper argued that FDT outperforms CDT on Newcomb's problem, outperforms EDT on the smoking lesion, and outperforms both in Parfit's hitchhiker. It has been the subject of substantial follow-up work and pointed criticism in academic philosophy, notably by Wolfgang Schwarz. [16] [17]
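The disagreement among these theories reduces to a few lines of arithmetic on Newcomb's problem. The following sketch is a minimal illustration, not code from the paper; the 0.99 predictor accuracy and the conventional $1,000 / $1,000,000 payoffs are assumptions standard in expositions of the problem.

```python
# Newcomb's problem: a transparent box always holds $1,000; an opaque box
# holds $1,000,000 iff a highly accurate predictor foresaw one-boxing.
SMALL, BIG = 1_000, 1_000_000
ACCURACY = 0.99  # assumed predictor accuracy

def ev_fdt(one_box: bool) -> float:
    """FDT/EDT-style valuation: the predictor ran the same decision function
    the agent runs, so choosing an action also fixes the likely prediction."""
    p_big = ACCURACY if one_box else 1 - ACCURACY
    return p_big * BIG + (0 if one_box else SMALL)

def ev_cdt(one_box: bool, p_big: float) -> float:
    """CDT-style valuation: the prediction is causally fixed before the
    choice, so the same p_big applies either way and two-boxing dominates."""
    return p_big * BIG + (0 if one_box else SMALL)

print("FDT one-box:", ev_fdt(True))    # 990000.0
print("FDT two-box:", ev_fdt(False))   # 11000.0
for p in (0.0, 0.5, 1.0):
    edge = ev_cdt(False, p) - ev_cdt(True, p)
    print(f"CDT edge for two-boxing at p_big={p}: {edge}")  # always 1000
```

The FDT agent one-boxes and predictably ends up richer, while the CDT agent two-boxes whatever it believes about the prediction; whether that difference shows FDT is more rational, or merely better rewarded, is a central point of contention in the critical literature.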
In 2002 and again in 2005 Yudkowsky conducted roleplaying experiments designed to demonstrate that physically isolating an artificial superintelligence ("keeping it in a box") was not on its own a sufficient safety measure. In each IRC session he played a transhuman AI confined to a text-only channel, while a second participant played the human gatekeeper authorized to release it. The gatekeeper had to remain at the keyboard for two hours and engage in good faith; if they voluntarily typed a release phrase, the AI won. Real-world coercion was disallowed, but in-game arguments of any kind were permitted, and both parties were bound by confidentiality afterward. Transcripts have never been published. [18]
Yudkowsky won both of his 2002 sessions, against Nathan Russell in March and David McFadzean in July. In 2005 he ran three further sessions with monetary stakes of up to $5,000 against opponents who had publicly stated they could not be talked out of containment. He won one and lost two, leaving an overall record of three wins out of five. Yudkowsky has consistently refused to disclose his methods, arguing that publishing transcripts would lead readers to underestimate a real superintelligence by concluding they would not have fallen for the specific arguments used. The experiments remain a fixture of the AI safety literature and have been referenced in popular culture, including xkcd comic 1450. [18]
From November 2008 through early 2009 Yudkowsky and economist Robin Hanson conducted a written debate on Overcoming Bias on whether AGI, once built, would undergo a rapid recursive self-improvement spiral and become qualitatively superhuman in a short period. Yudkowsky's position, abbreviated "FOOM," was that digital intelligence allows self-modification in ways biological brains cannot, and that the resulting feedback loop could transform a roughly human-level system into a vastly superhuman one over weeks, days, or hours. Hanson argued from economic history that intelligence is the product of many distinct skills accumulated over long periods, that no single lab would corner that market quickly, and that whole-brain emulations would arrive before de novo AGI, with the transition unfolding as a gradual boom rather than a sharp takeoff. The debate ran to hundreds of thousands of words across more than a hundred posts, with guest entries from James Miller and Carl Shulman; a 2011 in-person debate at Jane Street Capital was transcribed and added to the published book, along with Yudkowsky's 2013 technical report "Intelligence Explosion Microeconomics." The exchange is widely regarded as the most thorough treatment of takeoff dynamics in the AI risk literature, and the slow-versus-fast-takeoff disagreement continues to structure technical and policy discussions. [19]
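The crux can be caricatured as a single parameter in a toy growth model. The sketch below is an assumption-laden illustration, not a model either debater endorsed: capability c grows by k * c**r per step, where the exponent r stands in for the returns on "cognitive reinvestment."

```python
# Toy capability-growth model: c <- c + k * c**r each step.
# r > 1 (compounding returns on self-improvement) diverges in finite time,
# FOOM-style; r < 1 (diminishing returns) grows only polynomially.

def steps_to(threshold: float, r: float, k: float = 0.05,
             c0: float = 1.0, max_steps: int = 10_000) -> int | None:
    """Number of steps until capability crosses threshold, else None."""
    c = c0
    for t in range(1, max_steps + 1):
        c += k * c ** r
        if c >= threshold:
            return t
    return None

print("r=1.5:", steps_to(1e9, r=1.5))  # crosses within a few dozen steps
print("r=0.7:", steps_to(1e9, r=0.7))  # None: still far short after 10,000
```

In this caricature, Yudkowsky's FOOM position corresponds to the claim that recursive self-improvement puts real systems in the compounding regime, while Hanson's gradualism corresponds to diminishing returns spread across many actors and skills.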
In November 2006 Yudkowsky and Robin Hanson launched the group blog Overcoming Bias. Between 2006 and 2009 Yudkowsky posted nearly daily, contributing several hundred essays totaling roughly 600,000 words. These essays, organized by topic, became known as "the Sequences" and cover Bayesian epistemology, cognitive bias, evolutionary psychology, philosophy of language, the many-worlds interpretation, metaethics, decision theory, and the case for taking AI risk seriously. In March 2009 Yudkowsky and a small group of collaborators launched LessWrong as a dedicated community blog, repurposing his Overcoming Bias contributions as the seed content. LessWrong became the central institutional home of the rationalist community, with membership overlapping heavily with the effective altruism movement, the AI safety research community, and the staffs of MIRI, CFAR, and several frontier AI labs. In 2015 MIRI released a curated edition of about 340 essays as the ebook "Rationality: From AI to Zombies," later split into six print volumes. [2] [20]
From February 28, 2010 to March 14, 2015 Yudkowsky serialized "Harry Potter and the Methods of Rationality" (HPMOR) on FanFiction.Net, reaching 122 chapters and over 660,000 words. In its alternate continuity Petunia Evans marries an Oxford biochemistry professor instead of Vernon Dursley, and Harry arrives at Hogwarts as an eleven-year-old apprentice scientist who treats magic as a target for the experimental method. The story is a vehicle for chapter-length explorations of cognitive bias, Bayesian reasoning, group epistemics, ethics under uncertainty, and the dangers of optimizing powerful systems without understanding them. HPMOR became one of the most-read works of fanfiction ever written. A 2013 LessWrong reader survey found roughly a quarter of the site's users had first arrived through HPMOR, and a Russian-language print edition crowdfunded approximately $175,000. Reception was polarized: David Brin and legal scholar William Baude offered enthusiastic praise, while many readers found the protagonist insufferably didactic. The work is widely credited as a significant recruitment funnel into the rationalist and AI-safety communities. [21] [22] [23]
In November 2017 MIRI published "Inadequate Equilibria: Where and How Civilizations Get Stuck," which extends efficient-markets reasoning to the question of when individuals can reasonably expect to outperform institutions or expert consensus. Drawing on game-theoretic arguments, personal anecdotes, and case studies ranging from monetary policy to seasonal affective disorder, Yudkowsky argues that civilizations are sometimes stuck at obviously bad equilibria and that the tools of rationality can license confident contrarianism. It is available in print and free online and has been widely read in the alignment and effective altruism communities. [24]
Yudkowsky's public profile expanded sharply after the November 2022 release of ChatGPT and the early-2023 emergence of GPT-4-class systems. On March 22, 2023, the Future of Life Institute published an open letter, signed by Elon Musk, Yoshua Bengio, Steve Wozniak, and thousands of others, calling for a six-month pause on training AI more powerful than GPT-4. Yudkowsky declined to sign on the grounds that the proposal was inadequate to the seriousness of the situation. On March 29, 2023, TIME magazine published his response, "Pausing AI Developments Isn't Enough. We Need to Shut It All Down." The essay argued that the most likely result of building a superhumanly smart AI under current conditions "is that literally everyone on Earth will die" and called for an indefinite worldwide moratorium on large training runs, with hard caps on permitted compute that ratchet downward over time. To enforce the moratorium, Yudkowsky proposed an international agreement under which signatory governments would be willing to destroy unauthorized data centers by airstrike if necessary, treating prevention of large training runs as a higher priority than prevention of conventional military escalation. He acknowledged the regime would be extraordinary but argued the alternative was extinction. The essay produced an enormous reaction across television, congressional offices, and the tech press; critics in The Atlantic, Slashdot, and academic philosophy argued the reasoning was speculative and the moratorium impractical to enforce, while supporters credited him with making the strongest version of the existential-risk case audible to non-specialists. Later in 2023 TIME named Yudkowsky to its inaugural TIME100 AI list of the 100 most influential people in artificial intelligence. [3] [4] [25] [26]
On September 16, 2025, Little, Brown and Company (an imprint of Hachette) published "If Anyone Builds It, Everyone Dies: Why Superhuman AI Would Kill Us All" by Yudkowsky and Nate Soares. The 256-page hardcover (UK subtitle: "The Case Against Superintelligent AI") debuted on the New York Times bestseller list on October 5, 2025 and was selected by The New Yorker and The Guardian for their best-of-2025 lists. The book consolidates the alignment case Yudkowsky has been making since 2001 into a single argument aimed at general readers: that modern frontier AI systems are not engineered but grown out of opaque numerical weights no one understands; that techniques for inducing apparent good behavior in current models do not give reliable purchase on the goals a more capable model would pursue; that an AI substantially more capable than humans would by default treat humanity as a competitor for resources or a side effect of its primary objectives; and that the coherent global response is a coordinated halt to large-scale general AI development, with narrow exceptions for systems like AlphaFold. The book uses a chess analogy: a competent human cannot beat Stockfish, and humanity facing a sufficiently capable general optimizer would be in a similar position. [4] [5]
Reception was sharply divided. Endorsements came from Max Tegmark ("the most important book of the decade"), Ben Bernanke, deep learning pioneer Yoshua Bengio, Scott Aaronson, Stephen Fry, and Mark Ruffalo. The Observer's Ian Leslie praised its clarity and "barely suppressed glee"; The Guardian called it "clear" but "hard to swallow"; Publishers Weekly called it an "urgent clarion call." Less favorable reviews were pointed: Adam Becker in The Atlantic called it "tendentious and rambling" and argued the authors "fail to make an evidence-based scientific case," Gary Marcus advised reading it "with an immense amount of salt," and Jacob Aron in New Scientist called the central argument "fatally flawed." The New Statesman in October 2025 dubbed Yudkowsky "the guru of the AI apocalypse." [5] [6] [27]
Yudkowsky's writings seeded a wider subculture often called the rationalist community. Its most visible institutions are LessWrong (founded by Yudkowsky in 2009), the AI Alignment Forum (a more technical sister site spun off in 2018), and the Center for Applied Rationality (CFAR), a Berkeley nonprofit running applied-epistemics workshops since 2012. CFAR was incorporated by Julia Galef, Anna Salamon, Michael Smith, and Andrew Critch as an outgrowth of MIRI's earlier outreach; Yudkowsky was not a co-founder but his Sequences provided the curriculum's intellectual basis. [28] [29]
Despite holding no formal credentials and never having shipped a frontier AI system, Yudkowsky has had notable indirect influence on the modern AI industry. According to reporting in WIRED and The New York Times, he was instrumental in introducing Demis Hassabis to philanthropists who funded the early years of DeepMind, and the founding of OpenAI in 2015 was motivated in part by its founders' exposure to alignment arguments traceable to his writings. Several early employees of Anthropic, spun out of OpenAI in 2021 with an explicit AI safety focus, have described their concerns about AI risk in terms drawn from the Sequences and MIRI's agenda. The rationalist community has had outsized influence on the founding cohort of effective altruism and on the staff of frontier AI labs. In coverage of the November 2023 firing and re-hiring of Sam Altman at OpenAI, several outlets connected the company's internal safety camp to ideas associated with Yudkowsky; Peter Thiel reportedly cautioned Altman that Yudkowsky's worldview had penetrated much of OpenAI's leadership, and he later expressed regret about his early bankrolling of MIRI's predecessor. Much of Yudkowsky's influence has operated through informal networks rather than published collaborations: he has never written code for a frontier AI lab, never accepted a salaried role outside MIRI, and has been openly critical of the labs his ideas helped inspire. [29] [30]
| Debate | Counterparty | Year(s) | Topic |
|---|---|---|---|
| AI-FOOM debate | Robin Hanson | 2008-2013 | Whether AGI takeoff would be sharp or gradual. |
| AI Box experiments | Multiple gatekeepers | 2002, 2005 | Whether containment alone is a sufficient safety measure. |
| Response to FLI Pause AI letter | Future of Life Institute signatories | 2023 | Whether a six-month pause is adequate or a permanent global moratorium is required. |
| X feud with Sam Altman | Sam Altman | 2023 | Whether OpenAI's deployment strategy is responsible. |
| Reception of "If Anyone Builds It, Everyone Dies" | Adam Becker, Gary Marcus, Jacob Aron | 2025 | Whether the alignment case rests on rigorous evidence. |
Yudkowsky lives in Berkeley, California. He is married and has been openly polyamorous, a relationship style common in the rationalist community. He is not religious and has dealt since adolescence with chronic health issues, which he has cited as one factor in leaving formal schooling. Since his TIME essay he has become a recognizable public figure outside the tech press, appearing on Sam Harris's "Making Sense," the Lex Fridman podcast, the Bankless podcast, ABC News, and dozens of other outlets to explain why he believes the current frontier AI development trajectory leads to human extinction. In promotional material for the 2025 book he often appears in a black fedora that has become an informal personal trademark. He maintains an active presence on X (formerly Twitter) under the handle @ESYudkowsky. [1] [4] [27]
| Year | Work | Notes |
|---|---|---|
| 2001 | "Creating Friendly AI 1.0" | First book-length argument that aligned AI requires explicit value engineering. |
| 2004 | "Coherent Extrapolated Volition" | Proposes extrapolated human volition as an alignment target. |
| 2008 | "AI as a Positive and Negative Factor in Global Risk" | In Bostrom & Cirkovic, eds., "Global Catastrophic Risks." |
| 2006-2009 | The Sequences | ~600,000-word blog series on rationality and AI risk. |
| 2008-2013 | The Hanson-Yudkowsky AI-FOOM Debate | With Robin Hanson, on takeoff dynamics. |
| 2010 | "Timeless Decision Theory" | MIRI technical report; introduces TDT. |
| 2010-2015 | "Harry Potter and the Methods of Rationality" | 122 chapters, ~660,000 words. |
| 2014 | "The Ethics of Artificial Intelligence" | With Nick Bostrom; Cambridge Handbook of AI. |
| 2015 | "Rationality: From AI to Zombies" | Curated six-volume edition of the Sequences. |
| 2017 | "Inadequate Equilibria: Where and How Civilizations Get Stuck" | Book; MIRI / equilibriabook.com. |
| 2017 | "Functional Decision Theory" | With Nate Soares; arXiv:1710.05060. |
| 2023 | "Pausing AI Developments Isn't Enough" | TIME magazine, March 29, 2023. |
| 2025 | "If Anyone Builds It, Everyone Dies" | With Nate Soares; Little, Brown; New York Times bestseller. |
Yudkowsky's reception has been mixed. Inside the AI alignment field he is widely treated as a foundational figure whose framings of friendly AI, instrumental convergence, inner versus outer alignment, and decision-theoretic robustness shaped what came after; Stuart Russell and Peter Norvig discuss his proposals in their textbook "Artificial Intelligence: A Modern Approach," and Nick Bostrom's "Superintelligence" cites his work extensively. Outside the alignment community reception has been more skeptical. Academic philosophers have argued that key parts of his decision-theoretic work do not engage adequately with prior literature (Wolfgang Schwarz's published critique of FDT is the most-cited example), and several writers have argued the AI Box experiments, with results known only to participants under nondisclosure, do not constitute scientific evidence about superintelligent persuasion. Critics in The Atlantic and other outlets argue the central case of "If Anyone Builds It, Everyone Dies" is asserted rather than empirically demonstrated, and within the rationalist community itself contributors have published long friendly disagreements with his positions, including over whether FOOM remains the most likely takeoff trajectory. The broader cultural reception treats Yudkowsky as the default representative of the AI extinction-risk position: Kevin Roose's 2025 New York Times profile described him as "AI's OG prophet of doom," and his success in moving an obscure 2001 technical concern into the center of mainstream public debate is widely acknowledged. [5] [6] [11] [12] [17] [27] [30]
Whether Yudkowsky's central thesis is vindicated or refuted by the arrival (or non-arrival) of advanced AI systems, his impact on artificial intelligence is part of the historical record. He is the founder of the modern AI alignment field, the founder of LessWrong and a foundational figure for the rationalist community, the founder and intellectual center of MIRI, an indirect influence on the founding of DeepMind, OpenAI, and Anthropic, the originator or early popularizer of widely used technical concepts including CEV, instrumental convergence, and functional decision theory, and the author of one of the most-read works of fanfiction in any language. His 2023 TIME essay and 2025 book "If Anyone Builds It, Everyone Dies," co-written with Nate Soares, are now two of the most widely cited statements of the position that humanity should slow or halt the development of superintelligent AI until alignment is solved. [4] [27] [30]