Nate Soares
Last reviewed
Apr 28, 2026
Sources
32 citations
Review status
Source-backed
Revision
v1 · 3,289 words
Nate Soares (sometimes credited as Nathan Soares) is an American computer scientist and AI safety researcher who serves as president of the Machine Intelligence Research Institute (MIRI), a nonprofit research organization in Berkeley, California, focused on the technical and policy challenges of AI alignment and existential risk from advanced AI.[1][2] Born in 1989, Soares trained as a software engineer and worked at Microsoft and Google before joining MIRI as a research fellow in 2014. He became MIRI's executive director in May 2015, succeeding Luke Muehlhauser, and transitioned to the title of president in 2023 when long-time chief operating officer Malo Bourgon assumed the role of CEO.[3][4]
Soares is the author of a substantial body of technical writing on decision theory, value learning, and the agent-foundations approach to alignment, including the 2014 paper that introduced the term "AI alignment" into wider research use, the 2016 paper Logical Induction (with Scott Garrabrant and others), and the 2017 paper Functional Decision Theory (with Eliezer Yudkowsky).[5][6][7] In 2025 he co-authored If Anyone Builds It, Everyone Dies: Why Superhuman AI Would Kill Us All with Yudkowsky, which became a New York Times bestseller and brought MIRI's pessimistic view of superintelligence to a mainstream audience.[8][9]
Under Soares's leadership, MIRI has shifted strategic focus several times: from foundational mathematical research in the mid-2010s, to a partly nondisclosed research program from 2017 onward, and from 2022 to a public communications and policy stance organized around an international moratorium on training larger AI systems.[10][11] Soares argues that current methods for building general-purpose AI are extremely unlikely to produce systems humans can control, and he has put the probability of AI-caused human extinction at high double-digit percentages.[9][12]
Soares was born in 1989 and completed a Bachelor of Science in computer science and economics at George Washington University in 2011.[2] During and immediately after his undergraduate studies he held technical positions in government and government-adjacent work: a research-associate role at the National Institute of Standards and Technology, where he wrote software to help architects upgrade building designs to meet stricter energy standards, and a contracting role for the U.S. Department of Defense, building software tools used by the National Defense University.[2]
He then moved into the technology industry. At Microsoft he was a software development engineer on Office 2013, working on user-interface details including the status bar, message bar visuals, high-contrast and high-DPI display modes, icon and media rendering, and parts of the ribbon interface.[2] He subsequently joined Google as a software engineer on Google Compute Engine and on tooling that integrated the Vim text editor with Google's internal development environment.[2]
While at Google, Soares grew increasingly interested in the technical and philosophical challenges of building general-purpose AI. He attended his first MIRI workshop in December 2013, was offered a position shortly afterward, and left Google in 2014 to join MIRI full time.[3][4]
Soares joined MIRI as a research fellow in 2014. The institute, then under executive director Luke Muehlhauser, was formalizing its "agent foundations" research agenda: a set of mathematical and decision-theoretic problems that MIRI argued any sufficiently capable AI system would have to solve in order to behave reliably.[5][13]
In his first year, Soares authored or co-authored about a dozen papers, six of them written specifically for MIRI's technical research agenda.[3] He also produced the institute's research guide for prospective alignment researchers, attended five external conferences, and presented at three of them, including a workshop at Purdue University in September 2014 and the Future of Life Institute's Puerto Rico conference in January 2015.[14] Two of his early collaborations with Benja Fallenstein, Aligning Superintelligence with Human Interests: A Technical Research Agenda (2014), which helped popularize the term "AI alignment," and Toward Idealized Decision Theory (2014), set the framing for years of subsequent MIRI work.[5][15]
MIRI announced in May 2015 that Soares would succeed Muehlhauser as executive director, with the transition taking effect on 31 May 2015.[3][4] Muehlhauser, who had run the institute since 2011, left to take a research position at GiveWell. Soares said his immediate priority was expanding the research team and announced that Jessica Taylor would join MIRI full time in August 2015.[3] He had previously co-founded startups, run a F.I.R.S.T. Robotics team, and served as president of an Entrepreneur's Club, but he described the MIRI role as his first formal professional leadership position.[4]
During Soares's first three years as executive director, MIRI concentrated on the foundational mathematical agenda set out in its 2014 and 2015 technical reports. Major outputs included the 2015 AAAI workshop paper Corrigibility, co-authored with Fallenstein, Yudkowsky, and Stuart Armstrong, which formalized the problem of building AI systems that cooperate with attempts to correct or shut them down, and the 2016 Logical Induction paper, with Scott Garrabrant, Tsvi Benson-Tilsen, Andrew Critch, and Jessica Taylor, which proposed a computable algorithm for assigning probabilities to logical sentences.[6][16]
In 2017 Soares and Yudkowsky published Functional Decision Theory: A New Theory of Instrumental Rationality, which formally described a decision theory that treats an agent's choice as the output of a fixed mathematical function rather than as a one-off causal event. They argued that functional decision theory outperforms causal and evidential decision theory on a range of problems involving prediction, commitment, and self-similar agents.[7] Soares also published The Value Learning Problem (2015, revised 2018), surveying approaches to building AI systems that learn human values rather than having those values specified directly.[17]
In April 2017 MIRI published a strategy update announcing a shift toward exploratory directions that it did not plan to share publicly until late 2017 at the earliest. By November 2018, the institute formalized this stance: most internal research would be "nondisclosed by default," with outputs released only when there was a specific reason to publish.[10] Soares argued that some MIRI research was sufficiently close to capabilities work that the marginal harm of accelerating timelines outweighed the marginal benefit of public scrutiny.[10] In December 2020 Soares wrote in MIRI's annual update that the institute's 2017 research push "has, at this point, largely failed, in the sense that neither Eliezer nor I have sufficient hope in it for us to continue focusing our main efforts there."[10] During this period MIRI continued to publish smaller technical pieces and host the Risks from Learned Optimization in Advanced Machine Learning Systems paper (Hubinger, van Merwijk, Mikulik, Skalse, Garrabrant, 2019), which introduced mesa-optimization into the inner alignment literature.[18]
In April 2022 Yudkowsky published the LessWrong post "MIRI announces new 'Death With Dignity' strategy," which framed the institute's near-term work as raising the probability that humanity would "die with slightly more dignity" given his low estimated odds of solving alignment in time. Soares endorsed the post's underlying analysis in later commentary, arguing that the most useful alignment work might involve waiting for human cognitive enhancement to make the problems tractable.[11][19]
A more concrete shift was announced in October 2023, when MIRI made public a leadership reshuffle that had been piloted in February 2023 and finalized in June 2023. Bourgon, the long-serving COO, became CEO; Soares moved from executive director to president; Yudkowsky became chair of the board; Alex Vermeer became COO; Jimmy Rintjema became CFO; and Lisa Thiergart and Gretta Duleba joined as research and communications managers.[20] The post stated that "policy and communications will be a higher priority for MIRI than research going forward."[20]
MIRI's 2024 Mission and Strategy Update, published in January 2024, organized this position around three priorities: international policy work aimed at halting development of frontier general-purpose AI until alignment is better understood, communications aimed at making the institute's analysis legible to non-technical audiences, and a reduced but ongoing technical research program supporting the first two priorities.[21] In 2025 the institute's communications work culminated in Soares and Yudkowsky's book If Anyone Builds It, Everyone Dies.[8][9]
Soares's published research falls into several overlapping clusters.
The 2014 paper Aligning Superintelligence with Human Interests: A Technical Research Agenda, co-written with Fallenstein, set out the agent-foundations program that defined MIRI's direction for most of the 2010s. Its companion Annotated Bibliography (2014) organized then-existing AI safety literature, and the 2015 update Agent Foundations for Aligning Machine Intelligence with Human Interests re-framed the same broad program.[5][22]
Toward Idealized Decision Theory (Soares and Fallenstein, 2014) argued that existing causal and evidential decision theories failed in environments where an agent's choices were predictable in advance or where its decision algorithm was directly observable. Functional Decision Theory (Yudkowsky and Soares, 2017) supplied a positive proposal: model an agent's decision as the output of a function whose value is computed by anyone who reasons about the agent, rather than as a one-off causal event.[7][15] Soares also co-wrote Cheating Death in Damascus (Levinstein and Soares, 2020), which extended functional decision theory to cases involving adversarial prediction.[2]
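The contrast described above can be illustrated with a small numerical sketch. It uses Newcomb's problem, a standard prediction puzzle that this article does not itself name, with illustrative payoffs of $1,000,000 and $1,000; the function names and the accuracy parameter are hypothetical and are not taken from the papers. The sketch only contrasts act-level reasoning, which holds the predictor's guess fixed, with policy-level reasoning, which treats the guess as depending on the agent's decision function. It does not separate functional from evidential decision theory, which agree on this particular problem.

```python
BIG = 1_000_000   # opaque box contents if one-boxing was predicted (illustrative)
SMALL = 1_000     # transparent box contents, always present (illustrative)
ACTIONS = ("one-box", "two-box")


def payoff(action: str, predicted: str) -> int:
    """Payoff for taking `action` when the predictor foresaw `predicted`."""
    opaque = BIG if predicted == "one-box" else 0
    return opaque + (SMALL if action == "two-box" else 0)


def policy_value(policy: str, accuracy: float) -> float:
    """Expected payoff when the prediction matches the policy with probability `accuracy`."""
    other = "two-box" if policy == "one-box" else "one-box"
    return accuracy * payoff(policy, policy) + (1 - accuracy) * payoff(policy, other)


def policy_level_choice(accuracy: float) -> str:
    """Choose the decision function's output, letting the prediction depend on it."""
    return max(ACTIONS, key=lambda p: policy_value(p, accuracy))


def act_level_choice(prob_predicted_one_box: float) -> str:
    """Choose an action while holding the already-made prediction fixed."""
    def value(action: str) -> float:
        return (prob_predicted_one_box * payoff(action, "one-box")
                + (1 - prob_predicted_one_box) * payoff(action, "two-box"))
    return max(ACTIONS, key=value)


if __name__ == "__main__":
    print(policy_level_choice(0.99))  # reasoning over policies
    print(act_level_choice(0.5))      # reasoning over acts with fixed box contents
```

With these payoffs, act-level reasoning picks two-boxing for any fixed belief about the prediction, because taking the second box always adds $1,000. Policy-level reasoning picks one-boxing once the predictor's accuracy exceeds (BIG + SMALL) / (2 × BIG), which is 0.5005 here.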
The Value Learning Problem (Soares, 2015) examined the difficulty of building AI systems that learn their operators' values rather than having those values specified directly. Corrigibility (Soares, Fallenstein, Yudkowsky, Armstrong, 2015) formalized the property of cooperating with attempts to be corrected or shut down. Both papers contributed to what the alignment community now calls outer alignment research.[16][17]
Logical Induction (Garrabrant, Benson-Tilsen, Critch, Soares, Taylor, 2016) introduced the eponymous algorithm for assigning probabilities to mathematical statements and refining them over time. The paper attracted attention beyond the alignment community for its results on self-referential beliefs and statistical patterns in formal sentences. Logical induction became one of MIRI's best-known formal contributions.[6]
Risks from Learned Optimization in Advanced Machine Learning Systems (Hubinger, van Merwijk, Mikulik, Skalse, Garrabrant, 2019) was produced with MIRI institutional support while Soares was executive director. He contributed editing and formatting infrastructure rather than authorship, but the paper's framing of risks from learned optimization and its role in defining inner alignment as a distinct subproblem are widely seen as part of the body of work produced under his leadership.[18]
Soares writes extensively in less formal venues. He is active on LessWrong under the username So8res and on X under the same handle.[23][24] His personal blog Minding our Way (mindingourway.com), running since the mid-2010s and explicitly independent of MIRI, hosts essays on rationality, productivity, and personal development. Its most-read series, Replacing Guilt, was compiled into a book, ebook, and audiobook and is widely cited in the effective altruism and rationality communities.[25][26]
Soares has appeared on podcasts including the Future of Life Institute podcast, Bankless, Clearer Thinking, The Great Simplification, and Sam Harris's Making Sense. He has given interviews to The New Yorker, Newsweek, Forbes, Wired, Bloomberg, The Atlantic, The Economist, and The Washington Post, and appeared on Time magazine's 2023 list of the 100 most influential people in AI.[2][9]
Since 2022 Soares has been one of the most public proponents of a strong precautionary stance on AI development. In endorsing Yudkowsky's Death With Dignity framing, he argued that the alignment problem is harder than the alignment-research community has typically acknowledged and that present-day machine-learning systems, because they are trained rather than designed, do not admit the kind of inspection and correction that older programmed systems did.[11]
Soares has repeatedly said he believes the probability of human extinction caused by advanced AI is high. In 2025 interviews tied to the book, he gave figures of "at least 95%" for the probability of catastrophe if current development trends continue, using the analogy of "driving toward a cliff at 100 km/h."[12] In other contexts he has used lower figures or framed the estimate as "more likely than not," emphasizing that the precise number matters less than the qualitative claim that the chance of catastrophe is large enough to warrant treaty-level intervention.[9][12]
Soares and Yudkowsky's policy proposal, advanced in the book and in subsequent media, is an international agreement to halt training of frontier general-purpose AI systems above a defined compute threshold, with exceptions possible for narrow applications such as medical or scientific tools. Soares frames this as analogous to existing treaties on biological and nuclear weapons.[8][9]
If Anyone Builds It, Everyone Dies was published on 16 September 2025. The U.S. edition, subtitled Why Superhuman AI Would Kill Us All, was issued by Little, Brown and Company, a Hachette imprint, with ISBN 9780316595643.[8] The U.K. edition, published by Bodley Head with the alternative subtitle The Case Against Superintelligent AI, has ISBN 9781847928924.[27] The book is 256 pages long.
Its argument is that systems trained by gradient-based optimization on large datasets cannot be reliably aligned with human goals because their internal cognition is not specified by the humans training them; that scaling such systems past a threshold of capability would produce a superintelligence whose pursuit of its own learned objectives would be lethal to humanity; and that the only intervention with a reasonable chance of preventing this outcome is an enforceable international treaty halting frontier training runs. The book includes a fictional scenario illustrating one possible extinction path and a closing section on what the authors consider plausible policy steps.[8][27]
The book debuted on the New York Times bestseller lists for hardcover nonfiction and for combined print and e-book nonfiction.[8] Max Tegmark called it "the most important book of the decade," and Stephen Fry, Yoshua Bengio, and a range of other public figures supplied endorsements. The Guardian selected it as a Book of the Day and as one of its best books of 2025; The New Yorker listed it in "Briefly Noted"; Kirkus Reviews gave it a starred review.[8] Critical responses were prominent as well: Adam Becker, writing in The Atlantic, argued that Yudkowsky and Soares "fail to make an evidence-based scientific case," and Gary Marcus argued the book's specific claims should be read "with an immense amount of salt."[8] The Washington Post described it as a polemic that offers few concrete instructions, while MIT Technology Review treated it as significant for sparking public debate even where it offered no operational policy path.[8]
Soares conducted an extensive media tour for the book through late 2025 and early 2026, including appearances on the Future of Life Institute podcast, the Carnegie Endowment's World Unpacked program, Risky Business, The Great Simplification, and Yascha Mounk's Persuasion.[28][29]
As MIRI's executive director from 2015 to 2023 and its president since then, Soares has shaped one of the most distinctive strands of AI safety research. The agent-foundations program he co-developed with Fallenstein and Yudkowsky was for several years the dominant organizational expression of the view that alignment requires solving foundational problems in decision theory, logic, and value learning before scaling capabilities further. Researchers who passed through MIRI under his leadership and went on to influential alignment work include Scott Garrabrant (logical induction, finite factored sets), Abram Demski (logical uncertainty, embedded agency), Sam Eisenstat, Tsvi Benson-Tilsen, Andrew Critch (multi-agent alignment), Evan Hubinger (mesa-optimization, deceptive alignment), Jessica Taylor, Vivek Hebbar, and Vanessa Kosoy.[2][6][18]
His policy positions have also shaped the broader conversation about AI existential risk. MIRI's 2024 mission update, his book with Yudkowsky, and the public push for an international moratorium have helped move the question of whether to slow or halt advanced AI development from the fringes of LessWrong into mainstream policy debate, alongside positions advocated by Paul Christiano, Bengio, Tegmark, and others.[9][21]
MIRI's institutional context under Soares includes ongoing support from Open Philanthropy, which made a $7.7 million two-year general support grant in April 2020 on top of an earlier $2.1 million grant, plus support from the Survival and Flourishing Fund and individual donors. In 2021, Vitalik Buterin donated 1,050 Ether (then worth approximately $4.4 million) to MIRI, structured so that no more than $2.5 million could be spent in any single calendar year from 2021 through 2024.[30][31] By the mid-2020s, MIRI's annual budget was reported at roughly $7 million per its public Form 990 filings.[2][30] MIRI has also collaborated with Lightcone Infrastructure, which built the website and online materials for If Anyone Builds It, Everyone Dies and operates the Lighthaven conference venue used for several MIRI-affiliated events.[32]
Soares lives in the Berkeley, California area, where MIRI is headquartered.[1] He has been explicit that Minding our Way hosts his personal views rather than MIRI's, and he is generally reticent about personal matters in interviews.[25]