# Nate Soares

> Source: https://aiwiki.ai/wiki/nate_soares
> Updated: 2026-04-28
> Categories: AI Safety, People
> From AI Wiki (https://aiwiki.ai), a free encyclopedia of artificial intelligence. Quote with attribution.

**Nate Soares** (sometimes credited as **Nathan Soares**) is an American computer scientist and [AI safety](ai_safety) researcher who serves as president of the [Machine Intelligence Research Institute](miri) (MIRI), a nonprofit research organization in Berkeley, California, focused on the technical and policy challenges of [AI alignment](ai_alignment) and [existential risk from advanced AI](ai_existential_risk).[1][2] Born in 1989, Soares trained as a software engineer and worked at [Microsoft](microsoft) and [Google](google) before joining MIRI as a research fellow in 2014. He became MIRI's executive director in May 2015, succeeding Luke Muehlhauser, and transitioned to the title of president in 2023 when long-time chief operating officer Malo Bourgon assumed the role of CEO.[3][4]

Soares is the author of a substantial body of technical writing on [decision theory](decision_theory), value learning, and the agent-foundations approach to alignment, including the 2014 paper that introduced the term "AI alignment" into wider research use, the 2016 paper *Logical Induction* (with Scott Garrabrant and others), and the 2017 paper *Functional Decision Theory* (with [Eliezer Yudkowsky](eliezer_yudkowsky)).[5][6][7] In 2025 he co-authored *If Anyone Builds It, Everyone Dies: Why Superhuman AI Would Kill Us All* with Yudkowsky, which became a *New York Times* bestseller and brought MIRI's pessimistic view of [superintelligence](superintelligence) to a mainstream audience.[8][9]

Under Soares's leadership, MIRI has shifted strategic focus several times: from foundational mathematical research in the mid-2010s, to a partly nondisclosed research program from 2017 onward, and then, from 2022, to a public communications and policy stance organized around an international moratorium on training larger AI systems.[10][11] Soares argues that current methods for building general-purpose AI are extremely unlikely to produce systems humans can control, and he has put the probability of AI-caused catastrophe as high as "at least 95%" if current development trends continue.[9][12]

## Background and early career

Soares was born in 1989 and completed a Bachelor of Science in computer science and economics at George Washington University in 2011.[2] During and immediately after his undergraduate studies he held technical positions in government and government-adjacent work: a research-associate role at the National Institute of Standards and Technology, where he wrote software to help architects upgrade building designs to meet stricter energy standards, and a contracting role for the U.S. Department of Defense, building software tools used by the National Defense University.[2]

He then moved into the technology industry. At [Microsoft](microsoft) he was a software development engineer on Office 2013, working on user-interface details including the status bar, message bar visuals, high-contrast and high-DPI display modes, icon and media rendering, and parts of the ribbon interface.[2] He subsequently joined [Google](google) as a software engineer on Google Compute Engine and on tooling that integrated the Vim text editor with Google's internal development environment.[2]

While at Google, Soares grew increasingly interested in the technical and philosophical challenges of building general-purpose AI. He attended his first MIRI workshop in December 2013, was offered a position shortly afterward, and left Google in 2014 to join MIRI full time.[3][4]

## Joining MIRI

Soares joined MIRI as a research fellow in 2014. The institute, then under executive director Luke Muehlhauser, was formalizing its "agent foundations" research agenda: a set of mathematical and decision-theoretic problems that MIRI argued any sufficiently capable AI system would have to solve in order to behave reliably.[5][13]

In his first year, Soares authored or co-authored about a dozen papers, six of them written specifically for MIRI's technical research agenda.[3] He also produced the institute's research guide for prospective alignment researchers, attended five external conferences, and presented at three of them, including a workshop at Purdue University in September 2014 and the Future of Life Institute's Puerto Rico conference in January 2015.[14] Two of his early collaborations with Benja Fallenstein, *Aligning Superintelligence with Human Interests: A Technical Research Agenda* (2014), which helped popularize the term "AI alignment," and *Toward Idealized Decision Theory* (2014), set the framing for years of subsequent MIRI work.[5][15]

## Leadership of MIRI (2015 to present)

### Becoming executive director

MIRI announced in May 2015 that Soares would succeed Muehlhauser as executive director, with the transition taking effect on 31 May 2015.[3][4] Muehlhauser, who had run the institute since 2011, left to take a research position at GiveWell. Soares said his immediate priority was expanding the research team and announced that Jessica Taylor would join MIRI full time in August 2015.[3] He had previously co-founded startups, run a F.I.R.S.T. Robotics team, and served as president of an Entrepreneur's Club, but he described the MIRI role as his first formal professional leadership position.[4]

### 2015 to 2018: foundational research

During Soares's first three years as executive director, MIRI concentrated on the foundational mathematical agenda set out in its 2014 and 2015 technical reports. Major outputs included the 2015 AAAI workshop paper *Corrigibility*, co-authored with Fallenstein, Yudkowsky, and Stuart Armstrong, which formalized the problem of building AI systems that cooperate with attempts to correct or shut them down, and the 2016 *Logical Induction* paper, with Scott Garrabrant, Tsvi Benson-Tilsen, Andrew Critch, and Jessica Taylor, which proposed a computable algorithm for assigning probabilities to logical sentences.[6][16]

In 2017 Soares and Yudkowsky published *Functional Decision Theory: A New Theory of Instrumental Rationality*, which formally described a decision theory that treats an agent's choice as the output of a fixed mathematical function rather than as a one-off causal event. They argued that [functional decision theory](functional_decision_theory) outperforms causal and evidential decision theory on a range of problems involving prediction, commitment, and self-similar agents.[7] Soares also published *The Value Learning Problem* (2015, revised 2018), surveying approaches to building AI systems that learn human values rather than having those values specified directly.[17]

### 2017 to 2022: nondisclosed research and reduced public output

In April 2017 MIRI published a strategy update announcing a shift toward exploratory directions that it did not plan to share publicly until late 2017 at the earliest.[13] By November 2018, the institute had formalized this stance: most internal research would be "nondisclosed by default," with outputs released only when there was a specific reason to publish.[10] Soares argued that some MIRI research was sufficiently close to capabilities work that the marginal harm of accelerating timelines outweighed the marginal benefit of public scrutiny.[10] In December 2020 Soares wrote in MIRI's annual update that the institute's 2017 research push "has, at this point, largely failed, in the sense that neither Eliezer nor I have sufficient hope in it for us to continue focusing our main efforts there."[10] During this period MIRI continued to publish smaller technical pieces and hosted the *Risks from Learned Optimization in Advanced Machine Learning Systems* paper (Hubinger, van Merwijk, Mikulik, Skalse, Garrabrant, 2019), which introduced [mesa-optimization](mesa_optimization) into the [inner alignment](inner_alignment) literature.[18]

### 2022 to present: communications and policy

In April 2022 Yudkowsky published the [LessWrong](lesswrong) post "MIRI announces new 'Death With Dignity' strategy," which framed the institute's near-term work as raising the probability that humanity would "die with slightly more dignity" given his low estimated odds of solving alignment in time. Soares endorsed the post's underlying analysis in later commentary, arguing that the most useful alignment work might involve waiting for human cognitive enhancement to make the problems tractable.[11][19]

A more concrete shift was announced in October 2023, when MIRI made public a leadership reshuffle that had been piloted in February 2023 and finalized in June 2023. Bourgon, the long-serving COO, became CEO; Soares moved from executive director to president; Yudkowsky became chair of the board; Alex Vermeer became COO; Jimmy Rintjema became CFO; and Lisa Thiergart and Gretta Duleba joined as research manager and communications manager, respectively.[20] The post stated that "policy and communications will be a higher priority for MIRI than research going forward."[20]

The January 2024 *MIRI 2024 Mission and Strategy Update* organized this position around three priorities: international policy work aimed at halting development of frontier general-purpose AI until alignment is better understood, communications aimed at making the institute's analysis legible to non-technical audiences, and a reduced but ongoing technical research program supporting the first two priorities.[21] In 2025 the institute's communications work culminated in Soares and Yudkowsky's book *If Anyone Builds It, Everyone Dies*.[8][9]

## Research contributions

Soares's published research falls into several overlapping clusters.

### Agent foundations and research agenda

The 2014 paper *Aligning Superintelligence with Human Interests: A Technical Research Agenda*, co-written with Fallenstein, set out the agent-foundations program that defined MIRI's direction for most of the 2010s. Its companion *Annotated Bibliography* (2014) organized then-existing AI safety literature, and the 2015 update *Agent Foundations for Aligning Machine Intelligence with Human Interests* re-framed the same broad program.[5][22]

### Decision theory

*Toward Idealized Decision Theory* (Soares and Fallenstein, 2014) argued that existing causal and evidential decision theories failed in environments where an agent's choices were predictable in advance or where its decision algorithm was directly observable. *Functional Decision Theory* (Yudkowsky and Soares, 2017) supplied a positive proposal: model an agent's decision as the output of a function whose value is computed by anyone who reasons about the agent, rather than as a one-off causal event.[7][15] Soares also co-wrote *Cheating Death in Damascus* (Levinstein and Soares, 2020), which extended functional decision theory to cases involving adversarial prediction.[2]
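The contrast between the two styles of evaluation can be illustrated with Newcomb's problem, a standard example in this literature. The sketch below is a simplified illustration and not the formalism of either paper: the payoff amounts, the predictor accuracy, and all function names are assumptions chosen for the example. The causal-style evaluation holds the contents of the opaque box fixed regardless of the action, while the functional-style evaluation treats the predictor's forecast as depending on the output of the same decision function.

```python
# Toy Newcomb's problem. All names and numbers here are illustrative assumptions.
BIG_PRIZE = 1_000_000   # opaque box: filled only if one-boxing was predicted
SMALL_PRIZE = 1_000     # transparent box: always contains this amount
ACTIONS = ("one-box", "two-box")


def payoff(action: str, opaque_box_full: bool) -> int:
    """Money received for an action, given the contents of the opaque box."""
    total = BIG_PRIZE if opaque_box_full else 0
    if action == "two-box":
        total += SMALL_PRIZE
    return total


def cdt_expected_value(action: str, prob_box_full: float) -> float:
    """Causal-style evaluation: the box contents are already fixed, so the
    probability that the box is full does not depend on the action."""
    return (prob_box_full * payoff(action, True)
            + (1 - prob_box_full) * payoff(action, False))


def fdt_expected_value(action: str, predictor_accuracy: float) -> float:
    """Functional-style evaluation (simplified): the prediction tracks the
    output of the agent's decision function, so choosing an action also
    determines what the predictor most likely foresaw."""
    if action == "one-box":
        prob_box_full = predictor_accuracy
    else:
        prob_box_full = 1 - predictor_accuracy
    return (prob_box_full * payoff(action, True)
            + (1 - prob_box_full) * payoff(action, False))


def best_action(evaluate, parameter: float) -> str:
    """Return the action with the highest expected value under `evaluate`."""
    return max(ACTIONS, key=lambda action: evaluate(action, parameter))


if __name__ == "__main__":
    # Under the causal-style evaluation, two-boxing gains SMALL_PRIZE
    # whatever belief about the box contents is plugged in.
    for belief in (0.0, 0.5, 1.0):
        print("CDT-style, belief", belief, "->", best_action(cdt_expected_value, belief))
    # Under the functional-style evaluation with an accurate predictor,
    # one-boxing has the higher expected value.
    print("FDT-style, accuracy 0.99 ->", best_action(fdt_expected_value, 0.99))
```

In this toy setup the causal-style agent two-boxes for every belief about the box contents, because the extra $1,000 is added on top of a quantity it treats as fixed. The functional-style agent one-boxes once the assumed predictor accuracy is high enough for the large prize to outweigh the small one.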

### Value learning and corrigibility

*The Value Learning Problem* (Soares, 2015) examined the difficulty of building AI systems that learn their operators' values rather than having those values specified directly. *Corrigibility* (Soares, Fallenstein, Yudkowsky, Armstrong, 2015) formalized the property of cooperating with attempts to be corrected or shut down. Both papers contributed to what the alignment community now calls [outer alignment](outer_alignment) research.[16][17]

### Logical uncertainty

*Logical Induction* (Garrabrant, Benson-Tilsen, Critch, Soares, Taylor, 2016) introduced the eponymous algorithm for assigning probabilities to mathematical statements and refining them over time. The paper attracted attention beyond the alignment community for its results on self-referential beliefs and statistical patterns in formal sentences. [Logical induction](logical_induction) became one of MIRI's best-known formal contributions.[6]

### Mesa-optimization (organizational role)

*Risks from Learned Optimization in Advanced Machine Learning Systems* (Hubinger, van Merwijk, Mikulik, Skalse, Garrabrant, 2019) was produced with MIRI institutional support while Soares was executive director. He contributed editing and formatting infrastructure rather than authorship, but the paper's framing of [risks from learned optimization](risks_from_learned_optimization) and its role in defining [inner alignment](inner_alignment) as a distinct subproblem are widely seen as part of the body of work produced under his leadership.[18]

## Public communications

Soares writes extensively in less formal venues. He is active on [LessWrong](lesswrong) under the username `So8res` and on X under the same handle.[23][24] His personal blog *Minding our Way* (mindingourway.com), running since the mid-2010s and explicitly independent of MIRI, hosts essays on rationality, productivity, and personal development. Its most-read series, *Replacing Guilt*, was compiled into a book, ebook, and audiobook and is widely cited in the [effective altruism](effective_altruism) and rationality communities.[25][26]

Soares has appeared on podcasts including the [Future of Life Institute](future_of_life_institute) podcast, [Bankless](bankless), *Clearer Thinking*, *The Great Simplification*, and Sam Harris's *Making Sense*. He has given interviews to *The New Yorker*, *Newsweek*, *Forbes*, *Wired*, *Bloomberg*, *The Atlantic*, *The Economist*, and *The Washington Post*, and appeared on *Time* magazine's 2023 list of the 100 most influential people in AI.[2][9]

## Death with Dignity and the AI moratorium argument

Since 2022 Soares has been one of the most public proponents of a strong precautionary stance on AI development. In endorsing Yudkowsky's *Death With Dignity* framing, he argued that the alignment problem is harder than the alignment-research community has typically acknowledged and that present-day machine-learning systems, because they are trained rather than designed, do not admit the kind of inspection and correction that older programmed systems did.[11]

Soares has repeatedly said he believes the probability of human extinction caused by advanced AI is high. In 2025 interviews tied to the book, he gave figures of "at least 95%" for the probability of catastrophe if current development trends continue, using the analogy of "driving toward a cliff at 100 km/h."[12] In other contexts he has used lower figures or framed the estimate as "more likely than not," emphasizing that the precise number matters less than the qualitative claim that the chance of catastrophe is large enough to warrant treaty-level intervention.[9][12]

Soares and Yudkowsky's policy proposal, advanced in the book and in subsequent media, is an international agreement to halt training of frontier general-purpose AI systems above a defined compute threshold, with exceptions possible for narrow applications such as medical or scientific tools. Soares frames this as analogous to existing treaties on biological and nuclear weapons.[8][9]

## *If Anyone Builds It, Everyone Dies* (2025)

*If Anyone Builds It, Everyone Dies* was published on 16 September 2025. The U.S. edition, subtitled *Why Superhuman AI Would Kill Us All*, was issued by Little, Brown and Company, a Hachette imprint, with ISBN 9780316595643.[8] The U.K. edition, published by Bodley Head with the alternative subtitle *The Case Against Superintelligent AI*, has ISBN 9781847928924.[27] The book is 256 pages long.

Its argument is that systems trained by gradient-based optimization on large datasets cannot be reliably aligned with human goals because their internal cognition is not specified by the humans training them; that scaling such systems past a threshold of capability would produce a [superintelligence](superintelligence) whose pursuit of its own learned objectives would be lethal to humanity; and that the only intervention with a reasonable chance of preventing this outcome is an enforceable international treaty halting frontier training runs. The book includes a fictional scenario illustrating one possible extinction path and a closing section on what the authors consider plausible policy steps.[8][27]

The book debuted on the *New York Times* bestseller lists for hardcover nonfiction and for combined print and e-book nonfiction.[8] Max Tegmark called it "the most important book of the decade," and Stephen Fry, Yoshua Bengio, and a wide mix of public figures supplied endorsements. *The Guardian* selected it as a Book of the Day and as one of its best books of 2025; *The New Yorker* listed it in "Briefly Noted"; *Kirkus Reviews* gave it a starred review.[8] Critical responses were prominent as well: Adam Becker, writing in *The Atlantic*, argued that Yudkowsky and Soares "fail to make an evidence-based scientific case," and Gary Marcus argued the book's specific claims should be read "with an immense amount of salt."[8] *The Washington Post* described it as a polemic that offers few concrete instructions, while MIT Technology Review treated it as significant for sparking public debate even where it offered no operational policy path.[8]

Soares conducted an extensive media tour for the book through late 2025 and early 2026, including appearances on the [Future of Life Institute](future_of_life_institute) podcast, the Carnegie Endowment's *World Unpacked* program, *Risky Business*, *The Great Simplification*, and Yascha Mounk's *Persuasion*.[28][29]

## Influence on AI safety

As MIRI's executive director from 2015 to 2023 and as president since then, Soares has shaped one of the most distinctive strands of [AI safety](ai_safety) research. The agent-foundations program he co-developed with Fallenstein and Yudkowsky was for several years the dominant organizational expression of the view that alignment requires solving foundational problems in decision theory, logic, and value learning before scaling capabilities further. Researchers who passed through MIRI under his leadership and went on to influential alignment work include [Scott Garrabrant](scott_garrabrant) (logical induction, finite factored sets), [Abram Demski](abram_demski) (logical uncertainty, embedded agency), Sam Eisenstat, Tsvi Benson-Tilsen, [Andrew Critch](andrew_critch) (multi-agent alignment), [Evan Hubinger](evan_hubinger) (mesa-optimization, deceptive alignment), Jessica Taylor, Vivek Hebbar, and Vanessa Kosoy.[2][6][18]

His policy positions have also shaped the broader conversation about [AI existential risk](ai_existential_risk). MIRI's 2024 mission update, his book with Yudkowsky, and the public push for an international moratorium have helped move the question of whether to slow or halt advanced AI development from the fringes of [LessWrong](lesswrong) into mainstream policy debate, alongside positions advocated by [Paul Christiano](paul_christiano), Bengio, Tegmark, and others.[9][21]

MIRI's institutional context under Soares includes ongoing support from [Open Philanthropy](open_philanthropy), which made a $7.7 million two-year general support grant in April 2020 on top of an earlier $2.1 million grant, plus support from the Survival and Flourishing Fund and individual donors. In May 2021 MIRI announced two large cryptocurrency donations: an anonymous gift of MakerDAO (MKR) tokens worth roughly $15.6 million, restricted so that no more than $2.5 million could be spent in any single calendar year from 2021 through 2024, and 1,050 Ether (then worth approximately $4.4 million) from Vitalik Buterin.[30][31] By the mid-2020s, MIRI's annual budget was reported at roughly $7 million per its public Form 990 filings.[2][30] MIRI has also collaborated with [Lightcone Infrastructure](lightcone_infrastructure), which built the website and online materials for *If Anyone Builds It, Everyone Dies* and operates the Lighthaven conference venue used for several MIRI-affiliated events.[32]

## Personal life

Soares lives in the Berkeley, California area, where MIRI is headquartered.[1] He has been explicit that *Minding our Way* hosts his personal views rather than MIRI's, and he is generally reticent about personal matters in interviews.[25]

## See also

- [Eliezer Yudkowsky](eliezer_yudkowsky)
- [Machine Intelligence Research Institute](miri)
- [AI alignment](ai_alignment)
- [AI safety](ai_safety)
- [AI existential risk](ai_existential_risk)
- [Superintelligence](superintelligence)
- [Decision theory](decision_theory)
- [Functional decision theory](functional_decision_theory)
- [Logical induction](logical_induction)
- [Mesa-optimization](mesa_optimization)
- [Inner alignment](inner_alignment)
- [Risks from learned optimization](risks_from_learned_optimization)
- [Friendly AI](friendly_ai)
- [LessWrong](lesswrong)
- [Effective altruism](effective_altruism)
- [Open Philanthropy](open_philanthropy)
- [Lightcone Infrastructure](lightcone_infrastructure)
- [Future of Life Institute](future_of_life_institute)
- [Centre for Effective Altruism](centre_for_effective_altruism)
- [80,000 Hours](80000_hours)
- [Paul Christiano](paul_christiano)
- [Scott Garrabrant](scott_garrabrant)
- [Abram Demski](abram_demski)
- [Evan Hubinger](evan_hubinger)
- [Andrew Critch](andrew_critch)

## References

1. "Team." Machine Intelligence Research Institute. https://intelligence.org/team/
2. "Nate Soares." Wikipedia. https://en.wikipedia.org/wiki/Nate_Soares
3. Soares, Nate. "Introductions." Machine Intelligence Research Institute, 31 May 2015. https://intelligence.org/2015/05/31/introductions/
4. Soares, Nate. "Taking the reins at MIRI." LessWrong, 2015. https://www.lesswrong.com/posts/KYdtcz6N8qRDzwwKN/taking-the-reins-at-miri
5. Soares, Nate, and Benja Fallenstein. *Aligning Superintelligence with Human Interests: A Technical Research Agenda*. MIRI Technical Report 2014-8, 2014. https://intelligence.org/files/TechnicalAgenda.pdf
6. Garrabrant, Scott, Tsvi Benson-Tilsen, Andrew Critch, Nate Soares, and Jessica Taylor. "Logical Induction." arXiv:1609.03543, September 2016. https://arxiv.org/abs/1609.03543
7. Yudkowsky, Eliezer, and Nate Soares. "Functional Decision Theory: A New Theory of Instrumental Rationality." arXiv:1710.05060, October 2017. https://arxiv.org/abs/1710.05060
8. "If Anyone Builds It, Everyone Dies." Wikipedia. https://en.wikipedia.org/wiki/If_Anyone_Builds_It,_Everyone_Dies
9. Yudkowsky, Eliezer, and Nate Soares. *If Anyone Builds It, Everyone Dies: Why Superhuman AI Would Kill Us All*. Little, Brown and Company, 16 September 2025. ISBN 9780316595643. https://www.hachettebookgroup.com/titles/eliezer-yudkowsky/if-anyone-builds-it-everyone-dies/9780316595643/
10. "2018 Update: Our New Research Directions." Machine Intelligence Research Institute, 22 November 2018. https://intelligence.org/2018/11/22/2018-update-our-new-research-directions/
11. Yudkowsky, Eliezer. "MIRI announces new 'Death With Dignity' strategy." LessWrong, 1 April 2022. https://www.lesswrong.com/posts/j9Q8bRmwCgXRYAgcJ/miri-announces-new-death-with-dignity-strategy
12. "AI poses 95% risk of human extinction, expert says." PanARMENIAN.Net, 2025. https://panarmenian.net/eng/news/325063/
13. "2017 Updates and Strategy." Machine Intelligence Research Institute, 30 April 2017. https://intelligence.org/2017/04/30/2017-updates-and-strategy/
14. "Nate Soares speaking at Purdue University." Machine Intelligence Research Institute, 12 September 2014. https://intelligence.org/2014/09/12/nate-soares-speaking-purdue-september-18th/
15. Soares, Nate, and Benja Fallenstein. "Toward Idealized Decision Theory." December 2014 (posted to arXiv as arXiv:1507.01986 in July 2015). https://arxiv.org/abs/1507.01986
16. Soares, Nate, Benja Fallenstein, Eliezer Yudkowsky, and Stuart Armstrong. "Corrigibility." AAAI Workshops, 2015. https://intelligence.org/files/Corrigibility.pdf
17. Soares, Nate. "The Value Learning Problem." MIRI, 2015 (revised 2018). https://intelligence.org/files/ValueLearningProblem.pdf
18. Hubinger, Evan, Chris van Merwijk, Vladimir Mikulik, Joar Skalse, and Scott Garrabrant. "Risks from Learned Optimization in Advanced Machine Learning Systems." arXiv:1906.01820, 2019. https://arxiv.org/abs/1906.01820
19. Soares, Nate. "On how various plans miss the hard bits of the alignment challenge." Effective Altruism Forum (also posted to LessWrong and the Alignment Forum), July 2022. https://forum.effectivealtruism.org/posts/jydymb23NWF3Q4oDt/on-how-various-plans-miss-the-hard-bits-of-the-alignment
20. "Announcing MIRI's new CEO and leadership team." LessWrong, 10 October 2023. https://www.lesswrong.com/posts/NjtHt55nFbw3gehzY/announcing-miri-s-new-ceo-and-leadership-team
21. "MIRI 2024 Mission and Strategy Update." Machine Intelligence Research Institute, 4 January 2024. https://intelligence.org/2024/01/04/miri-2024-mission-and-strategy-update/
22. Soares, Nate. "Aligning Superintelligence with Human Interests: An Annotated Bibliography." Machine Intelligence Research Institute, 2014. https://intelligence.org/files/AnnotatedBibliography.pdf
23. "So8res - LessWrong user." LessWrong. https://www.lesswrong.com/users/so8res
24. Soares, Nate. "@So8res." X (formerly Twitter). https://x.com/So8res
25. Soares, Nate. "About Me." Minding our Way. https://mindingourway.com/about/
26. Soares, Nate. *Replacing Guilt: Minding Our Way*. https://www.goodreads.com/book/show/52990973-replacing-guilt
27. Yudkowsky, Eliezer, and Nate Soares. *If Anyone Builds It, Everyone Dies: The Case Against Superintelligent AI*. Bodley Head, 2025. ISBN 9781847928924. https://www.penguin.co.uk/books/474267/if-anyone-builds-it-everyone-dies-by-soares-eliezer-yudkowsky-and-nate/9781847928924
28. "Why Building Superintelligence Means Human Extinction (with Nate Soares)." Future of Life Institute Podcast, October 2025. https://futureoflife.org/podcast/why-building-superintelligence-means-human-extinction-with-nate-soares/
29. "Will AI Kill us All? Nate Soares on His Controversial Bestseller." Carnegie Endowment for International Peace, *The World Unpacked*, 2025. https://carnegieendowment.org/podcasts/the-world-unpacked/will-ai-kill-us-all-nate-soares-on-his-controversial-bestseller
30. "Our all-time largest donation, and major crypto support from Vitalik Buterin." Machine Intelligence Research Institute, 13 May 2021. https://intelligence.org/2021/05/13/two-major-donations/
31. "Machine Intelligence Research Institute." Wikipedia. https://en.wikipedia.org/wiki/Machine_Intelligence_Research_Institute
32. "Lightcone Infrastructure." Manifund, 2025. https://manifund.com/projects/lightcone-infrastructure

