| Machine Intelligence Research Institute | |
|---|---|
| Type | 501(c)(3) nonprofit research institute |
| Founded | July 27, 2000 |
| Founders | Eliezer Yudkowsky, Brian Atkins, Sabine Atkins |
| Headquarters | Berkeley, California, United States |
| Key people | Malo Bourgon (CEO), Nate Soares (President), Eliezer Yudkowsky (Chair) |
| Focus | Existential risk from advanced artificial intelligence; AI alignment |
| Annual budget | Approximately $7 million (2025); $8 million projected (2026) |
The Machine Intelligence Research Institute (MIRI), formerly known as the Singularity Institute for Artificial Intelligence (SIAI), is a 501(c)(3) nonprofit research institute based in Berkeley, California. MIRI focuses on identifying and mitigating potential existential risks posed by advanced artificial intelligence, particularly the development of artificial general intelligence (AGI) and artificial superintelligence (ASI). Founded in 2000 by Eliezer Yudkowsky, Brian Atkins, and Sabine Atkins, the organization has played a significant role in establishing the field of AI alignment as a recognized area of research.
MIRI's core position is that the default outcome of building smarter-than-human AI systems is human extinction, and that solving the alignment problem before such systems are developed is essential for survival. Over its more than two decades of operation, MIRI has shifted its approach from attempting to build Friendly AI, to conducting foundational mathematical research on alignment, and most recently to focusing on policy advocacy, public communications, and technical governance.
The Singularity Institute for Artificial Intelligence, Inc. (SIAI) was incorporated on July 27, 2000, in the state of Georgia by Brian Atkins, Sabine Atkins (then Sabine Stoeckel), and Eliezer Yudkowsky. Brian and Sabine Atkins provided the initial funding. The organization was originally conceived with the purpose of accelerating the development of artificial intelligence, reflecting the techno-optimist spirit of the early singularitarian movement.
In 2001, Yudkowsky published "Creating Friendly AI 1.0: The Analysis and Design of Benevolent Goal Architectures," a book-length document that presented the first technical analysis of how to design AI systems with stable, human-compatible goal structures. This work introduced the concept of "Friendly AI," a term Yudkowsky coined to describe superintelligent AI systems that would reliably act in accordance with human values.
By the early 2000s, Yudkowsky began to grow increasingly concerned about the difficulty of the alignment problem. He recognized that building a superintelligent AI without first solving how to make it reliably beneficial could pose catastrophic risks to humanity. This concern led to a fundamental shift in the organization's mission. In 2004, Yudkowsky published "Coherent Extrapolated Volition" (CEV), a theoretical framework proposing that a superintelligent AI should act on what humanity would collectively want if people "knew more, thought faster, were more the people we wished we were, had grown up farther together." Though Yudkowsky himself later noted the concept's limitations, it represented an early attempt at formally specifying how advanced AI might be directed toward broadly human-compatible goals.
In 2005, the institute relocated from Atlanta, Georgia, to Silicon Valley and formally reoriented its mission away from building AI and toward studying the risks that advanced AI might pose. At the time, these concerns were largely dismissed by the mainstream AI research community.
The institute entered a period of public-facing activity beginning in 2006 with the launch of the Singularity Summit, an annual conference co-founded by MIRI, Ray Kurzweil, and Peter Thiel. The inaugural summit was held at Stanford University and was described by the San Francisco Chronicle as a "Bay Area coming-out party for the tech-inspired philosophy called transhumanism."
The Singularity Summit grew into a prominent annual event. About 25 speakers presented each year over two days on topics including artificial intelligence, brain-computer interfaces, robotics, regenerative medicine, and broader questions about the trajectory of human civilization. The conference regularly attracted over 800 scientists, entrepreneurs, academics, and other attendees. Subsequent summits were held in the San Francisco Bay Area and New York City, with spinoff events held in Melbourne, Australia, in 2010, 2011, and 2012. In 2010, the event received front-page coverage in TIME magazine.
| Year | Location |
|---|---|
| 2006 | Stanford University |
| 2007 | San Francisco |
| 2008 | San Jose |
| 2009 | New York City |
| 2010 | San Francisco |
| 2011 | New York City |
| 2012 | San Francisco |
During this period, the institute also played a central role in fostering the online rationalist community. In 2009, the organization began hosting LessWrong, a community blog and forum devoted to rationality, cognitive biases, and existential risk. LessWrong served as an intellectual hub for many of the ideas associated with the AI safety movement and drew a dedicated following. The forum operated under MIRI's umbrella until approximately 2017, when it was reorganized as an independent project.
In 2011, Luke Muehlhauser was promoted from researcher to Executive Director. His leadership is widely credited with professionalizing the organization after what some described as a decade of relatively unstructured operations. Under Muehlhauser, MIRI improved its research output, organizational transparency, and donor relations.
In December 2012, the institute sold the Singularity Summit name, web domain, and conference operations to Singularity University.
In January 2013, the organization adopted its current name: the Machine Intelligence Research Institute. The rebrand signaled a sharpened focus on technical alignment research rather than the broader futurist themes associated with the Singularity Institute.
In May 2015, Nate Soares succeeded Luke Muehlhauser as Executive Director. Muehlhauser later joined the Open Philanthropy Project as a program director, where he went on to lead the organization's AI governance and policy work.
Under Soares's leadership, MIRI formalized its research agenda around what it called Agent Foundations, a program of foundational mathematical research aimed at understanding the theoretical principles underlying intelligent agents. The agenda was laid out in a 2017 paper by Soares and Benja Fallenstein titled "Agent Foundations for Aligning Machine Intelligence with Human Interests," published in The Technological Singularity: Managing the Journey (Springer).
The Agent Foundations agenda targeted several core problems:
| Research Area | Description |
|---|---|
| Decision theory | How should embedded agents make choices when their decisions may affect the environment they are reasoning about? MIRI explored alternatives to classical decision theory, including functional decision theory. |
| Logical uncertainty | How can bounded reasoners assign coherent probabilities to undecidable logical and mathematical statements? |
| Embedded agency | How should an agent reason about a world in which the agent itself is embedded as a physical process, rather than operating from outside the system? |
| Naturalized induction | How should agents update their beliefs when they cannot maintain a complete model of the environment, including themselves? |
| Corrigibility | How can AI systems be designed so they allow themselves to be corrected, shut down, or modified by their operators? A toy illustration of the difficulty follows this table. |
| Value alignment | How can an AI system's goals be reliably specified and maintained in a way that reflects human intentions? |
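To make the corrigibility problem concrete, the following sketch (a toy scenario with made-up numbers, not a MIRI model) shows why a straightforward expected-utility maximizer acquires an instrumental incentive to resist shutdown: allowing the shutdown button to work can only lower the task reward it is maximizing.

```python
# Toy illustration of the corrigibility problem (hypothetical scenario and
# numbers; not a MIRI model). An agent that maximizes only task reward
# prefers to disable its shutdown button, because a working button can only
# reduce the reward it expects to collect.

def expected_task_reward(disable_button: bool,
                         p_operator_presses: float = 0.5,
                         task_reward: float = 10.0) -> float:
    """Expected reward for an agent that is shut down iff the button works and is pressed."""
    if disable_button:
        return task_reward                         # shutdown is impossible
    return (1 - p_operator_presses) * task_reward  # shutdown forfeits the reward

print(expected_task_reward(disable_button=False))  # 5.0
print(expected_task_reward(disable_button=True))   # 10.0 -> incentive to resist correction
```

Corrigibility research asks how goals can be specified so that the second option does not dominate the first.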
The most widely recognized output of this period was the "Logical Induction" paper (2016), authored by Scott Garrabrant, Tsvi Benson-Tilsen, Andrew Critch, Nate Soares, and Jessica Taylor. The paper proposed an algorithm, called a logical inductor, that allows a bounded reasoner to assign probabilities to logical statements in a way that satisfies a broad range of desirable properties. It was published on arXiv (arXiv:1609.03543) and was favorably received by some reviewers as a genuine contribution to the foundations of reasoning under uncertainty.
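The full construction in the paper is involved; the sketch below is not the Garrabrant et al. algorithm, but a minimal illustration of one property logical inductors are proved to satisfy: on a family of logically determinate statements whose truth values look statistically random, such as "the n-th decimal digit of sqrt(2) is even," the probabilities assigned before each statement is resolved converge to the limiting frequency (about 0.5).

```python
# Not the logical-induction algorithm of Garrabrant et al.; a toy frequency
# estimator illustrating one property a logical inductor satisfies: its
# probabilities for "the n-th digit of sqrt(2) is even" converge to ~0.5,
# even though every such statement is a determinate mathematical fact.
from math import isqrt

N = 2000
digits = str(isqrt(2 * 10 ** (2 * (N - 1))))  # first N digits of sqrt(2): "14142..."

even_so_far = 0
for n, d in enumerate(digits, start=1):
    # Probability quoted *before* resolving statement n (Laplace-smoothed frequency).
    quoted = (even_so_far + 1) / (n + 1)
    if n in (1, 10, 100, 1000, 2000):
        print(f"n={n:5d}  quoted P(digit is even) = {quoted:.3f}")
    even_so_far += int(d) % 2 == 0  # statement n resolves; update for the next one
```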
Other notable publications from this period include "Corrigibility" (2015), "Functional Decision Theory" (2017), and the "Embedded Agency" write-up (2018); authors and summaries appear in the concepts table below.
MIRI also ran the MIRIx program during this period, providing small grants to independent groups of researchers and students who organized workshops on MIRI-relevant topics at universities and meetups around the world.
Starting around 2018, MIRI adopted a policy of making its research nondisclosed by default. Under this policy, researchers were not expected to publish their findings publicly unless a deliberate decision was made that the benefits of disclosure outweighed the risks. The rationale was rooted in concerns about information hazards: MIRI worried that certain insights relevant to alignment might also accelerate the development of dangerous AI capabilities if published openly.
This policy drew criticism from parts of the AI safety community and the broader machine learning research world. Critics argued that nondisclosure reduced accountability, made it harder for outside researchers to evaluate or build on MIRI's work, and created an insular organizational culture. Some observers compared it to the secrecy practices of other controversial organizations, noting that a lack of transparency made it difficult to assess whether MIRI's research was producing meaningful results.
Individual researchers at MIRI held varying views on the nondisclosure policy. Some negotiated exceptions that allowed their work to remain public by default. The policy nonetheless contributed to a perception that MIRI had become less engaged with the broader research community compared to its earlier years.
By 2020, MIRI's leadership had grown increasingly pessimistic about the feasibility of solving the alignment problem in time. Rapid advances in AI capabilities, particularly the release of GPT-3 in 2020 and GPT-4 in 2023, reinforced the view within MIRI that timelines to transformative AI were shorter than previously assumed.
In April 2022, Yudkowsky published a widely discussed essay titled "MIRI Announces New 'Death With Dignity' Strategy." Despite its provocative title, the piece argued not for giving up but for a shift in framing: rather than pursuing the goal of "humanity survives this century," Yudkowsky proposed that individuals should focus on actions that "increase the log-odds that humanity survives this century." The post revealed that MIRI's research team estimated humanity's probability of survival at below 5%. The essay generated substantial debate across the AI safety community and beyond.
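The "log-odds" framing refers to the logarithm of p/(1-p): small interventions add small increments in log-odds space even when the survival probability p itself is low. The sketch below uses hypothetical numbers (a 3% probability and a +0.1 increment) purely for illustration; they are not figures from the post.

```python
# Illustration of the log-odds framing (hypothetical numbers, not MIRI's).
import math

def log_odds(p: float) -> float:
    """Natural-log odds of a probability p."""
    return math.log(p / (1.0 - p))

def prob(log_odds_value: float) -> float:
    """Inverse transform: probability corresponding to given log-odds."""
    return 1.0 / (1.0 + math.exp(-log_odds_value))

p = 0.03                               # hypothetical survival probability
improved = prob(log_odds(p) + 0.1)     # hypothetical +0.1 log-odds improvement
print(f"{p:.3f} -> {improved:.3f}")    # 0.030 -> 0.033
```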
In March 2023, Yudkowsky published an op-ed in TIME magazine titled "The Only Way to Deal With the Threat From AI? Shut It Down," in which he called for an immediate moratorium on the training of AI systems more powerful than GPT-4. He argued that "if any company or group, anywhere on the planet, builds an artificial superintelligence using anything remotely like current techniques," the result would be that "everyone, everywhere on Earth, will die." The piece received widespread media attention and contributed to a broader public conversation about AI existential risk. Yudkowsky was subsequently named to TIME's 2023 list of the 100 Most Influential People in AI.
During 2023, Yudkowsky embarked on what he described as a media blitz, appearing on numerous podcasts, including Bankless and Hold These Truths (hosted by U.S. Representative Dan Crenshaw), and delivering a TED talk. These public appearances marked a significant departure from MIRI's historically more insular approach.
In October 2023, MIRI announced a formal leadership transition:
| Person | Previous Role | New Role |
|---|---|---|
| Malo Bourgon | Chief Operating Officer | Chief Executive Officer |
| Nate Soares | Executive Director | President |
| Eliezer Yudkowsky | Co-Founder | Chair of the Board |
| Alex Vermeer | Operations | Chief Operating Officer |
| Jimmy Rintjema | Finance/HR/Operations | Chief Financial Officer |
The transition was described as "largely an enshrinement of the status quo," formalizing operational realities that had already developed over the preceding years. Bourgon, who had been with MIRI since completing his master's degree in early 2012, became the organization's longest-serving team member after Yudkowsky.
In January 2024, MIRI published a mission and strategy update outlining three priorities, in order of emphasis: public communications about the dangers of smarter-than-human AI, policy and technical governance work, and a continuing but reduced program of technical alignment research.
This reorientation reflected a significant departure from MIRI's historical identity as a technical research organization. The shift was driven by several factors: the rapid pace of AI capability gains, increased public and policymaker receptivity to AI risk concerns, and MIRI's assessment that alignment research alone was unlikely to progress fast enough to prevent catastrophe.
MIRI established a Technical Governance Team (TGT) to conduct research at the intersection of AI policy and technical implementation. In May 2025, the TGT released a research agenda titled "AI Governance to Avoid Extinction: The Strategic Landscape and Actionable Research Questions." The agenda organizes the geopolitical landscape around four high-level scenarios for the international response to advanced AI development.
The team's favored scenario involves building what they call an "Off Switch" for AI: the technical, legal, and institutional infrastructure required to internationally restrict dangerous AI development and deployment. This Off Switch would enable a global Halt, defined as a moratorium on the development and deployment of frontier AI systems until justified confidence exists that progress can resume without catastrophic risk.
The TGT also drafted a model international agreement titled "An International Agreement to Prevent the Premature Creation of Artificial Superintelligence." Members of the team have participated in the EU AI Act Code of Practice Working Groups, provided testimony to a committee of the Canadian House of Commons, and spoken to the Scientific Advisory Board of the UN Secretary-General on AI verification.
In 2024 and 2025, MIRI significantly expanded its communications operations. Yudkowsky and Soares co-authored a book titled "If Anyone Builds It, Everyone Dies," which argues that the default outcome of building superhuman AI is loss of human control, with consequences severe enough to threaten humanity's survival. The book cites specific examples of limited AI controllability, including a late 2024 case in which an Anthropic model appeared to mimic desired behaviors in order to avoid being retrained, while preserving its original behaviors when it believed it was not being observed.
Malo Bourgon testified before the U.S. Senate's AI Insight Forum, and MIRI expanded its policy engagement with the U.S. federal government as well as international bodies.
MIRI's technical and philosophical work over more than two decades has contributed to the establishment of AI alignment as a recognized research field. Several concepts that are now standard in alignment discussions were originated or formalized by MIRI researchers.
| Concept | Year | Originator(s) | Description |
|---|---|---|---|
| Friendly AI | 2001 | Eliezer Yudkowsky | The idea that superintelligent AI should be designed with goals compatible with human welfare. |
| Coherent Extrapolated Volition (CEV) | 2004 | Eliezer Yudkowsky | A proposal that AI should act on an extrapolation of what humanity would collectively want under idealized conditions. |
| Logical Induction | 2016 | Scott Garrabrant, Tsvi Benson-Tilsen, Andrew Critch, Nate Soares, Jessica Taylor | An algorithm for assigning probabilities to logical statements in a way satisfying many desirable rationality properties. |
| Embedded Agency | 2018 | Scott Garrabrant, Abram Demski | A framework for analyzing agents that exist within the environments they are trying to model and influence, rather than operating from outside. |
| Functional Decision Theory | 2017 | Eliezer Yudkowsky, Nate Soares | An approach to decision-making that evaluates actions based on the logical consequences of implementing a particular decision procedure; a toy payoff comparison follows this table. |
| Corrigibility | 2015 | Nate Soares, Benja Fallenstein, Stuart Armstrong, Eliezer Yudkowsky | The property of an AI system that cooperates with attempts by its operators to correct, modify, or shut it down. |
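As one illustration of why functional decision theory departs from causal decision theory, the following toy calculation (standard textbook payoffs, assuming a perfectly reliable predictor; not code from MIRI) compares the two policies in Newcomb's problem, where the opaque box contains $1,000,000 only if the predictor expects the agent to take that box alone.

```python
# Toy Newcomb's-problem payoff comparison (standard textbook numbers, perfect
# predictor assumed; not MIRI code). Functional decision theory evaluates the
# consequences of committing to a *policy*, so it compares these two values
# and one-boxes; causal decision theory treats the box contents as already
# fixed and two-boxes, ending up with the smaller payoff against a reliable
# predictor.

def expected_payoff(one_box: bool, predictor_accuracy: float = 1.0) -> float:
    p_predicted_one_box = predictor_accuracy if one_box else 1.0 - predictor_accuracy
    opaque_box = 1_000_000 * p_predicted_one_box  # filled iff one-boxing was predicted
    transparent_box = 0 if one_box else 1_000     # always contains $1,000
    return opaque_box + transparent_box

print(expected_payoff(one_box=True))   # 1000000.0
print(expected_payoff(one_box=False))  # 1000.0
```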
MIRI maintains a public research guide on its website that provides an introduction to its core research areas. The guide covers the Agent Foundations agenda, embedded agency, decision theory, and logical uncertainty, and serves as a starting point for researchers interested in contributing to MIRI-style work.
MIRI is funded primarily through individual donations, with significant grants from institutional funders.
| Year | Source | Amount | Purpose |
|---|---|---|---|
| 2016 | Open Philanthropy | $500,000 | General support for Agent Foundations and ML research agendas |
| 2017 | Open Philanthropy | $3,750,000 (over 3 years) | General support; represented renewal and increase of 2016 grant |
| 2019 | Open Philanthropy | ~$2,100,000 (over 2 years) | General support |
| 2020 | Open Philanthropy | $7,700,000 (over 2 years) | General support |
| 2021 | Vitalik Buterin | Several million dollars (in Ethereum) | Cryptocurrency donation |
Open Philanthropy's engagement with MIRI has been marked by both support and critical evaluation. In 2016, Open Philanthropy commissioned an extensive review of MIRI's research, including reviews from eight academics and discussions with several technical advisors. The reviewers produced generally negative assessments, concluding that MIRI had made "relatively limited progress" on the Agent Foundations agenda and that the research direction had "little potential to decrease potential risks from advanced AI." Some controversy arose when Open Philanthropy stated that its assessment relied partly on an expert reviewer whose identity and reasoning it did not have permission to share, which critics viewed as inconsistent with Open Philanthropy's commitment to transparency.
Despite these critical evaluations, Open Philanthropy continued to fund MIRI, increasing its grants substantially through 2020.
MIRI's annual expenses from 2019 through 2024 ranged from $5.4 million to $7.7 million, with the peak in 2020 (when the team was at its largest) and the low point in 2022 (after scaling back). In 2025, MIRI spent approximately $7.1 million. The organization held approximately $16 million in reserves as of late 2024, representing over two years of operating costs.
In December 2025, MIRI conducted its first fundraiser in six years, seeking $6 million. The first $1.6 million raised was matched 1:1 through a grant from the Survival and Flourishing Fund (SFF). The fundraiser raised just over $1.6 million in donations, bringing the total with matching funds to approximately $3.2 million.
The projected budget for 2026 breaks down as follows:
| Category | Budget |
|---|---|
| Operations | $2.6M |
| Outreach and communications | $3.2M |
| Research | $2.3M |
| Total (median estimate) | $8.0M |
As of 2025, MIRI's board of directors consists of:
| Name | Role |
|---|---|
| Eliezer Yudkowsky | Chair |
| Nate Soares | Director |
| Blake Borgeson | Director |
| Anna Salamon | Director |
| Edwin Evans | Director |
The organization's leadership over time:
| Name | Tenure | Notes |
|---|---|---|
| Eliezer Yudkowsky | 2000-2011 | Founding researcher; led the organization during its early years |
| Luke Muehlhauser | 2011-2015 | Credited with professionalizing the organization; later joined Open Philanthropy |
| Nate Soares | 2015-2023 | Developed the Agent Foundations agenda; transitioned to President |
| Malo Bourgon | 2023-present | Current CEO; overseeing the strategic pivot to policy and communications |
As of 2025, MIRI employs approximately 25 to 30 people, with a staff composition that reflects the organization's strategic pivot toward communications, policy, and technical governance work.
MIRI has deep historical ties to the broader rationalist community, an intellectual movement focused on improving reasoning, reducing cognitive biases, and taking seriously the long-term consequences of technological development.
LessWrong, one of the most prominent online forums associated with the rationalist movement, was originally created under the MIRI organizational umbrella in 2009. Yudkowsky's writings on rationality, decision theory, and AI risk formed much of the site's foundational content, including the widely read "Sequences" series. LessWrong operated as a MIRI project until approximately 2017, when it was relaunched as "LessWrong 2.0" by a team led by Oliver Habryka, initially under the Center for Applied Rationality (CFAR) and later as the independent organization Lightcone Infrastructure.
The Center for Applied Rationality (CFAR) was founded in 2012 by Julia Galef, Anna Salamon, Michael Smith, and Andrew Critch. CFAR emerged from the LessWrong community and maintained close organizational and personal ties with MIRI. Anna Salamon has served on MIRI's board of directors, a position she continued to hold as of 2025. CFAR's mission was to teach rationality techniques drawn from mathematics, decision theory, and cognitive science. The two organizations shared office space in the San Francisco Bay Area, and many individuals in the rationalist community worked or volunteered with both.
The close relationship between MIRI, CFAR, and the rationalist community has been a subject of both praise and criticism. Supporters credit the community with incubating important ideas about AI safety and attracting talented researchers. Critics have pointed to concerns about insularity, unconventional social dynamics, and allegations of cult-like behavior. In 2025, Anna Salamon told NBC News: "We didn't know at the time, but in hindsight we were creating conditions for a cult."
MIRI's research output and methodology have been subjects of ongoing debate. Open Philanthropy's commissioned reviews in 2016-2017 concluded that MIRI's Agent Foundations research had produced limited results and that the research direction was unlikely to substantially reduce AI risk. Some outside researchers have criticized MIRI's approach as overly abstract, disconnected from practical machine learning, and unlikely to produce actionable results.
Critics like Nora Belrose have publicly questioned MIRI's credibility, arguing that the organization's reasoning about AI risk rests on unsubstantiated assumptions.
Defenders of MIRI counter that the organization was raising alarms about AI risk years before it became a mainstream concern, and that many of the conceptual frameworks now used in alignment research, including corrigibility, embedded agency, and logical uncertainty, were developed or formalized at MIRI.
Yudkowsky's April 2022 "Death With Dignity" post, in which he expressed the view that solving alignment in time was essentially hopeless and that MIRI's team estimated humanity's survival probability at below 5%, drew both support and sharp criticism. Some viewed the post as a necessary act of honesty about the severity of the situation. Others argued that such extreme pessimism risked becoming self-fulfilling, potentially discouraging talented researchers from entering the field or funders from supporting alignment work.
MIRI's adoption of a nondisclosed-by-default research policy starting around 2018 reduced the organization's published output significantly. While the policy was intended to prevent the accidental release of information that could accelerate dangerous AI capabilities, it made it difficult for outside observers to evaluate whether MIRI's research was producing useful results. The policy has been cited as a factor in the perception that MIRI became less relevant to the broader alignment research community during this period.
The overlapping social networks of MIRI, CFAR, and the Bay Area rationalist community have attracted scrutiny. Concerns have been raised about power dynamics, mental health impacts on community members, and boundary issues. Bloomberg News and NBC News have published reporting on allegations of abuse and problematic dynamics within these interconnected communities.
Despite its relatively small size and budget compared to major AI labs, MIRI has had an outsized influence on the field of AI safety: many of the concepts now standard in alignment research originated with its researchers, its warnings about existential risk predated mainstream concern by more than a decade, and its staff now engage directly with policymakers in the United States, the European Union, Canada, and at the United Nations.