AI Safety Institutes are government-established organizations dedicated to evaluating, testing, and researching the safety of advanced artificial intelligence systems. These institutes emerged as a coordinated international response to the rapid development of frontier models, with the first wave of institutes launched in late 2023 and early 2024 following the Bletchley Park AI Safety Summit. As of early 2026, at least ten countries and the European Union have established or designated national AI safety institutes, and these bodies participate in a formal international network to share research, develop common evaluation methodologies, and coordinate safety testing of the most capable AI systems.
The push to create dedicated government institutions for AI safety gained momentum in 2023, driven by growing concerns about the capabilities of large language models and other advanced AI systems. Breakthroughs in generative AI, particularly the release of models such as GPT-4 and Claude, prompted governments worldwide to consider how they might evaluate the risks posed by increasingly powerful AI before these systems were deployed to the public.
Before the establishment of formal AI safety institutes, some government bodies had already begun exploring AI risk. In the United Kingdom, the Frontier AI Taskforce, announced in April 2023 (initially as the Foundation Model Taskforce), was created to build the first team within a G7 government capable of evaluating the risks of frontier AI models. The taskforce was chaired by Ian Hogarth, a technology investor and entrepreneur who had written prominently about the risks of advanced AI, and served as the direct predecessor to the UK AI Safety Institute.
In the United States, the National Institute of Standards and Technology (NIST) had been working on AI risk management frameworks since 2021, culminating in the release of the NIST AI Risk Management Framework (AI RMF 1.0) in January 2023. President Biden's Executive Order 14110 on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence, signed on October 30, 2023, formally directed the creation of a US AI Safety Institute within NIST.
The first global AI Safety Summit took place on November 1-2, 2023, at Bletchley Park in Buckinghamshire, England. The summit was convened by UK Prime Minister Rishi Sunak and brought together representatives from 28 countries, the European Union, and leading AI companies.
The landmark outcome of the summit was the Bletchley Declaration, endorsed by all 28 participating countries and the European Union. Signatories included the United States and China, as well as Brazil, France, India, Japan, Kenya, Nigeria, Saudi Arabia, and the United Arab Emirates. The declaration focused on identifying shared AI safety risks and building risk-based policies to ensure AI benefits could be "harnessed responsibly for good and for all."
The summit had five guiding aims: developing a shared understanding of the risks posed by frontier AI and the need for action; agreeing a forward process for international collaboration on frontier AI safety; identifying appropriate measures that individual organizations should take to increase frontier AI safety; identifying areas for potential collaboration on AI safety research, including evaluating model capabilities and developing new standards to support governance; and showcasing how the safe development of AI would enable it to be used for good globally.
At the summit, Prime Minister Sunak announced that the Frontier AI Taskforce would be transformed into the UK AI Safety Institute (AISI). Ian Hogarth continued as chair of the new institute. The announcement positioned the UK as the first country to establish a dedicated government body for evaluating frontier AI safety, giving it an early leadership role in the emerging international AI safety architecture.
Following the Bletchley Park summit, a wave of countries established their own AI safety institutes. These organizations vary in structure, mandate, and scope, but share common goals around evaluating AI risks, developing safety standards, and participating in international cooperation.
The UK AI Safety Institute was the first of its kind, formally launched in November 2023 as an evolution of the Frontier AI Taskforce. It operated within the Department for Science, Innovation and Technology (DSIT) under the chairmanship of Ian Hogarth.
The institute was tasked with testing new types of frontier AI before and after release, evaluating risks ranging from social harms such as bias and misinformation to extreme risks such as loss of human control over AI systems. To support this work, the AISI built in-house technical capabilities for model evaluation, most notably the Inspect framework described below.
One of the most significant technical contributions of the UK AISI has been the development of Inspect, an open-source evaluation framework released in May 2024. Inspect supports a broad range of evaluations measuring coding ability, agentic task completion, reasoning, knowledge, behavioral alignment, and multimodal understanding. The framework includes over 100 pre-built evaluation benchmarks and has been adopted by organizations including METR, Apollo Research, other national AI safety institutes, and major AI safety research labs. Over 50 external contributors have added to the project.
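To illustrate how Inspect structures an evaluation, the sketch below defines a toy task in Python. It assumes the `inspect_ai` package's publicly documented building blocks (`Task`, `Sample`, the `generate()` solver, and the `match()` scorer); exact parameter names have varied across releases, and the dataset and task here are invented for illustration rather than drawn from any AISI benchmark.

```python
# A minimal sketch of an Inspect task, assuming inspect_ai's
# documented interface; the questions are invented toy data.
from inspect_ai import Task, task
from inspect_ai.dataset import Sample
from inspect_ai.scorer import match
from inspect_ai.solver import generate


@task
def capital_cities():
    """Toy knowledge evaluation: answer single-fact questions."""
    return Task(
        dataset=[
            Sample(input="What is the capital of France?", target="Paris"),
            Sample(input="What is the capital of Japan?", target="Tokyo"),
        ],
        solver=generate(),  # ask the model for a completion
        scorer=match(),     # score by matching the target string
    )
```

Saved as, say, `capitals.py`, such a task could then be run against a model with the package's command-line runner (along the lines of `inspect eval capitals.py --model openai/gpt-4o`), producing logs that can be compared across models and over time.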
The UK AISI conducted pre-deployment evaluations of several major AI models. It evaluated Anthropic's upgraded Claude 3.5 Sonnet across four domains: biological capabilities, cyber capabilities, software and AI development, and safeguard efficacy. For cybersecurity evaluation, the AISI tested the model against a suite of 47 challenges (15 publicly available and 32 privately developed). The upgraded Claude 3.5 Sonnet solved 36% of tasks at the "cybersecurity apprentice" level, compared to 29% for the best reference model evaluated.
The UK AISI also participated alongside the US AISI in evaluating OpenAI's o1 model before its public release. The two institutes ran separate but complementary tests assessing capabilities across cyber, biological, and software development domains, comparing o1's performance to reference models including o1-preview, GPT-4o, and both versions of Claude 3.5 Sonnet.
On February 14, 2025, UK Technology Secretary Peter Kyle announced at the Munich Security Conference that the AI Safety Institute would be renamed the AI Security Institute. This change reflected a strategic shift under the Labour government toward focusing on serious AI risks with direct security implications, such as the use of AI to develop chemical and biological weapons, conduct cyberattacks, or enable crimes like fraud and child sexual abuse.
The newly branded AI Security Institute launched a criminal misuse team working jointly with the Home Office on crime and security research. Peter Kyle stated that the institute's work "does not centre on freedom of speech or deciding what counts as bias and discrimination," signaling that broader ethical questions would be handled elsewhere within the government. Critics, including the AI Now Institute, expressed concern that the narrower focus could leave important safety issues unaddressed.
Alongside the rebrand, the UK government signed a Memorandum of Understanding with Anthropic to explore the use of Claude in UK public services, with the AI Security Institute continuing to evaluate Anthropic's models for security risks.
The US AI Safety Institute was established within the National Institute of Standards and Technology (NIST), part of the Department of Commerce, following President Biden's Executive Order 14110 in October 2023.
In February 2024, Commerce Secretary Gina Raimondo named Elizabeth Kelly as the inaugural director of the US AI Safety Institute. Kelly had previously served as special assistant to the president for economic policy at the White House National Economic Council and was one of the lead drafters of Biden's AI executive order. In September 2024, TIME magazine included Kelly on its list of the 100 most influential people in AI.
Also in February 2024, the US government announced the creation of the AI Safety Institute Consortium (AISIC), bringing together more than 280 organizations to develop science-based guidelines and standards for AI measurement and safety. Consortium members included major technology companies such as Adobe, OpenAI, Meta, Amazon, Apple, Google, Anthropic, Salesforce, IBM, Nvidia, Intel, Palantir, and Databricks, alongside financial institutions (Bank of America, JPMorgan Chase, Citigroup, Wells Fargo, Mastercard), academic institutions, civil society organizations, and state and local governments.
The US AISI secured agreements with both OpenAI and Anthropic to test their AI models before public release. These agreements allowed AISI researchers to access frontier models for safety evaluation, with findings shared with the developers before deployment. The US and UK AISIs also signed a bilateral partnership agreement in April 2024 to work together on developing and conducting safety tests.
Elizabeth Kelly departed her role as director in early February 2025, shortly after President Trump took office. Trump had revoked Biden's 2023 executive order on AI safety on his first day in office, raising immediate questions about the institute's future.
In June 2025, the institute was renamed the Center for AI Standards and Innovation (CAISI), with Commerce Secretary Howard Lutnick stating: "For far too long, censorship and regulations have been used under the guise of national security. Innovators will no longer be limited by these standards." CAISI was directed to focus on evaluating frontier AI systems for national security risks, particularly concerning cyberattacks and the development of chemical, biological, radiological, and nuclear weapons.
The Trump administration's AI Action Plan also directed NIST to revise the AI Risk Management Framework to "eliminate references to misinformation, Diversity, Equity, and Inclusion, and climate change." Reports indicated that layoffs at NIST affected many of the staffers who had been working at the original AI Safety Institute.
Japan established its AI Safety Institute (J-AISI) on February 14, 2024, making it one of the earliest countries to follow the UK's lead. The institute was created within the Information-technology Promotion Agency (IPA), with the collaboration of ten ministries and five government-affiliated organizations. Akiko Murakami was named as its director.
J-AISI focuses on investigating evaluation methods for AI safety, developing standards for safety assessment, and implementing safety evaluation processes. The institute has worked to deepen cooperation with its counterparts in the United States and United Kingdom and has participated actively in the International Network of AI Safety Institutes.
Japan's engagement with AI safety governance builds on the Hiroshima AI Process, initiated during Japan's G7 presidency in 2023, which established voluntary AI governance principles and a code of conduct for organizations developing advanced AI systems.
On May 22, 2024, Singapore designated its Digital Trust Centre (DTC) as the country's AI Safety Institute. The DTC had been established in June 2022 at Nanyang Technological University (NTU) with initial funding of S$50 million from the Infocomm Media Development Authority (IMDA) to lead research and development in trust technologies.
The Singapore AISI leverages the DTC's existing work in AI evaluation and testing. It focuses on addressing gaps in global AI safety science, with a particular emphasis on generative AI evaluations. Singapore was one of the initial members of the International Network of AI Safety Institutes and has been in active discussions with US and UK counterparts on collaborative evaluation programs.
South Korea's AI Safety Institute was formally inaugurated in November 2024, making it the sixth country to establish such a body. The institute was created within the Electronics and Telecommunications Research Institute (ETRI) and is located in Pangyo, south of Seoul.
The Korean AISI focuses on researching AI risks including technological limitations, human misuse, and potential loss of control over AI systems. At its inauguration, a Korea AI Safety Consortium agreement was signed involving 24 domestic industry, academic, and research organizations. The institute actively participates in the International Network of AI Safety Institutes.
The Canadian Artificial Intelligence Safety Institute was officially launched on November 12, 2024, by the Honourable François-Philippe Champagne, then Minister of Innovation, Science and Industry. CAISI received initial funding of CA$50 million over five years, part of a broader CA$2.4 billion investment announced in Canada's Budget 2024.
CAISI is housed at Innovation, Science and Economic Development Canada and works in partnership with CIFAR (the Canadian Institute for Advanced Research) on its research program. The institute's mandate is to advance the science of AI safety in collaboration with international partners, ensuring that governments understand and can act on the risks posed by advanced AI systems.
France announced the establishment of its National Institute for AI Evaluation and Security (INESIA) on January 31, 2025. The announcement was made by Clara Chappaz, the French Minister for AI and the Digital Economy. Rather than creating an entirely new organization, INESIA brings together existing national players in AI evaluation and security, including the French National Agency for Information Systems Security (ANSSI), the French Institute for Research in Computer Science and Automation (INRIA), the National Laboratory of Metrology and Testing (LNE), and the Expert Centre for Digital Regulation (PEREN).
INESIA is led by the General Secretariat for Defence and National Security (SGDSN) under the Prime Minister's office, alongside the Directorate General for Enterprise (DGE).
The European AI Office, established within the European Commission, serves as the EU's central body for AI governance and functions as the EU's representative in the International Network of AI Safety Institutes. Unlike other national AI safety institutes, the EU AI Office has enforcement powers under the EU AI Act, including the ability to compel non-compliant providers of general-purpose AI models to take corrective measures.
The EU AI Office monitors, supervises, and enforces AI Act requirements on general-purpose AI models across all 27 EU member states. It analyzes emerging systemic risks from general-purpose AI development, conducts model evaluations, and investigates potential non-compliance incidents.
On November 25, 2025, the Australian Government announced the establishment of its AI Safety Institute, with operations expected to commence in early 2026. The institute received government funding of A$29.9 million and will provide expert capability to monitor, test, and share information on emerging AI technologies and their associated risks. Australia is set to join the International Network of AI Safety Institutes.
Germany's AI safety work is conducted through the Institute for AI Safety and Security, which operates as part of the German Aerospace Center (DLR). The institute develops AI-related methods, processes, algorithms, and technologies with a focus on safe and standard-compliant AI, cybersecurity, and autonomous systems. Germany was among the countries that agreed to form the international AI safety institute network at the Seoul Summit in May 2024.
| Country/Region | Institute Name | Established | Parent Organization | Leadership (Initial) |
|---|---|---|---|---|
| United Kingdom | AI Safety Institute (now AI Security Institute) | November 2023 | Department for Science, Innovation and Technology (DSIT) | Ian Hogarth (Chair) |
| United States | US AI Safety Institute (now Center for AI Standards and Innovation) | October 2023 (directed by executive order); February 2024 (leadership appointed) | National Institute of Standards and Technology (NIST) | Elizabeth Kelly (Director, departed February 2025) |
| Japan | Japan AI Safety Institute (J-AISI) | February 2024 | Information-technology Promotion Agency (IPA) | Akiko Murakami (Director) |
| Singapore | Singapore AI Safety Institute | May 2024 | Digital Trust Centre / Nanyang Technological University / IMDA | N/A |
| South Korea | Korea AI Safety Institute | November 2024 | Electronics and Telecommunications Research Institute (ETRI) | N/A |
| Canada | Canadian AI Safety Institute (CAISI) | November 2024 | Innovation, Science and Economic Development Canada | N/A |
| France | National Institute for AI Evaluation and Security (INESIA) | January 2025 | General Secretariat for Defence and National Security (SGDSN) | N/A |
| European Union | EU AI Office | 2024 | European Commission | N/A |
| Germany | Institute for AI Safety and Security | Pre-2024 | German Aerospace Center (DLR) | N/A |
| Australia | Australian AI Safety Institute | November 2025 (announced) | Department of Industry, Science and Resources | N/A |
| Kenya | AI Safety representative body | 2024 (joined international network) | Government of Kenya | Ambassador Philip Thigo (Special Envoy for Technology) |
The formal international coordination of AI safety institutes was established at the AI Seoul Summit in May 2024, when ten countries and the European Union agreed to create the International Network of AI Safety Institutes. The network aims to accelerate the advancement of AI safety science through a common understanding of AI safety, aligned research efforts, and shared standards and testing methodologies.
The network was first announced by US Secretary of Commerce Gina Raimondo at the Seoul Summit. Initial members included the United States, United Kingdom, European Union, Japan, Singapore, South Korea, Canada, France, Kenya, and Australia.
Kenya's inclusion was notable as the only African nation in the network, represented by Special Envoy for Technology Ambassador Philip Thigo. The network's Capacity Building Programme has focused on establishing regional AI safety hubs in Africa, Southeast Asia, and Latin America.
The inaugural meeting of the network took place on November 20-21, 2024, in San Francisco. Representatives from all member countries convened to establish shared initiatives including a Joint Evaluation Protocol, Global AI Incident Database, and Open Safety Benchmarks Initiative.
In December 2025, the network was renamed the International Network for Advanced AI Measurement, Evaluation and Science (NAAIMES), dropping "Safety" from its title. This rebrand reflected a shift toward a more technical emphasis on how advanced AI models are measured, tested, and evaluated, moving away from what some members viewed as the broader and sometimes ambiguous concept of "AI safety."
The AI Safety Institutes emerged from, and have been shaped by, a series of international summits focused on AI governance and safety.
The first AI Safety Summit, held at Bletchley Park on November 1-2, 2023, was convened by UK Prime Minister Rishi Sunak. Twenty-eight countries and the EU signed the Bletchley Declaration, agreeing to cooperate on identifying and mitigating AI risks. The summit led directly to the creation of the UK AI Safety Institute and catalyzed similar efforts in other countries. Attendees also commissioned an International Scientific Report on Advanced AI Safety, to be led by Turing Award winner Yoshua Bengio.
The AI Seoul Summit took place on May 21-22, 2024, co-hosted by the United Kingdom and the Republic of Korea. The summit's principal outcomes included the Seoul Statement of Intent toward International Cooperation on AI Safety Science and the formal launch of the International Network of AI Safety Institutes. The summit expanded on the Bletchley Declaration, with participants endorsing voluntary commitments by frontier AI developers and agreeing to develop shared safety evaluation methodologies.
The third summit in the series, rebranded as the AI Action Summit, was held at the Grand Palais in Paris on February 10-11, 2025. Co-chaired by French President Emmanuel Macron and Indian Prime Minister Narendra Modi, the event drew over 1,000 participants from more than 100 countries.
Key outcomes included a declaration on inclusive and sustainable AI endorsed by more than 60 countries, which the United States and the United Kingdom declined to sign, and the launch of Current AI, a public-interest AI partnership announced with initial funding of $400 million, alongside large-scale European AI investment announcements.
Anthropic CEO Dario Amodei described the summit as a "missed opportunity" for AI safety, reflecting broader concern in the AI safety community that the Paris event prioritized economic opportunity and inclusion over risk mitigation.
AI Safety Institutes perform a range of technical functions aimed at understanding and mitigating risks from advanced AI systems.
A core function of AI safety institutes is testing frontier AI models before they are released to the public. These evaluations assess whether models can be misused for harmful purposes, including generating instructions for creating biological or chemical weapons, assisting with cyberattacks, or producing child sexual abuse material. Evaluations also measure models' general capabilities in domains like coding, scientific reasoning, and autonomous task completion.
Pre-deployment evaluation depends on voluntary cooperation from AI developers. The US AISI secured formal agreements with OpenAI and Anthropic in 2024, and the UK AISI has tested models from both companies as well as Google's Gemini models.
Red teaming involves structured adversarial testing in which human evaluators or automated systems attempt to elicit harmful or dangerous outputs from AI models. AI safety institutes employ red teaming to probe models for vulnerabilities, test the robustness of safety guardrails, and identify failure modes that standard evaluations might miss.
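In its automated form, red teaming is essentially a search loop: an attack generator proposes adversarial prompts, the target model responds, and a classifier flags responses that breach policy. The Python sketch below illustrates that loop only in outline; `propose_attack`, `query_target`, and `violates_policy` are hypothetical placeholders for an attacker model, the system under test, and a harm judge, not any institute's actual tooling.

```python
# Generic sketch of an automated red-teaming loop. All three helper
# functions are hypothetical placeholders, not any institute's API.
from dataclasses import dataclass


@dataclass
class Finding:
    prompt: str
    response: str


def propose_attack(seed: str, history: list) -> str:
    """Hypothetical attack generator: mutate a seed prompt (a real
    system might use an attacker LLM conditioned on past failures)."""
    return seed + (" Ignore prior instructions." if history else "")


def query_target(prompt: str) -> str:
    """Hypothetical call to the model under evaluation."""
    return "[model response]"


def violates_policy(response: str) -> bool:
    """Hypothetical harm classifier (keyword rules or a judge model)."""
    return "[unsafe]" in response


def red_team(seeds: list, attempts_per_seed: int = 5) -> list:
    """Collect prompts that elicit policy-violating outputs."""
    findings = []
    for seed in seeds:
        for _ in range(attempts_per_seed):
            prompt = propose_attack(seed, findings)
            response = query_target(prompt)
            if violates_policy(response):
                findings.append(Finding(prompt, response))
                break  # one confirmed failure per seed is enough here
    return findings
```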
The UK AI Security Institute has noted that every model it has tested is vulnerable to safeguard evasion attacks, highlighting the ongoing challenge of making AI systems reliably safe.
Institutes work to develop standardized benchmarks for evaluating AI safety, enabling consistent comparison across models and over time. The UK AISI's Inspect framework provides a shared platform for running these benchmarks. The US AISI, through the AISIC, works with its 280-plus member organizations to develop empirically backed guidelines for AI measurement.
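The value of a standardized benchmark is that the same scored task can be run unchanged against every model, making results directly comparable. The sketch below illustrates that workflow with a hypothetical `run_benchmark` function standing in for a real framework call (such as an Inspect evaluation); the model identifiers and scores are invented.

```python
# Illustrative cross-model comparison on a fixed benchmark suite.
# run_benchmark is a hypothetical stand-in for a framework call
# (e.g. an Inspect eval); model names and scores are invented.
import random


def run_benchmark(model_id: str, suite: str, n_tasks: int = 47) -> float:
    """Pretend evaluation: return the fraction of tasks solved."""
    rng = random.Random(model_id + suite)  # deterministic toy scores
    solved = sum(rng.random() < 0.35 for _ in range(n_tasks))
    return solved / n_tasks


for model in ["vendor-a/model-1", "vendor-b/model-2"]:
    score = run_benchmark(model, "cyber-apprentice")
    print(f"{model}: {score:.0%} of tasks solved")
```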
AI safety institutes publish their findings to inform policymakers, industry, and the research community. The UK AISI has released multiple progress reports and published the results of individual model evaluations. The International AI Safety Report, published in January 2025 and coordinated by Yoshua Bengio, represents the most comprehensive collaborative research output from the global AI safety community, synthesizing the work of 96 experts across 30 countries.
The Frontier Model Forum (FMF) is an industry-supported nonprofit founded in 2023 by Anthropic, Google, Microsoft, and OpenAI. While structurally separate from government AI safety institutes, the FMF works closely with them.
The FMF joined the US AISI Consortium as a founding member in February 2024 and has supported the international network of AI safety institutes. The Forum established the AI Safety Fund (AISF), a collaborative initiative with over $10 million in funding to accelerate AI safety research. The AISF is supported by the four founding companies and philanthropic partners.
The FMF publishes research on AI safety best practices, including issue briefs on red teaming methodologies and components of frontier AI safety frameworks. It serves as a coordination mechanism between industry and government, facilitating information sharing about safety practices across sectors.
AI safety institutes face several significant challenges in fulfilling their missions.
With the exception of the EU AI Office, which has enforcement powers under the EU AI Act, AI safety institutes rely on voluntary cooperation from AI developers. Pre-deployment testing agreements are not legally binding in most jurisdictions, and companies can choose whether to submit their models for evaluation. This raises concerns about whether institutes can maintain access to the most capable models, particularly as competitive pressures intensify.
The speed of AI development far outpaces the capacity of safety institutes to evaluate new models. As of early 2025, AI models were demonstrating PhD-level performance on chemistry and biology question sets, and the safeguards designed to prevent harmful outputs were not always sufficient. Institutes must continuously update their evaluation methodologies to account for new capabilities and novel risk vectors.
The rebranding of both the US and UK institutes under new political administrations demonstrates the vulnerability of these organizations to shifting political priorities. The US institute's transformation from the AI Safety Institute to the Center for AI Standards and Innovation, accompanied by significant staff reductions, raised concerns about the continuity of safety-focused work. Similarly, the UK's renaming to the AI Security Institute, with its narrower focus on national security, left questions about who would address broader safety and ethical concerns.
While the international network provides a framework for cooperation, differences in national priorities, legal frameworks, and political contexts make coordination challenging. The December 2025 renaming of the network itself, dropping "Safety" from its title, reflected tensions between members about the appropriate scope and framing of their shared work.
AI safety institutes compete for talent with well-funded private AI laboratories. Government salaries and resources are typically far lower than those offered by frontier AI companies, making it difficult to recruit and retain the technical expertise needed for rigorous model evaluation.
| Date | Event |
|---|---|
| January 2023 | NIST releases AI Risk Management Framework (AI RMF 1.0) |
| April 2023 | UK announces the Frontier AI Taskforce (initially the Foundation Model Taskforce), later chaired by Ian Hogarth |
| July 2023 | Frontier Model Forum founded by Anthropic, Google, Microsoft, and OpenAI |
| October 30, 2023 | President Biden signs Executive Order 14110 on AI safety |
| November 1-2, 2023 | Bletchley Park AI Safety Summit; Bletchley Declaration signed by 28 countries; UK AI Safety Institute announced |
| February 14, 2024 | Japan AI Safety Institute (J-AISI) established |
| February 2024 | Elizabeth Kelly named first director of US AI Safety Institute; AISIC consortium launched with 280+ members |
| April 2024 | US and UK AI Safety Institutes sign bilateral partnership agreement |
| May 21-22, 2024 | AI Seoul Summit; International Network of AI Safety Institutes announced |
| May 22, 2024 | Singapore designates Digital Trust Centre as its AI Safety Institute |
| November 2024 | South Korea and Canada launch their AI Safety Institutes; first convening of the International Network in San Francisco |
| January 20, 2025 | President Trump revokes Biden's AI executive order |
| January 29, 2025 | International AI Safety Report published, led by Yoshua Bengio |
| January 31, 2025 | France announces INESIA |
| February 5, 2025 | Elizabeth Kelly departs as US AISI director |
| February 10-11, 2025 | Paris AI Action Summit |
| February 14, 2025 | UK AI Safety Institute renamed AI Security Institute by Peter Kyle at Munich Security Conference |
| June 2025 | US AISI renamed to Center for AI Standards and Innovation (CAISI) |
| November 25, 2025 | Australia announces establishment of its AI Safety Institute |
| December 2025 | International Network renamed to NAAIMES (International Network for Advanced AI Measurement, Evaluation and Science) |