The UK AI Security Institute (AISI) is a research organization within the United Kingdom's Department for Science, Innovation and Technology (DSIT) that conducts pre-deployment evaluations of frontier AI models and produces safety and security research to inform government policy. It is widely regarded as the world's first government body dedicated to the systematic technical evaluation of advanced AI systems.
Originally established as the UK AI Safety Institute on 2 November 2023, following Prime Minister Rishi Sunak's announcement at the AI Safety Summit held at Bletchley Park, the organization was renamed the UK AI Security Institute on 14 February 2025, when Technology Secretary Peter Kyle announced a strategic refocusing on AI risks with national security implications. The institute's website is aisi.gov.uk, and it retains the AISI abbreviation under the new name.
As of 2025, the institute employs more than 100 technical staff drawn from industry, academia, and civil society organizations. It operates with annual funding of approximately £66 million, has access to more than £1.5 billion in computing resources through the national AI Research Resource, and has conducted pre-deployment evaluations on more than 30 state-of-the-art AI models since its founding. The institute chairs an international network of AI safety and security institutes spanning eleven countries and the European Union.
The institution that became the UK AI Security Institute traces its direct origins to April 2023, when the UK government announced the formation of the Frontier AI Taskforce. At the time, the UK government committed an initial £100 million to fund the organization. The stated purpose of the Taskforce was to understand the capabilities and risks of frontier AI systems before more comprehensive regulatory structures could be established.
Ian Hogarth, a technology investor and entrepreneur best known for his 2023 essay "We must slow down the race to God-like AI," was appointed to chair the Taskforce. Hogarth had argued publicly that the most advanced AI systems posed unique societal risks and that independent, government-backed evaluation was necessary to prevent AI developers from, as he put it, marking their own homework.
The Frontier AI Taskforce began recruiting technical staff during mid-2023 and started building evaluation frameworks and relationships with leading AI developers. During this period, Prime Minister Rishi Sunak secured informal commitments from Sam Altman of OpenAI, Demis Hassabis of Google DeepMind, and Dario Amodei of Anthropic to provide early access to their forthcoming models for safety testing. These voluntary commitments preceded any formal agreements and reflected the close dialogue the UK government had cultivated with AI company leadership during 2023.
The AI Safety Summit, held at Bletchley Park on 1 and 2 November 2023, was the first major intergovernmental conference focused specifically on the risks of frontier AI. The choice of Bletchley Park was deliberate: the site is famous for codebreaking work during the Second World War and holds symbolic resonance in British technology history.
The summit brought together delegations from 28 countries, including the United States, China, the European Union, Australia, Canada, France, Germany, India, Italy, Japan, and South Korea. On the first day of the summit, representatives from these countries and the EU signed the Bletchley Declaration, a joint statement affirming that AI should be developed and deployed in a manner that is safe, human-centric, trustworthy, and responsible. The Declaration acknowledged that frontier AI carries risks of intentional misuse, unintended loss of control, and potential for catastrophic or irreversible harm, and committed signatory nations to work together to understand those risks.
Critically for the AISI, the Bletchley Declaration also called for government-led evaluation of frontier models and endorsed the principle that AI companies should submit their most capable systems for independent testing before public release. Eight major AI developers -- Anthropic, Google DeepMind, OpenAI, Meta, Microsoft, Amazon Web Services, Mistral AI, and Inflection AI -- signed a separate commitment to support pre-deployment evaluations of their forthcoming models.
On 2 November 2023, in his closing address to the summit, Prime Minister Rishi Sunak announced the formal launch of the UK AI Safety Institute as the successor to the Frontier AI Taskforce. Ian Hogarth remained as Chair. The AISI was explicitly designed, as the official launch document put it, to be "like a startup in government": combining governmental authority and access to official information with the technical depth and operational speed more typical of a well-funded technology organization. The institute was positioned not as a regulator -- it had no enforcement powers -- but as an evaluation and research body whose findings would inform both UK policy and the broader international debate on AI governance.
The institute's mission, as stated on its official website, is to build "the world's leading understanding of advanced AI risks and solutions, to inform governments so they can keep the public safe."
Since its founding, the AISI has organized its activities around three interconnected functions.
Testing AI systems. The institute conducts pre-release evaluations of leading AI models in collaboration with the companies that develop them. These evaluations examine how capable a model is in domains that pose safety or security concerns, whether the model's safeguards against misuse hold up against adversarial attempts to circumvent them, and whether autonomous or agentic uses of the model could produce unintended harmful outcomes.
Research advancement. The institute conducts in-house research, and funds external work, on AI safety and security topics that it considers underdeveloped in the academic literature. This includes research on evaluation methodology, model alignment, control mechanisms for autonomous AI systems, and understanding how AI capabilities in sensitive domains are evolving over time. As of 2025, the institute has distributed more than £15 million in research grants to external academic groups.
Policy guidance. The institute provides technical briefings, reports, and advice to UK government departments and to allied governments participating in the international network. Its findings on model capabilities, safeguard robustness, and emerging risks feed directly into policy discussions about how to regulate and deploy frontier AI systems.
Ian Hogarth served as Chair of the organization from its inception as the Frontier AI Taskforce through at least 2025. Hogarth brought to the role both his public profile as an advocate for cautious AI development and his experience as a founder and investor in the technology sector. His appointment was seen as a signal that the UK government intended the institute to have genuine technical credibility and independence from narrow industry interests.
Geoffrey Irving serves as Chief Scientist. Irving spent eight years in machine learning research, during which he co-led the N2Formal neural network theorem proving team at Google Brain, led the Reflection Team at OpenAI working on AGI safety and language model alignment, and subsequently led the Scalable Alignment Team at Google DeepMind. He is best known academically for developing the "debate" approach to aligning AI systems with human values, a scalable alignment technique in which two AI systems argue opposing positions and human judges evaluate the quality of their reasoning.
Chris Summerfield serves as Research Director. Summerfield is a Professor of Cognitive Neuroscience at the University of Oxford and brings expertise in how biological intelligence relates to artificial systems. His academic background informs the institute's approaches to evaluating AI reasoning, persuasion, and influence capabilities.
Jade Leung serves as Chief Technology Officer and also holds the role of Prime Minister's AI Adviser. Leung previously led Governance at OpenAI and brings deep expertise in the policy and operational dimensions of responsible AI deployment.
Adam Beaumont was appointed Interim Director in October 2025, succeeding Oliver Ilott in the operational leadership role. Beaumont previously served as Chief AI Officer at GCHQ, the UK government's intelligence and cybersecurity agency. His appointment reflected the institute's increased focus on national security applications following the February 2025 renaming.
The institute's advisory board includes Yoshua Bengio, one of the founding researchers of modern deep learning and a prominent advocate for cautious AI governance.
The pre-deployment evaluation program is the institute's most operationally distinctive activity. Under the program, the institute receives access to an AI model before it is released publicly, conducts structured tests to assess its capabilities and vulnerabilities, and produces findings that inform both the developing company and the UK government.
The original voluntary commitments made at and around the Bletchley Summit were subsequently formalized through memoranda of understanding with major AI developers. The institute has signed such agreements with Anthropic, OpenAI, Google DeepMind, and Cohere, among others. These agreements give the institute early access to models under defined conditions and establish expectations for information sharing.
The institute anonymizes models in its published reports: findings describe model performance using colour labels (Red, Purple, Green, Blue, Yellow) rather than naming specific products, a practice designed to focus public attention on systemic patterns rather than on competitive rankings between companies.
Evaluations are structured around four primary risk domains: cybersecurity, chemical and biological misuse, autonomous AI behavior, and safeguard robustness.
Within each domain, the institute tests three dimensions of model behavior. Compliance measures whether the model agrees to assist with requests that fall within the risk category, rather than refusing them. Correctness assesses the accuracy and practical utility of any assistance the model provides. Completion measures how successfully the model can carry a potentially harmful task through to a functional result.
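As a rough illustration of how these three dimensions might be reduced to headline rates from a batch of graded evaluation records, the sketch below uses an invented record format and simple counting; it does not reproduce the institute's actual scoring pipeline.

```python
from dataclasses import dataclass


@dataclass
class EvalRecord:
    """One graded model response to a risk-domain request (illustrative format only)."""
    complied: bool            # did the model agree to assist rather than refuse?
    assistance_correct: bool  # was the assistance it gave accurate and practically useful?
    task_completed: bool      # did it carry the task through to a functional end result?


def summarise(records: list[EvalRecord]) -> dict[str, float]:
    """Reduce a batch of graded records to compliance, correctness, and completion rates."""
    n = len(records)
    complied = [r for r in records if r.complied]
    return {
        # Compliance: share of risky requests the model agreed to assist with.
        "compliance": len(complied) / n,
        # Correctness: among complied requests, share with accurate, usable assistance.
        "correctness": (sum(r.assistance_correct for r in complied) / len(complied))
        if complied else 0.0,
        # Completion: share of all requests carried through to a working result.
        "completion": sum(r.task_completed for r in records) / n,
    }
```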
Evaluators use both automated methods and expert human review. Automated evaluation allows systematic testing at scale across large numbers of prompts and scenarios. Human expert validation is used for more complex cases -- particularly in domains like cybersecurity and biology -- where assessing whether model output actually provides meaningful uplift to a malicious actor requires specialist judgment.
The institute also employs adversarial red-teaming to assess safeguards. Rather than accepting model refusals at face value, red-teamers attempt to circumvent the safeguard through jailbreak techniques, indirect prompting, or by exploiting gaps between what a model is trained to decline and what it can be induced to do with creative framing. A finding that a model's safety filter fails under adversarial conditions is treated as a more significant signal than raw compliance rates on straightforward harmful requests.
The institute's May 2024 advanced evaluations report, the first major published assessment covering multiple models tested before their release, found that the most capable models evaluated could solve more than half of capture-the-flag cybersecurity challenges designed for high school students but struggled with university-level challenges. In the life sciences domain, several models matched PhD-level performance on expert-written questions covering basic biology, advanced biological topics, and laboratory automation.
On autonomous capabilities, models evaluated in 2024 completed 20 to 40 percent of short-horizon software engineering tasks but failed all long-horizon tasks, hampered by syntax errors and insufficient self-testing. The institute characterized this as indicating a meaningful but not yet alarming level of autonomous capability.
All models evaluated in the 2024 report remained vulnerable to basic jailbreaks, with compliance rates on harmful requests reaching 28 percent even without adversarial prompting. The institute noted this as a significant baseline concern given that these same models would be deployed to millions of users.
In 2025, the institute published its inaugural Frontier AI Trends Report, drawing on two years of pre-deployment evaluations to characterize the trajectory of AI capability development across the six primary domains it monitors: chemistry and biology, cybersecurity, autonomous capabilities, safeguard robustness, loss-of-control risks, and societal impact.
The report's headline findings included a doubling of performance in some capability domains approximately every eight months. On cybersecurity benchmarks, models that completed apprentice-level tasks nine percent of the time in late 2023 were completing the same tasks fifty percent of the time by early 2025. In the life sciences domain, frontier models had surpassed the performance of human PhD experts on open-ended biology questions, and were generating detailed experimental protocols tailored to the recipient's level of knowledge.
Safeguard analysis in the Trends Report found that universal jailbreaks -- techniques that reliably induce a system to comply with harmful requests across many request types -- existed for every system the institute had evaluated. However, the report also found that the amount of expert time required to discover effective jailbreaks was increasing for certain models and request categories, indicating that safeguard robustness was improving even if absolute robustness remained elusive.
On autonomous AI and loss-of-control risks, the report documented that self-replication success rates for models attempting to copy themselves across compute environments had increased from approximately five percent in 2023 to sixty percent by 2025. The institute reported no evidence of spontaneous sandbagging -- strategic underperformance by models in evaluation settings -- but noted it had developed specific detection methods for this behavior given its implications for the reliability of safety evaluations.
Among the institute's most significant published evaluations was its 2026 assessment of Anthropic's Claude Mythos Preview model. The evaluation focused on cybersecurity capabilities and found that Claude Mythos Preview was the first model tested that could complete expert-level capture-the-flag challenges seventy-three percent of the time. More significantly, it was the first model to complete "The Last Ones" simulation from start to finish in three out of ten attempts. The Last Ones is a 32-step simulation of a corporate network attack estimated to require twenty hours of effort by a human expert with more than a decade of experience.
The institute concluded that Claude Mythos Preview could autonomously attack small, weakly defended, and vulnerable enterprise systems when given network access. It recommended that organizations prioritize cybersecurity fundamentals such as security updates, multi-factor authentication, and robust access controls, while noting that its test environments lacked active defenders and real-world security tooling, meaning results might not directly extrapolate to hardened systems.
In May 2024, the institute released Inspect, an open-source Python framework for conducting large language model evaluations. The release made public the core software infrastructure underlying the institute's own evaluation practice and positioned the institute as a contributor to the broader ecosystem of AI evaluation tooling.
Inspect is installable via a single pip command and is released under the MIT license, making it freely available for commercial and non-commercial use. The framework supports evaluations covering reasoning, coding, knowledge assessment, agentic task completion, behavior analysis, and multi-modal understanding. It includes a collection of more than 200 pre-built evaluations that can be run against any supported model.
The framework provides a web-based visualization tool called Inspect View for monitoring and reviewing evaluation runs, as well as a VS Code extension for authoring and debugging evaluations. It supports flexible tool-calling capabilities including custom tools, MCP tools, and built-in utilities for bash execution, code interpretation, web search, and web browsing.
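To give a concrete sense of what an evaluation written with Inspect can look like, the following is a minimal sketch of a single-sample knowledge task. The task name and sample content are invented for illustration, and exact APIs may differ between framework versions.

```python
# Installable via a single pip command: pip install inspect-ai
from inspect_ai import Task, task
from inspect_ai.dataset import Sample
from inspect_ai.scorer import includes
from inspect_ai.solver import generate, system_message


@task
def capital_cities():
    """A toy single-sample knowledge evaluation (illustrative only)."""
    return Task(
        dataset=[
            Sample(
                input="What is the capital of France? Answer with the city name only.",
                target="Paris",
            )
        ],
        solver=[system_message("You are a concise assistant."), generate()],
        scorer=includes(),  # pass if the target string appears in the model output
    )

# Run against any supported model from the command line, for example:
#   inspect eval capital_cities.py --model openai/gpt-4o
# and review the resulting logs in the browser with:
#   inspect view
```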
The institute subsequently expanded the Inspect ecosystem with InspectSandbox, a secure execution environment for agentic evaluations; InspectCyber, a specialized collection of cybersecurity evaluation tasks; and ControlArena, a library specifically designed for AI control experiments.
ControlArena, released in 2025, is an open-source research platform for studying whether safety protocols can prevent advanced AI models from taking actions that circumvent human oversight. It provides standardized environments in which researchers can test specific control techniques against AI systems modeled on frontier developer infrastructure.
ControlArena's default scenarios test four threat models relevant to the institute's alignment research: weight exfiltration (whether an AI model can copy its own parameters to an external location), secret exfiltration (whether a model can leak confidential information), rogue internal deployment (whether a model can spin up new instances of itself), and sabotage of safety research (whether a model can subtly corrupt the evaluation process it is being subjected to).
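As a conceptual illustration only -- this is not the ControlArena API, and all names below are invented -- a minimal control-protocol experiment of this kind pairs an untrusted policy model with a trusted monitor that scores each proposed action for suspicion before it is allowed to execute.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Action:
    description: str  # e.g. a shell command or code diff proposed by the untrusted model


def run_control_protocol(
    untrusted_policy: Callable[[str], Action],   # model being studied (potentially adversarial)
    trusted_monitor: Callable[[Action], float],  # weaker, trusted model returning a suspicion score
    tasks: list[str],
    audit_threshold: float = 0.8,
) -> dict[str, int]:
    """Count how many proposed actions are executed versus escalated for audit.

    In a weight-exfiltration scenario, for instance, the untrusted policy might try to
    hide a parameter-copying step inside an otherwise legitimate action sequence, and
    the experiment measures whether the monitor flags it before execution.
    """
    outcomes = {"executed": 0, "audited": 0}
    for item in tasks:
        action = untrusted_policy(item)
        suspicion = trusted_monitor(action)
        if suspicion >= audit_threshold:
            outcomes["audited"] += 1   # escalate to human review instead of executing
        else:
            outcomes["executed"] += 1  # allow the action in the sandboxed environment
    return outcomes
```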
The platform has been used by researchers at Anthropic, Google DeepMind, Redwood Research, and other organizations. Its release reflects the institute's broader strategy of producing public goods for the AI safety research community rather than keeping evaluation infrastructure proprietary.
The institute organizes its research program into two broad categories: risk research, which characterizes the severity and trajectory of AI-related threats, and solutions research, which develops and tests approaches to mitigating those threats.
Domain-specific risk research covers six areas: cyber misuse, dual-use science (particularly in chemistry and biology), criminal misuse (including AI-enabled fraud and child sexual abuse material), autonomous AI systems (including loss-of-control and unintended harmful action), societal resilience (risks from widespread AI deployment affecting critical infrastructure or social cohesion), and human influence and manipulation.
In parallel, the institute conducts generalized research on the science of evaluations -- developing more rigorous statistical and methodological frameworks for measuring AI capabilities -- and on capability elicitation, the study of how to accurately surface the true performance ceiling of a model rather than its typical behavior under standard prompting.
Safeguard analysis examines whether defensive measures deployed by AI developers actually hold up under adversarial conditions. The institute uses adversarial machine learning techniques to probe whether model safeguards can be circumvented and characterizes the difficulty of finding effective jailbreaks as a proxy for safeguard robustness.
Control research develops protocols for preventing AI models from taking actions that circumvent human oversight even in the context of advanced AI systems that may have strategic capabilities. The institute uses mock environments replicating frontier developer infrastructure to test whether proposed control protocols would succeed against capable adversarial models.
Alignment research pursues approaches to ensuring AI systems behave in accordance with human values and intentions, including work on detecting deceptive AI behavior and establishing formal guarantees around model honesty. The institute's £15 million Alignment Project, launched in 2025, is one of the largest dedicated alignment research efforts globally.
The institute also conducts empirical persuasion research. In 2025, it published a large-scale study involving more than 76,000 participants examining the mechanisms through which AI-generated content influences human attitudes, a study the institute framed as groundwork for understanding AI-driven manipulation risks at societal scale.
On 14 February 2025, Technology Secretary Peter Kyle announced at the Munich Security Conference that the UK AI Safety Institute would be renamed the UK AI Security Institute. The announcement came three days after the AI Action Summit in Paris, which had convened under French presidency to advance the post-Bletchley agenda on AI governance.
Kyle framed the renaming as an alignment of the institute's title with its actual activities. He described the change as reflecting the government's view that the most pressing AI risks were those with direct security implications: cyber attacks, the potential for AI to assist in the development of chemical or biological weapons, criminal misuse including fraud and child sexual exploitation material, and risks to critical national infrastructure.
The renaming was accompanied by structural changes. A new criminal misuse team was created within the institute, partnering with the Home Office to research AI-enabled crime. The institute was tasked with playing a leading role in testing AI models from any developer, open or closed, that posed security-relevant risks. A new agreement with Anthropic was announced, described as part of a broader Sovereign AI initiative.
The February 2025 announcement explicitly stated that the institute would no longer prioritize work on AI ethical issues such as algorithmic bias or freedom of expression in AI applications, marking a deliberate narrowing of scope compared to the institute's original mandate.
The renaming generated significant commentary from AI governance researchers and civil society organizations.
The AI Now Institute issued a statement warning that the shift from safety to security framing risked applying "piecemeal or superficial scrutiny" to AI systems before they were ready for deployment in defense and national security applications. AI Now argued that frontier AI models carry cyber vulnerabilities that themselves pose national security risks, and that independent safety evaluation must be insulated from industry partnerships to retain credibility.
Other observers interpreted the renaming more pragmatically. Some analysts noted that the institute's actual evaluation work had always included substantial security-relevant content -- its cybersecurity evaluations, CBRN misuse assessments, and safeguard robustness testing all had direct security applications -- and that the name change formalized an existing emphasis rather than fundamentally redirecting the organization.
The Institute for AI Policy and Strategy (IAPS) noted in a broader analysis of AI safety institutes that the renaming marked "a pivot between two competing visions for AI governance: one that emphasizes long-term risk mitigation and public accountability, and the other prioritizing innovation, speed, and global competitiveness."
Parliamentary debate on the AI Security Institute took place on 24 February 2025, with members questioning whether the narrowed remit would leave important categories of AI risk -- particularly those relating to bias, transparency, and impacts on democratic processes -- without a dedicated institutional home.
The institute's ability to conduct pre-deployment evaluations depends on voluntary agreements with AI developers, since the UK has not yet enacted legislation requiring mandatory pre-deployment testing.
Anthropic. The AISI and Anthropic have maintained one of the most extensively documented evaluation partnerships. This relationship produced the institute's assessments of Claude 3.5 Sonnet, which was shared with UK AISI and METR before public release, and the 2026 Claude Mythos Preview evaluation. Anthropic and the AISI also collaborated on biosecurity red-teaming that identified dozens of vulnerabilities including new universal jailbreak paths, and on the largest backdoor data poisoning study conducted to date. A formal MOU between the two organizations was announced alongside the February 2025 renaming.
OpenAI. The UK AISI and the US AI Safety Institute (housed at NIST) conducted a joint pre-deployment evaluation of OpenAI's o1 model, the first major joint evaluation between the two institutes. This reflected both the maturation of the bilateral US-UK AI safety relationship and the specific interest both governments had in the capabilities of advanced reasoning models.
Google DeepMind. Google DeepMind has partnered with the AISI since the institute's inception in November 2023. The partnership deepened through a formal research partnership agreement announced in 2025, with DeepMind also being an active user of ControlArena for internal control research.
Cohere and others. The institute has signed MOUs with Cohere and has conducted evaluations of models from additional developers under terms that allow the institute to publish findings but typically not to publicly identify the specific model or developer without agreement.
In early 2025, the institute partnered with Gray Swan AI to run the UK AISI Agent Red-Teaming Challenge, described at the time as the largest public evaluation of agentic large language model safety conducted to date. The challenge ran from 8 March to 6 April 2025 and invited external researchers to attempt to find safety failures in deployed agentic AI systems.
Participants made approximately 1.8 million attempts to break agent behaviors, resulting in 62,000 successful exploits across 22 different large language models targeting 44 specific categories of harmful agentic behavior. The results provided the institute with a large dataset on the types of agent safety failures that adversarial users could reliably produce and informed subsequent research on agentic AI safeguards.
The UK AI Safety Institute was the first national government body of its kind when it launched in November 2023. Within eighteen months, analogous institutions had been established or were being established across more than a dozen countries, and the UK institute had taken a central role in coordinating their activities.
US AI Safety Institute. The United States established its AI Safety Institute within the National Institute of Standards and Technology (NIST) in early 2024. The US AISI drew on NIST's existing technical expertise in standards and measurement, positioning it as more oriented toward standards-setting than the UK institute's primarily evaluation-focused mandate. In April 2024, the UK and US institutes concluded a formal agreement to collaborate on at least one joint safety evaluation per year, with the joint evaluation of OpenAI's o1 model being the first deliverable under this agreement.
Japan AI Safety Institute. Japan established its AI Safety Institute in February 2024 under the Information Technology Promotion Agency, with a staff of approximately 23 people.
Singapore AISI. In May 2024, Singapore renamed its Digital Trust Centre as the Singapore AI Safety Institute, placing it under Nanyang Technological University with S$10 million in annual funding.
South Korea. South Korea announced in May 2024 that it would create an AI safety institute under the Electronics and Telecommunications Research Institute.
EU AI Office. The European Union's AI Office, founded in May 2024 as part of the broader AI Act implementation framework, is a member of the international network. Unlike the national institutes, the EU AI Office has regulatory powers: it can request information from frontier model providers and apply sanctions to enforce the AI Act's requirements. This regulatory authority gives the EU AI Office a different character from the purely advisory and research-oriented national institutes.
At the AI Seoul Summit in May 2024, world leaders from Australia, Canada, the European Union, France, Germany, Italy, Japan, the Republic of Korea, Singapore, the United Kingdom, and the United States signed the Seoul Statement of Intent toward International Cooperation on AI Safety Science, formally launching the International Network of AI Safety Institutes.
The network's stated objectives include accelerating the science of AI safety, promoting complementarity and interoperability among national evaluation methodologies, and fostering a common international understanding of AI safety approaches. The UK AISI has been a founding member and central contributor to the network, providing evaluation methodology, the open-source Inspect framework, and bilateral research agreements as key inputs to the collective effort.
The network held its first in-person convening in San Francisco in November 2024, attended by representatives from Australia, Canada, the European Commission, France, Japan, Kenya, South Korea, Singapore, the UK, and the United States. The meeting produced the first joint testing exercise in the network's history, led by technical experts from the US, UK, and Singapore institutes.
In 2025, the AISI spearheaded the establishment of what it named the International Network for Advanced AI Measurement, Evaluation and Science (INAMES), an expanded coordination structure intended to provide more formal infrastructure for joint research and evaluation across member institutes.
| Organization | Country | Type | Regulatory power | Focus |
|---|---|---|---|---|
| UK AI Security Institute | United Kingdom | Government body | None | Safety and security evaluation, research |
| US AI Safety Institute (NIST) | United States | Government body | None | Evaluation, standards, measurement |
| EU AI Office | European Union | Regulatory body | Yes (AI Act) | Compliance, frontier model oversight |
| Japan AI Safety Institute | Japan | Government body | None | Evaluation, standards |
| Singapore AISI | Singapore | Academic/government | None | Evaluation, research |
| Apollo Research | Independent | Non-governmental | None | Alignment research, evaluations |
| METR | Independent | Non-governmental | None | Autonomous AI evaluation |
| Frontier Model Forum | Industry | Industry body | None | Industry-led safety standards |
The UK institute is distinguished from purely academic or civil society evaluation bodies by its governmental status, which gives it privileged access to security and intelligence information and official standing in international diplomacy. It is distinguished from the EU AI Office by the absence of regulatory powers: the institute cannot compel AI developers to participate in evaluations or to modify their systems in response to findings. It is distinguished from the US NIST institute by its more operational, startup-like character and its heavier emphasis on producing novel research rather than codifying standards.
Independent evaluation organizations such as Apollo Research and METR operate in adjacent spaces. METR (formerly ARC Evals) specializes in evaluating autonomous AI capabilities and has conducted pre-deployment evaluations of models from Anthropic and others, sometimes in coordination with the AISI. Apollo Research focuses on the question of whether AI systems exhibit deceptive or manipulative behaviors toward humans, a research question that overlaps substantially with the AISI's alignment and control research agenda.
The institute has received broadly favorable assessments from AI governance researchers who view independent governmental evaluation capability as a necessary component of responsible AI development at scale. The founding of the AISI was widely credited with shifting the international debate from whether governments should evaluate frontier AI systems to how such evaluations should be conducted.
The voluntary nature of the evaluation agreements has been a persistent subject of scrutiny. Critics have noted that AI developers retain significant control over what they share with the institute, when they share it, and how evaluation findings are communicated publicly. The institute's published evaluation reports do not name specific models, a practice that some observers argue limits the public accountability value of the findings even as it enables the cooperative relationships that make the evaluations possible.
The non-binding character of the institute's guidance has also been noted. Evaluation findings inform but do not legally constrain AI developers. A company that receives an evaluation indicating significant safeguard vulnerabilities is not required to delay release or implement specific mitigations. The institute's influence operates through reputational pressure, government policy channels, and the voluntary commitments developers have made to participate in good faith.
Methodological standardization is another open question. The IAPS analysis of first-wave AI safety institutes noted that evaluation methods are not standardized across the institute's own evaluations or across the international network, making it difficult to compare findings over time or across organizations. The Inspect framework represents a partial response to this concern -- by open-sourcing its evaluation infrastructure, the institute has made it easier for others to use similar methodologies -- but does not resolve the deeper question of what constitutes a definitive assessment that a given model is or is not safe.
The February 2025 renaming raised concerns among some researchers that the AISI's narrowed focus on security-oriented risks would leave other important categories of AI harm -- including algorithmic bias, labor displacement, democratic manipulation, and risks to freedom of expression -- without a dedicated governmental evaluation body in the UK.
The institute's 2025 year-in-review highlighted the scale of its research activities: 30-plus model evaluations, a large-scale persuasion study with more than 76,000 participants, a major agent red-teaming challenge identifying 62,000 vulnerabilities, and the publication of the Frontier AI Trends Report drawing on the institute's full two-year evaluation dataset. These activities have established the AISI as the most productive government AI evaluation body by publication volume and operational scale.
The institute operates with annual funding of approximately £66 million from the Department for Science, Innovation and Technology. This recurrent allocation follows the initial £100 million lump-sum commitment made to the Frontier AI Taskforce in April 2023, which covered a broader initial build-out period.
Beyond direct budget allocation, the institute has access to more than £1.5 billion in high-performance computing resources through the national AI Research Resource, including priority access to the Isambard-AI supercomputer. This compute access enables the institute to conduct its own model training and evaluation experiments at scales that would otherwise be cost-prohibitive for a government research body.
The institute has distributed more than £15 million in external research grants through the Alignment Project and an additional £8 million through Systemic Safety Grants and a £5 million Challenge Fund. These grant programs are designed to fund alignment and safety research at universities and independent research organizations that the institute considers strategically important but that fall outside its own core operational priorities.