Compute governance
Last reviewed
May 24, 2026
Sources
No citations yet
Review status
Needs citations
Revision
v1 · 4,601 words
Compute governance is a policy framework that uses regulation of the computational resources used to train and run artificial intelligence systems as a primary lever for governing advanced AI. The framework treats compute (the hardware accelerators such as graphics processing units, tensor processing units, and specialized AI chips, along with the data centers and cloud services that aggregate them) as the most tractable of the three canonical AI inputs alongside data and algorithms.[^1] Proponents argue that compute is uniquely governable because it is physically traceable, produced through an extremely concentrated supply chain, and quantifiable in floating-point operations (FLOP), allowing policymakers to use it as a proxy for AI capability and risk.[^1] Major instruments in the contemporary compute governance toolkit include United States export controls on advanced chips and chipmaking equipment, training-compute reporting thresholds adopted in the United States and the European Union, validated end-user programs that license overseas data centers, and proposed hardware-enabled governance mechanisms that would build verification primitives directly into AI accelerators.[^1][^2][^3][^4]
The conceptual basis for compute governance grew out of observations about the role of computational resources in driving the capabilities of modern machine learning systems. Empirical scaling work in the late 2010s and early 2020s, including the Kaplan et al. scaling laws and follow-on Chinchilla scaling analyses, established that loss decreases predictably with compute, parameters, and data, which in turn produced a policy-relevant heuristic: more compute generally yields more capable models.[^1] As training runs grew by roughly an order of magnitude every twelve to eighteen months across the 2018 to 2024 period, the largest AI laboratories converged on a small number of chip suppliers and a small number of cloud and on-premises operators, creating a supply chain with identifiable chokepoints.[^1][^2]
The canonical academic framing of compute governance is the paper "Computing Power and the Governance of Artificial Intelligence", posted to arXiv in February 2024 by Girish Sastry, Lennart Heim, Haydn Belfield, Markus Anderljung, Miles Brundage, and fourteen co-authors drawn from the Centre for the Governance of AI (GovAI), OpenAI, the Leverhulme Centre for the Future of Intelligence, the University of Cambridge, and other institutions.[^1] The paper argues that compute is "detectable, excludable, and quantifiable, and is produced via an extremely concentrated supply chain", which the authors contrast with data (which is non-rival and can be copied without detection) and algorithms (which are research outputs that can be published or independently reinvented).[^1] The same paper identifies three policy capacities that compute governance can provide: visibility (understanding which actors control sufficient compute to build advanced systems), allocation (steering compute toward beneficial uses or actors), and enforcement (restricting compute to prevent specified categories of development or deployment).[^1]
A related line of work by Lennart Heim at GovAI and later at RAND emphasizes three properties that make compute particularly amenable to regulation: rivalry (a chip used in one data center cannot be used elsewhere at the same time), excludability (chip access can be controlled by licensing or physical interdiction), and quantifiability (training runs can be measured in FLOP and chips can be counted as discrete physical objects).[^5] Heim's 2023 Carnegie talk and subsequent writings frame compute governance as a "knowledge, shaping, enforcement" stack that can be applied to actors ranging from individual developers to states.[^5]
Compute governance research typically begins with the proposition that AI development requires three inputs: data, algorithms, and compute.[^1] Each input is governable in principle, but each has distinct properties that affect how readily it can be regulated.
Data is the most diffuse of the three inputs. Modern frontier models train on tens of terabytes of text and other modalities drawn from publicly accessible web archives such as Common Crawl, licensed corpora, and synthetic data. Restricting one source rarely prevents training because substitutes exist. Algorithms are similarly hard to bottleneck: a research result published on arXiv can be implemented in any sufficiently equipped laboratory, and reverse-engineering or independent rediscovery is common. Compute, by contrast, depends on a multi-layered supply chain in which a small number of firms supply chip designs (Nvidia, AMD, Google's tensor processing unit team, Amazon's Trainium and Inferentia chips, and a handful of competitors), Taiwan Semiconductor Manufacturing Company (TSMC) fabricates the leading-edge nodes, ASML produces the extreme-ultraviolet lithography machines required for those nodes, and the resulting accelerators are installed in data centers owned by a manageable number of cloud providers and AI laboratories.[^1][^2]
The paper "Computing Power and the Governance of Artificial Intelligence" notes that this supply-chain concentration is unusual for a major economic input and creates natural points at which governments can apply pressure.[^1] Subsequent work has explored how each layer (chip design, fabrication, lithography, assembly, networking, data center operation) supports different governance interventions.[^1][^5]
Compute governance is implemented through a portfolio of policy instruments rather than a single mechanism. The principal instruments adopted or proposed by 2026 fall into four categories: export controls, training-compute reporting thresholds, licensing and audit regimes for cloud providers and data centers, and hardware-enabled governance mechanisms.
The most consequential compute governance instrument in practice has been export controls on advanced computing chips and chipmaking equipment. The United States Bureau of Industry and Security (BIS) issued an interim final rule on October 7, 2022 titled "Implementation of Additional Export Controls: Certain Advanced Computing and Semiconductor Manufacturing Items; Supercomputer and Semiconductor End Use; Entity List Modification", which added high-performance computing chips and computer commodities containing them to the Commerce Control List and imposed new license requirements for items destined for supercomputer or semiconductor production end uses in the People's Republic of China.[^6] The rule introduced Export Control Classification Number 3A090 to capture advanced computing integrated circuits and used performance criteria (initially a combination of processing performance and interconnect bandwidth) to delineate which chips required a license.[^6]
The 2022 rule was widely understood to target Nvidia's data-center GPUs, including the A100 and H100, along with their AMD counterparts. In response, Nvidia introduced China-specific variants (the A800 and H800) that reduced interconnect bandwidth to fall just below the 2022 thresholds while preserving most of the raw compute performance.[^7] BIS responded on October 17, 2023 with a follow-on rule that restructured ECCN 3A090 to add a performance-density parameter and tightened the rules so that the A800 and H800 fell within the scope of the controls.[^7][^8] The 2023 update also expanded restrictions on semiconductor manufacturing equipment and added additional countries to the licensing perimeter.[^8]
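The two-criterion structure of the 2022 rule, and the reason a bandwidth-reduced variant could fall outside it, can be illustrated with a minimal sketch. The threshold values below (a processing-performance score of 4800, computed as tera-operations per second multiplied by operand bit length, and an aggregate bidirectional transfer rate of 600 GB/s) and the chip figures are assumptions taken from commonly reported summaries and public spec sheets, not from the text above, and the function names are hypothetical. The sketch is not a substitute for classification under the rule itself.

```python
from dataclasses import dataclass

# Assumed 2022-era ECCN 3A090 criteria; verify against the rule text before relying on them.
TPP_THRESHOLD = 4800            # processing performance: dense TOPS x operand bit length
INTERCONNECT_THRESHOLD = 600.0  # aggregate bidirectional transfer rate, GB/s


@dataclass(frozen=True)
class Chip:
    name: str
    dense_tops: float         # dense tera-operations per second at the given precision
    bit_length: int           # operand width in bits for that precision
    interconnect_gb_s: float  # chip-to-chip bandwidth in GB/s


def processing_performance(chip: Chip) -> float:
    """Score used by the sketch: throughput multiplied by operand bit length."""
    return chip.dense_tops * chip.bit_length


def needs_license_2022(chip: Chip) -> bool:
    """Both criteria must be met, so lowering bandwidth alone escapes the test."""
    return (processing_performance(chip) >= TPP_THRESHOLD
            and chip.interconnect_gb_s >= INTERCONNECT_THRESHOLD)


# Approximate public figures (assumptions): same compute, different interconnect.
a100 = Chip("A100", dense_tops=312, bit_length=16, interconnect_gb_s=600)
a800 = Chip("A800", dense_tops=312, bit_length=16, interconnect_gb_s=400)
```

Under these assumed figures, `needs_license_2022(a100)` evaluates to `True` and `needs_license_2022(a800)` to `False`, even though both chips have the same processing-performance score. A test that also considers performance density, as the 2023 rule does, removes the option of escaping through the interconnect criterion alone.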
On September 30, 2024, BIS expanded the Validated End User (VEU) authorization in 15 CFR 748.15 to create a Data Center VEU pathway, allowing pre-approved overseas data centers to receive controlled items without case-by-case licensing after a multi-agency review by the Departments of Commerce, Defense, Energy, and State.[^9] The program was designed to facilitate AI development at trusted facilities while preserving licensing controls over destinations in Country Group D:5.[^9]
On January 15, 2025, BIS issued the "Framework for Artificial Intelligence Diffusion", commonly called the AI Diffusion rule, which established a three-tiered structure for managing AI chip exports.[^10] Tier 1 comprised the United States and eighteen close allies (Australia, Belgium, Canada, Denmark, Finland, France, Germany, Ireland, Italy, Japan, the Netherlands, New Zealand, Norway, the Republic of Korea, Spain, Sweden, Taiwan, and the United Kingdom) and faced minimal additional restrictions.[^10] Tier 2 covered most of the rest of the world, with country-level caps (50,000 H100-equivalents through 2027 on a first-come basis) and per-company allotments (1,700 H100-equivalents per receiving company per year, not counting against country caps) administered through the Data Center VEU program.[^10] Tier 3 covered embargoed destinations and continued to face the strictest controls.[^10]
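The tier structure described above amounts to a simple lookup, sketched here for illustration only. The Tier 1 membership and the two numeric limits are copied from the description above; the function name is hypothetical, and because the embargoed destinations are not enumerated here, the caller supplies that set.

```python
# Tier 1 as listed above: the United States plus eighteen close allies.
TIER_1 = frozenset({
    "United States", "Australia", "Belgium", "Canada", "Denmark", "Finland",
    "France", "Germany", "Ireland", "Italy", "Japan", "Netherlands",
    "New Zealand", "Norway", "Republic of Korea", "Spain", "Sweden",
    "Taiwan", "United Kingdom",
})

# Limits quoted above for Tier 2 destinations, in H100-equivalents.
TIER_2_COUNTRY_CAP = 50_000        # per country, through 2027
TIER_2_PER_COMPANY_PER_YEAR = 1_700  # per receiving company, outside the country cap


def diffusion_tier(country: str, embargoed: set[str]) -> int:
    """Return 1, 2, or 3 for a destination.

    `embargoed` is supplied by the caller (an assumption of this sketch),
    since the Tier 3 list is not spelled out in the text.
    """
    if country in TIER_1:
        return 1
    if country in embargoed:
        return 3
    return 2  # "most of the rest of the world"
```

The sketch treats Tier 2 as the default, which mirrors the rule's design as described: a destination was capped unless it was explicitly listed as a close ally or as embargoed.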
The AI Diffusion rule was rescinded by the Trump administration's Department of Commerce on May 13, 2025, on the grounds that the framework was excessively burdensome and risked undermining US AI leadership.[^11] BIS announced that it would draft replacement rules while continuing to enforce the underlying 2022 and 2023 advanced computing controls.[^11]
The second principal lever is the use of training compute, measured in FLOP, as a regulatory tripwire that triggers reporting, safety testing, or licensing obligations. The most influential reporting threshold in the United States was established by Executive Order 14110 on the "Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence", signed by President Joe Biden on October 30, 2023.[^12] Section 4.2 of the executive order directed the Secretary of Commerce, acting through the Bureau of Industry and Security, to require developers of dual-use foundation models to report planned training runs, red-team test results, and information about the physical and cybersecurity protections of their model weights.[^12] The initial trigger was a training run using more than 10^26 integer or floating-point operations, with a separate threshold of 10^23 operations for models trained primarily on biological sequence data, and a separate computing-cluster reporting threshold for facilities with more than 10^20 FLOP/s of theoretical compute interconnected at 100 Gbit/s or higher.[^12] A Biden administration official described the 10^26 FLOP threshold as intentionally set above the scale of then-current models so that "current models wouldn't be captured but the next generation state-of-the-art models likely would".[^13]
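The three triggers described above can be written as a pair of checks, shown here as a minimal sketch. The numeric thresholds come from the description above; the function and parameter names are hypothetical, and the networking comparison follows this article's "100 Gbit/s or higher" wording rather than the order's exact legal text.

```python
# Thresholds as described above for Executive Order 14110, Section 4.2.
MODEL_OPS_THRESHOLD = 1e26       # total training operations, general models
BIO_MODEL_OPS_THRESHOLD = 1e23   # models trained primarily on biological sequence data
CLUSTER_FLOP_PER_S_THRESHOLD = 1e20  # theoretical peak compute of a cluster
CLUSTER_NETWORK_GBIT_S = 100.0   # interconnect speed named in the cluster trigger


def model_reporting_triggered(training_ops: float,
                              primarily_biological: bool = False) -> bool:
    """True when a training run exceeds the applicable operations threshold."""
    threshold = BIO_MODEL_OPS_THRESHOLD if primarily_biological else MODEL_OPS_THRESHOLD
    return training_ops > threshold


def cluster_reporting_triggered(peak_flop_per_s: float,
                                interconnect_gbit_s: float) -> bool:
    """True when a cluster exceeds the compute threshold and meets the networking one."""
    return (peak_flop_per_s > CLUSTER_FLOP_PER_S_THRESHOLD
            and interconnect_gbit_s >= CLUSTER_NETWORK_GBIT_S)
```

A run of 2 × 10^25 operations would not trigger the general-model check, while a run of 5 × 10^23 operations on primarily biological data would trigger the lower one. The same compute figure can therefore carry different obligations depending on the training data.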
President Donald Trump rescinded EO 14110 on January 20, 2025 as part of an "Initial Rescissions of Harmful Executive Orders and Actions" order issued on his first day in office.[^14] On January 23, 2025, Trump issued Executive Order 14179, "Removing Barriers to American Leadership in Artificial Intelligence", which directed agencies to develop an Artificial Intelligence Action Plan within 180 days and to suspend or revise actions from EO 14110 inconsistent with the new policy of "sustaining and enhancing America's global AI dominance".[^14][^15]
The European Union adopted a different threshold in the EU AI Act. Article 51 of the Act presumes that a general-purpose AI (GPAI) model has "high-impact capabilities" and therefore poses systemic risk when the cumulative amount of computation used for its training exceeds 10^25 FLOP, an order of magnitude below the original US threshold.[^16] Providers must notify the European Commission within two weeks of "reasonably foreseeing" or reaching the threshold. A model classified as GPAI with systemic risk faces enhanced obligations, including adversarial testing, incident reporting to the Commission's AI Office, and model evaluation against state-of-the-art benchmarks.[^16] The European Commission is empowered to update the threshold by delegated act as the technology evolves.[^16]
In the United States, California Senate Bill 1047 (the "Safe and Secure Innovation for Frontier Artificial Intelligence Models Act") would have applied to "covered models" trained using greater than 10^26 integer or floating-point operations at a training cost exceeding $100 million, along with fine-tuned models that used at least $10 million of additional compute. Governor Gavin Newsom vetoed SB 1047 on September 29, 2024, citing concerns that the bill applied "stringent standards to even the most basic functions" and that compute-based scoping "could create a false sense of security" by overlooking smaller, deployment-context-dependent risks.[^17] California's successor legislation, SB 53 (the "Transparency in Frontier Artificial Intelligence Act"), was signed by Newsom on September 29, 2025 and took effect on January 1, 2026.[^18] SB 53 retains a 10^26 FLOP threshold for the definition of a "frontier model" but imposes lighter transparency and reporting obligations than SB 1047, with heavier disclosure duties reserved for "large frontier developers" whose annual revenue exceeds $500 million.[^18]
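The EU and California definitions described above combine a compute threshold with different secondary criteria. The following sketch encodes them side by side, using only the figures given above; the function names and return labels are hypothetical, and the sketch omits the fine-tuning provision of SB 1047 and every statutory nuance beyond the headline numbers.

```python
EU_SYSTEMIC_RISK_FLOP = 1e25       # EU AI Act, Article 51 presumption
SB1047_OPS = 1e26                  # vetoed California bill: operations threshold
SB1047_TRAINING_COST_USD = 100e6   # vetoed California bill: cost threshold
SB53_FRONTIER_FLOP = 1e26          # SB 53 "frontier model" definition
SB53_LARGE_DEVELOPER_REVENUE_USD = 500e6


def eu_systemic_risk_presumed(training_flop: float) -> bool:
    """Presumption of high-impact capabilities based on cumulative training compute."""
    return training_flop > EU_SYSTEMIC_RISK_FLOP


def sb1047_covered_model(training_ops: float, training_cost_usd: float) -> bool:
    """SB 1047 required both the compute and the cost criteria to be met."""
    return training_ops > SB1047_OPS and training_cost_usd > SB1047_TRAINING_COST_USD


def sb53_classification(training_flop: float, developer_annual_revenue_usd: float) -> str:
    """Label a model and its developer under the SB 53 figures quoted above."""
    if training_flop <= SB53_FRONTIER_FLOP:
        return "not a frontier model"
    if developer_annual_revenue_usd > SB53_LARGE_DEVELOPER_REVENUE_USD:
        return "frontier model, large frontier developer"
    return "frontier model"
```

A hypothetical training run of 5 × 10^25 FLOP illustrates the gap between the regimes: it is presumed to pose systemic risk under the EU threshold but falls below both California definitions.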
Compute governance also extends to the cloud and Infrastructure-as-a-Service (IaaS) layer through proposed "know your customer" (KYC) requirements. On January 29, 2024, BIS issued a Notice of Proposed Rulemaking, pursuant to Executive Order 13984 and EO 14110, that would require US IaaS providers to maintain a Customer Identification Program for foreign customers and to report transactions that could result in the training of "a large AI model with potential capabilities that could be used in malicious cyber-enabled activity".[^19] The proposed rule would also empower the Commerce Department to prohibit specified IaaS providers from offering services to designated foreign actors.[^19] As of the rescission of EO 14110 in January 2025, the rule had not been finalized.[^14][^19]
Independent of mandatory regimes, several frontier AI developers adopted voluntary, compute-informed governance frameworks. Anthropic's Responsible Scaling Policy and analogous policies from other laboratories use capability evaluations triggered by deployment milestones rather than pure compute thresholds, but they reference training-compute scale as one indicator that elevated AI Safety Levels may apply.[^20]
A more ambitious line of research seeks to embed governance primitives directly into AI accelerators. The Flexible Hardware-Enabled Guarantees (FlexHEG) program, described in a 2025 arXiv paper "Flexible Hardware-Enabled Guarantees for AI Compute" and an associated "Technical Options for Flexible Hardware-Enabled Guarantees" report, proposes a family of secure hardware mechanisms integrated into AI chips that would enable verifiable, privacy-preserving enforcement of rules governing compute usage.[^4][^21] The FlexHEG architecture has two principal components: an auditable Guarantee Processor that monitors accelerator usage and verifies compliance with specified rules, and a Secure Enclosure that protects against physical tampering.[^4] The system targets four properties: privacy-preserving compliance verification (so developers can demonstrate compliance without revealing sensitive information), flexibility (so verification rules can be updated as technology and governance needs evolve), trustworthiness (open-source and auditable to prevent backdoors), and security against physical and cryptographic circumvention.[^4]
Applications proposed for FlexHEG and related hardware-enabled governance mechanisms include privacy-preserving model evaluations, controlled deployment of high-risk model weights, compute limits for training runs, and automated enforcement of agreed safety protocols at the silicon level.[^4][^21] A January 2024 RAND working paper by Lennart Heim and colleagues, "Hardware-Enabled Governance Mechanisms: Developing Technical Solutions to Exempt Items Otherwise Classified Under Export Control Classification Numbers 3A090 and 4A090", explored how such mechanisms could serve as a basis for granular licensing inside otherwise controlled chip categories.[^5]
A more incremental form of on-chip governance has emerged from Nvidia and from academic research on location verification. In December 2025, Nvidia began piloting software that uses the attestation capabilities of its Blackwell-generation chips (B100, B200, and successors) to verify chip identity and to infer chip location from network-latency patterns rather than direct geolocation.[^22] The Institute for AI Policy and Strategy and other groups have proposed delay-based verification systems that confirm location via secure time-stamped signals as a means of enforcing export controls without requiring continuous network connectivity.[^22] These mechanisms are designed to deter and detect chip smuggling, including the diversion of controlled accelerators to embargoed destinations.[^22]
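The physical principle behind delay-based location verification can be sketched in a few lines: a signal cannot travel faster than light, so a measured round-trip time puts an upper bound on how far a responding chip can be from a trusted verifier. The sketch below is a generic illustration of that bound, not a description of Nvidia's software or of any specific proposal, and all function and parameter names are hypothetical.

```python
SPEED_OF_LIGHT_KM_PER_MS = 299.792458  # vacuum speed of light, km per millisecond


def max_distance_km(rtt_ms: float,
                    processing_delay_ms: float = 0.0,
                    velocity_factor: float = 1.0) -> float:
    """Upper bound on verifier-to-chip distance implied by a round-trip time.

    velocity_factor=1.0 uses the vacuum speed of light, giving the most
    conservative (largest) bound; a smaller factor models slower propagation
    in real links and is an assumption the caller must justify.
    """
    one_way_ms = max(rtt_ms - processing_delay_ms, 0.0) / 2.0
    return one_way_ms * SPEED_OF_LIGHT_KM_PER_MS * velocity_factor


def consistent_with_location(rtt_ms: float, claimed_distance_km: float) -> bool:
    """A claimed location is ruled out only if it lies beyond the physical bound."""
    return claimed_distance_km <= max_distance_km(rtt_ms)
```

The check is one-sided. A short round-trip time proves the chip is near the verifier, but a long one proves nothing, because delay can always be added. This is why such schemes rely on verifiers placed near the permitted locations and on attestation to confirm that the response comes from the chip itself.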
Compute governance frameworks pay particular attention to the structure of the AI hardware supply chain because each layer offers a different governance affordance. The dominant chip designer for AI training is Nvidia, whose data-center accelerators (the A100, H100, H200, B100, B200, and Blackwell generation more broadly) have powered most large training runs reported between 2020 and 2025.[^1] AMD competes with the MI300 and MI325 lines, while Google designs in-house tensor processing units (TPU v5p, Ironwood, and predecessors), Amazon develops Trainium and Inferentia, and Intel produces Gaudi accelerators. Fabrication of leading-edge chips is concentrated at TSMC, with Samsung and Intel Foundry as smaller competitors. ASML supplies the extreme-ultraviolet lithography machines required for nodes below 7 nanometers.[^1][^6]
The concentration of fabrication at TSMC and of lithography at ASML produces "natural chokepoints" that have been exploited by export-control regimes. The Netherlands and Japan adopted parallel export controls in 2023 that restricted ASML lithography and Tokyo Electron deposition tools, complementing the US BIS rules and bringing the principal chipmaking-equipment producers into a coordinated regime.[^8] Networking is a similar chokepoint: Nvidia's NVLink and InfiniBand interconnects, along with the optical networking used to scale AI clusters, have themselves been subject to controls because the per-chip bandwidth between accelerators determines whether a cluster behaves as one large machine or as a collection of smaller ones.[^7][^8]
A central empirical claim of compute governance is that training compute correlates closely enough with capability and risk that it can serve as a regulatory proxy. The argument draws on scaling-laws research showing predictable improvements in loss and downstream capabilities as training compute, parameters, and data are scaled together. Reports from frontier laboratories, model cards, and independent estimates by groups such as Epoch AI suggest that training compute for the largest models grew from roughly 10^23 FLOP for GPT-3 in 2020 to over 10^25 FLOP for GPT-4 by 2022 and to the high-10^25 to low-10^26 FLOP range for the most demanding 2024 to 2025 training runs.[^1]
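Training-compute figures of this kind are often estimated with the approximation from the scaling-laws literature that training a dense transformer costs about 6 FLOP per parameter per training token. The sketch below applies that rule of thumb to GPT-3's published configuration (about 175 billion parameters and 300 billion training tokens). The approximation and those two inputs are not drawn from the text above, and the function name is hypothetical.

```python
import math


def estimate_training_flop(n_parameters: float, n_tokens: float) -> float:
    """Rule-of-thumb estimate: about 6 FLOP per parameter per training token."""
    return 6.0 * n_parameters * n_tokens


def orders_of_magnitude_below(training_flop: float, threshold_flop: float) -> float:
    """How many factors of ten separate a training run from a threshold."""
    return math.log10(threshold_flop / training_flop)


# GPT-3 as publicly reported: ~175e9 parameters, ~300e9 training tokens.
gpt3_flop = estimate_training_flop(175e9, 300e9)
```

The estimate comes to about 3.15 × 10^23 FLOP, consistent with the order of magnitude cited above for GPT-3 and roughly two and a half orders of magnitude below a 10^26 threshold. The approximation ignores architecture details, so it is better suited to order-of-magnitude comparisons against a threshold than to precise accounting.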
The proxy claim is contested. Sara Hooker's July 2024 paper "On the Limitations of Compute Thresholds as a Governance Strategy" argues that hard-coded compute thresholds are "shortsighted and likely to fail to mitigate risk" because algorithmic efficiency gains (better data curation, instruction tuning, preference training, retrieval augmentation, tool use, longer context, and chain-of-thought reasoning) can yield large capability improvements with little or no additional training FLOP.[^23] Hooker draws an analogy to earlier US attempts to regulate exports of high-performance computers using Millions of Theoretical Operations per Second (MTOPS), an approach that became obsolete as architectures evolved past the metric.[^23] Hooker's recommended response is to shift from static thresholds to dynamic metrics that incorporate capability evaluations and to diversify the risk indicators used in governance.[^23]
A second critique concerns the difference between one-time reporting and continuous monitoring. EO 14110's reporting obligations applied at the time of a training run that crossed the threshold, but they did not require continuous evaluation as algorithmic improvements or post-training fine-tuning altered model capability. Critics have argued that this snapshot approach captures only a slice of model lifecycle risk.[^23] Defenders, including Heim and colleagues, have responded that compute thresholds are best understood as a "tripwire" mechanism that triggers further evaluation rather than as a complete risk signal in themselves.[^1][^5][^24]
A related line of work, including "Defending Compute Thresholds Against Legal Loopholes" (arXiv:2502.00003) and "Training Compute Thresholds: Features and Functions in AI Regulation" (arXiv:2405.10799), has sought to articulate principles for designing thresholds that combine compute, evaluation, and capability signals; that index to algorithmic-efficiency adjustments; and that are robust to circumvention through staged training or model decomposition.[^24]
The compute governance era as a coherent policy program is generally dated to the October 2022 BIS rule, although precursor instruments (Wassenaar Arrangement controls on dual-use technologies, prior MTOPS-based supercomputer export controls in the 1990s and 2000s) extend back several decades.[^6][^23] The principal events in the contemporary compute governance timeline include:
Compute governance has been the subject of an active debate among AI researchers, policy analysts, and civil-society groups. Supporters argue that compute is the most tractable lever available to states that want to slow or shape advanced AI development without requiring intrusive monitoring of research output or data flows.[^1][^5] The Centre for the Governance of AI (GovAI), RAND's Center on AI, Security, and Technology, the Center for Security and Emerging Technology (CSET) at Georgetown, the Institute for AI Policy and Strategy (IAPS), and Apollo Research have all produced substantive work supporting refined versions of compute-based governance.[^1][^5]
Critics have raised several lines of objection. The first concerns concentration of power: even if compute governance succeeds technically, it concentrates regulatory leverage in the hands of a few states and a few firms that operate the supply chain, with potential risks for competition policy and global equity.[^1][^23] The second concerns enforceability: as algorithmic efficiency improves, the ratio between compute and capability shifts, eroding the predictive power of any static FLOP threshold.[^23] The third concerns scope: compute thresholds capture frontier model training but miss capability gains achieved through inference-time compute, agent scaffolding, retrieval augmentation, or test-time reasoning that does not require additional training FLOP.[^23]
The DeepSeek episode in early 2025 sharpened these debates. The Chinese laboratory DeepSeek released V3 and R1 models that reportedly achieved competitive performance on reasoning benchmarks. The reported figures for training V3, the base model underlying R1, were approximately 2,788,000 GPU-hours on a cluster of about 2,048 Nvidia H800 accelerators, at a reported training cost of approximately $5.6 million.[^25] Independent estimates, including from the consulting firm SemiAnalysis, disputed the reported hardware fleet and suggested that DeepSeek had access to a substantially larger combination of H100, H800, H20, and A100 accelerators, potentially obtained through diversion or pre-control purchases.[^25] The episode prompted debate about whether export controls had succeeded in slowing capability development, whether they had succeeded in raising costs without preventing capability convergence, or whether the visible price reduction in frontier capability indicated that compute thresholds were a weakening signal of risk.[^25]
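The reported figures can be cross-checked with simple arithmetic. The sketch below uses only the three numbers quoted above; the derived quantities are implications of those numbers rather than independently reported values, and the wall-clock figure assumes every accelerator was in use for the whole run.

```python
# Figures as reported above (approximate).
GPU_HOURS = 2_788_000
CLUSTER_GPUS = 2_048
REPORTED_COST_USD = 5_600_000

# Implied price per GPU-hour if the reported cost covers only accelerator time.
implied_usd_per_gpu_hour = REPORTED_COST_USD / GPU_HOURS

# Implied duration if the full cluster ran continuously for the entire job.
implied_wall_clock_days = GPU_HOURS / CLUSTER_GPUS / 24
```

The implied rate is about $2.01 per GPU-hour and the implied duration is about 57 days. The headline cost is therefore best read as a rental-equivalent price for one training run, not as the total cost of research, failed experiments, or the hardware fleet itself, which is part of what the independent estimates disputed.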
As of May 2026, the compute governance regime is in a transitional state. The 2022 and 2023 BIS export controls on advanced computing chips remain in force, as do the underlying ECCN 3A090 classifications and the Validated End User framework. The January 2025 AI Diffusion rule was rescinded in May 2025, and BIS has indicated that replacement rules are under development.[^11] In the United States, Executive Order 14110 has been rescinded and replaced by Executive Order 14179, removing the 10^26 FLOP federal reporting threshold but leaving in place the BIS export-control regime and the Center for AI Standards and Innovation (formerly the US AI Safety Institute) within the National Institute of Standards and Technology.[^14][^15]
In the European Union, the 10^25 FLOP threshold for GPAI with systemic risk under the EU AI Act is now in force, with the General-Purpose AI Code of Practice operationalizing the obligations for providers above the threshold.[^16] California's SB 53 took effect on January 1, 2026, applying transparency and reporting obligations to frontier developers training above 10^26 FLOP.[^18] The UK AI Security Institute (formerly the UK AI Safety Institute) continues to conduct compute-informed evaluations of frontier models under voluntary agreements with developers.
Internationally, the AI Safety Summit process has produced ongoing dialogues among the principal compute-producing and compute-consuming states. Coordinated controls on chipmaking equipment between the United States, the Netherlands, Japan, and the Republic of Korea remain in force, although individual rules continue to be revised.[^8]
The on-chip mechanism agenda, including FlexHEG and Nvidia's pilot attestation-based location verification, remains in research and limited-deployment status. No jurisdiction has yet mandated FlexHEG-style guarantees as a condition of chip sale, although the FlexHEG architecture is referenced in policy literature as a future enforcement substrate that could support both export-control compliance and compute-threshold reporting without revealing model details.[^4][^21][^22]