AI energy consumption
Last reviewed
Jun 7, 2026
Sources
20 citations
Review status
Source-backed
Revision
v1 · 2,461 words
AI energy consumption refers to the electricity, and the associated water, land, and emissions, required to train and operate artificial intelligence systems, principally generative AI and large language models running in data centers. After roughly fifteen years of nearly flat data-center electricity use, the build-out of AI accelerators since about 2022 has reversed that trend, turning compute infrastructure into one of the fastest-growing sources of new electricity demand in the United States and several other advanced economies. The topic is genuinely contested: headline figures vary by an order of magnitude depending on what is counted, the scenarios are forward projections rather than measurements, and the companies that hold the most accurate data disclose the least. This article surveys the demand side (how much, and the split between training and operation), the supply side (the grid, water, carbon, and the scramble for new generation), and the principal disputes.
Two reference points anchor most discussion. The International Energy Agency, in its April 2025 Energy and AI report, estimated that data centers consumed about 415 terawatt-hours (TWh) in 2024, roughly 1.5 percent of world electricity, and projected in its Base Case that this would more than double to around 945 TWh by 2030, just under 3 percent of global electricity, and reach about 1,200 TWh by 2035. The IEA attributes roughly one-tenth of global electricity demand growth to 2030 to data centers, but more than 20 percent in advanced economies. For the United States specifically, the 2024 United States Data Center Energy Usage Report from Lawrence Berkeley National Laboratory (LBNL) and the US Department of Energy (Shehabi et al., December 2024) found that US data centers used about 176 TWh in 2023, or 4.4 percent of national electricity, up from 58 TWh in 2014, and projected a range of 325 to 580 TWh by 2028, equivalent to 6.7 to 12 percent of US electricity. The report noted that US data-center power demand more than doubled between 2017 and 2023, largely because of AI servers.
Other bodies project higher. The Electric Power Research Institute (EPRI), in updated 2025 scenarios, estimated US data centers could consume 9 to 17 percent of national electricity generation by 2030, an upward revision of roughly 60 percent from its 2024 work. Goldman Sachs Research forecast in 2025 that global data-center power demand would rise about 50 percent by 2027 and as much as 165 percent by 2030 relative to 2023. The wide spread between these figures reflects genuine uncertainty about chip shipments, utilization, and how fast AI adoption translates into sustained load.
| Source (publication date) | Geography | Baseline | Projection | Key assumptions |
|---|---|---|---|---|
| IEA, Energy and AI (Apr 2025) | Global | 415 TWh / ~1.5% (2024) | ~945 TWh / ~3% by 2030; ~1,200 TWh by 2035 | Base Case; AI the leading driver; announced policies and projects |
| LBNL / DOE (Dec 2024) | United States | 176 TWh / 4.4% (2023) | 325 to 580 TWh / 6.7 to 12% by 2028 | Range spans low to high AI accelerator growth and utilization |
| EPRI, Powering Intelligence (updated 2025) | United States | ~4% (2023) | 9 to 17% of generation by 2030 | Scenario-based; ~60% above EPRI's 2024 estimate |
| Goldman Sachs Research (2025) | Global / US | 2023 levels | +50% by 2027; +165% by 2030 | US data-center share of power demand roughly doubling from ~4% |
| MIT Technology Review (May 2025) | United States | n/a | AI to use >50% of data-center electricity by 2028 | Drawing on LBNL data; AI alone could use electricity equivalent to ~22% of US households |
A recurring caveat applies to all of these: they are projections built on assumptions, not observed outcomes, and several pre-date the largest 2025 and 2026 capacity announcements. They should be read as ranges, not point estimates.
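Because the projections use different baselines and horizons, one way to compare them is to convert each into an implied compound annual growth rate. The sketch below does this for the IEA and LBNL figures quoted above. The function and variable names are illustrative, and the calculation assumes smooth compounding between the two endpoints, which none of the source reports claim.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate that takes `start` to `end` over `years`."""
    return (end / start) ** (1 / years) - 1


# Baseline and projection figures (TWh) as quoted in the table above.
iea_global = cagr(415, 945, 2030 - 2024)   # IEA Base Case, global
lbnl_low = cagr(176, 325, 2028 - 2023)     # LBNL/DOE low end, United States
lbnl_high = cagr(176, 580, 2028 - 2023)    # LBNL/DOE high end, United States

# Total US electricity use implied by "176 TWh is 4.4 percent" (2023).
us_total_2023_twh = 176 / 0.044

print(f"IEA global, 2024 to 2030: {iea_global:.1%} per year")
print(f"LBNL US, 2023 to 2028:    {lbnl_low:.1%} to {lbnl_high:.1%} per year")
print(f"Implied US total, 2023:   {us_total_2023_twh:,.0f} TWh")
```

On these inputs the IEA Base Case implies roughly 15 percent annual growth globally and the LBNL range roughly 13 to 27 percent for the United States, while the 4.4 percent share implies total US consumption of about 4,000 TWh in 2023.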
AI energy splits into training (the one-time cost of building a model) and inference (the recurring cost of running it for users). Training a frontier model is a large, concentrated draw; press estimates for individual runs reach the tens of gigawatt-hours, though leading labs such as OpenAI do not publish exact figures. The more important point for the grid is that, in aggregate, inference now dominates. MIT Technology Review's May 2025 analysis concluded that inference accounts for roughly 80 to 90 percent of AI computing power, a share expected to grow as products embed AI into search, productivity software, and agents. Other analysts using a different accounting put inference nearer 60 percent of the AI energy footprint, with training near 30 percent and fine-tuning the remainder; either way, serving models, not building them, is the larger and faster-growing load.
Per-query figures became a flashpoint in 2025. In August 2025 Google published a methodology stating that the median Gemini text prompt uses about 0.24 watt-hours of energy, emits 0.03 grams of CO2-equivalent, and consumes about 0.26 milliliters of water. OpenAI's Sam Altman wrote in June 2025 that an average ChatGPT query uses about 0.34 watt-hours. Both are far below the widely circulated earlier estimate of roughly 2.9 watt-hours per ChatGPT query. The gap is largely about scope: company figures tend to report a marginal or median request and may exclude idle capacity, networking, and the embodied energy of hardware, while measured open-model results vary enormously by model size and modality.
| Task (source) | Energy estimate | Notes |
|---|---|---|
| Median Gemini text prompt (Google, Aug 2025) | 0.24 Wh | Plus 0.03 gCO2e and 0.26 mL water |
| Average ChatGPT query (Altman, Jun 2025) | 0.34 Wh | Company-stated average |
| Earlier ChatGPT estimate (2024, third-party) | ~2.9 Wh | Often cited; methodology disputed |
| Llama 3.1 8B response (MIT TR, measured) | ~0.03 Wh | Small open model, GPU plus overhead |
| Llama 3.1 405B response (MIT TR, measured) | ~1.9 Wh | Large open model |
| 5-second AI video (MIT TR, measured) | ~940 Wh | Hundreds of times an image; modality matters most |
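To see how much the choice of per-response figure matters at scale, the estimates in the table can be multiplied by a request volume. The sketch below is a minimal illustration: the daily volume is a hypothetical round number, not a reported figure, and the helper names are invented for this example. It also ignores the scope differences described above, so the outputs are comparisons of the estimates, not measurements of any service.

```python
# Per-response energy estimates quoted in the table above, in watt-hours.
WH_PER_RESPONSE = {
    "Gemini median text prompt": 0.24,
    "ChatGPT average query": 0.34,
    "Earlier ChatGPT estimate": 2.9,
    "Llama 3.1 8B response": 0.03,
    "Llama 3.1 405B response": 1.9,
    "5-second AI video": 940.0,
}

# ASSUMPTION: an illustrative request volume, not a reported figure.
HYPOTHETICAL_REQUESTS_PER_DAY = 1_000_000_000


def annual_gwh(wh_per_response: float, requests_per_day: int) -> float:
    """Scale a per-response estimate to gigawatt-hours per year."""
    return wh_per_response * requests_per_day * 365 / 1e9


def ratio(a: str, b: str) -> float:
    """How many times larger estimate `a` is than estimate `b`."""
    return WH_PER_RESPONSE[a] / WH_PER_RESPONSE[b]


for name, wh in WH_PER_RESPONSE.items():
    gwh = annual_gwh(wh, HYPOTHETICAL_REQUESTS_PER_DAY)
    print(f"{name:28s} {wh:8.2f} Wh -> {gwh:12,.1f} GWh/yr")

spread = ratio("Earlier ChatGPT estimate", "Gemini median text prompt")
print(f"Earlier estimate vs Gemini median: {spread:.1f}x")
```

At the assumed one billion requests a day, 0.34 Wh per query comes to about 124 GWh a year, while the 2.9 Wh estimate would come to about 1,060 GWh, which is why the choice of per-query figure dominates any aggregate built on it.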
The binding constraint is increasingly not chips but interconnection: the ability to connect new load and new generation to the grid. US interconnection queues held on the order of 2,600 gigawatts of proposed generation and storage in early 2026, far more than will be built, with multi-year waits. In Texas, the ERCOT large-load queue ran to roughly 410 GW, the majority of it data centers. Long-lead equipment compounds the delay: lead times for large power transformers stretched from roughly two years before 2020 to about five years, and combined-cycle gas turbine deliveries from the major OEMs, including GE Vernova, pushed out to five to seven years against multi-year order backlogs. These bottlenecks, more than electricity prices, are what slow projects and have driven the AI data center moratorium debates in several US localities.
Siting matters as much as scale. Because operators cluster facilities where land, fiber, and permits align, the load lands unevenly on regional grids, and MIT Technology Review reported, drawing on academic work, that data-center electricity carries about 48 percent higher carbon intensity than the US grid average, partly because much of it sits on gas-heavy or coal-heavy systems. The result is local: rising wholesale prices, deferred plant retirements, and ratepayer disputes over who pays for new transmission.
Cooling high-density AI racks consumes water, both directly through evaporative cooling at the facility and indirectly through thermoelectric power generation upstream. Direct figures are sparse because disclosure is voluntary. Google reported using more than 5 billion gallons of water across its data centers in 2023, with a meaningful share drawn from water-stressed watersheds. Academic work by Shaolei Ren and colleagues at UC Riverside estimated that a short ChatGPT conversation (on the order of 20 to 50 exchanges, GPT-3 era) could correspond to roughly 500 milliliters of water once upstream generation is included, and projected global AI-related water withdrawal of 4.2 to 6.6 billion cubic meters by 2027. Industry analysts have put 2025 AI data-center water use near 1 trillion liters, but such top-down estimates carry wide error bars.
The engineering response is to take water out of the loop. Microsoft has deployed closed-loop, zero-water-evaporation cooling at its Fairwater campus in Mount Pleasant, Wisconsin, which the company says avoids more than 125 million liters of evaporative water per facility each year, and is researching in-chip microfluidic cooling to remove heat closer to the silicon. Direct-to-chip liquid cooling is becoming standard for the densest Nvidia GPU racks, where air cooling no longer suffices. As with PUE for energy, the industry tracks water usage effectiveness (WUE), though closed-loop designs can shift consumption from on-site water to additional electricity.
AI growth has visibly strained hyperscaler climate commitments. In its 2024 reporting, Google said its greenhouse-gas emissions had risen about 48 percent since 2019, driven by data-center energy and supply-chain emissions; in the same report it stopped describing itself as maintaining operational carbon neutrality and reframed its targets as ambition-based. Microsoft reported fiscal-2024 emissions roughly 23 percent above its 2020 baseline, again attributing the rise to AI and cloud expansion, while characterizing its carbon-negative-by-2030 goal as a marathon. Both companies continue to sign large clean-energy contracts; Microsoft said it procured about 19 GW of new renewable capacity in 2024. The tension is structural: emissions are rising in absolute terms even as the companies buy record volumes of carbon-free power, because demand is growing faster than clean supply can be added.
Efficiency has improved on two fronts. Facility overhead, measured by power usage effectiveness (PUE, the ratio of total facility energy to IT energy), has fallen sharply at the leaders: Google reported a fleet-wide PUE of about 1.09 in 2024 against an industry average near 1.56 from the Uptime Institute, and other hyperscalers report similar figures, though the global fleet remains far less efficient. Chip efficiency has also risen rapidly, with each GPU generation delivering more computation per watt. These gains are real, but they reduce energy per unit of work, not total energy.
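Since PUE is the ratio of total facility energy to IT energy, the gap between the two figures above can be worked through directly. The sketch below assumes a hypothetical constant IT load of 100 MW, chosen only for illustration; the function names are likewise invented for this example.

```python
HOURS_PER_YEAR = 8760

# ASSUMPTION: a hypothetical constant IT load, chosen only for illustration.
HYPOTHETICAL_IT_LOAD_MW = 100.0


def facility_energy_mwh(it_energy_mwh: float, pue: float) -> float:
    """Total facility energy, from PUE = total facility energy / IT energy."""
    return it_energy_mwh * pue


def overhead_share(pue: float) -> float:
    """Fraction of total facility energy that is not IT load."""
    return (pue - 1.0) / pue


it_energy_mwh = HYPOTHETICAL_IT_LOAD_MW * HOURS_PER_YEAR
leader = facility_energy_mwh(it_energy_mwh, 1.09)    # Google fleet figure, 2024
average = facility_energy_mwh(it_energy_mwh, 1.56)   # Uptime Institute average

print(f"IT energy:         {it_energy_mwh:12,.0f} MWh/yr")
print(f"Facility at 1.09:  {leader:12,.0f} MWh/yr "
      f"({overhead_share(1.09):.1%} overhead)")
print(f"Facility at 1.56:  {average:12,.0f} MWh/yr "
      f"({overhead_share(1.56):.1%} overhead)")
print(f"Extra energy at the average PUE: {average - leader:,.0f} MWh/yr")
```

For the same IT load, a facility at the 1.56 average draws about 43 percent more total energy than one at 1.09. The IT term itself is untouched by PUE, which is why facility gains alone cannot offset growth in compute.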
The supply scramble has been the most visible part of the story since late 2024, and nuclear has become its emblem. In September 2024 Microsoft signed a 20-year power-purchase agreement with Constellation Energy to restart Unit 1 at Three Mile Island, since renamed the Crane Clean Energy Center, which would provide about 835 MW; in November 2025 the US Department of Energy advanced a roughly $1 billion loan to support the restart, targeted for around 2027. Google agreed in October 2024 to buy power from small modular reactors (SMRs) built by Kairos Power, up to about 500 MW across six to seven units with a first deployment near 2030. Amazon led investments of more than $500 million in X-energy and signed SMR agreements in Washington and Virginia, alongside a data-center campus next to the Susquehanna nuclear plant. Meta issued a request for proposals for 1 to 4 GW of new nuclear, signed a 20-year agreement with Constellation in June 2025 to extend operation of the 1.1 GW Clinton plant, and contracted with TerraPower, Oklo, and Vistra around its Ohio build-out.
Nuclear, however, mostly arrives late in the decade. The near-term workhorse is natural gas, despite turbine backlogs, supplemented by renewables, batteries, and on-site generation. Fuel-cell maker Bloom Energy has won data-center deals for behind-the-meter power that sidesteps the interconnection queue, and power-first developers such as Crusoe Energy build generation and compute together. The practical outcome through about 2028 is a mix that leans on gas while clean firm capacity is contracted for the 2030s.
Three disputes run through the subject. The first is measurement: per-query and per-facility figures depend heavily on system boundaries, so a 0.24 watt-hour median prompt and a multi-watt-hour large-model query can both be true for different questions, and critics argue that selective company disclosures risk understating the full footprint. The second is the rebound effect, or Jevons paradox: efficiency that makes inference cheaper tends to expand usage, so falling energy per query can coincide with rising total energy, an effect documented for AI in 2025 academic work. The third is the projections themselves, which assume continued exponential adoption; a slower-than-expected return on AI investment, or faster algorithmic efficiency, could leave forecasts high, while a faster build-out could leave them low. What is not seriously disputed is the direction: after a long plateau, AI has made data centers a material and growing claim on electricity, water, and the grid, and the size of that claim through 2030 remains an open, actively contested question.
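The rebound argument reduces to simple arithmetic: total energy is energy per query times the number of queries, so an efficiency gain lowers the total only if usage grows by less than a break-even amount. The sketch below illustrates this with hypothetical percentages; none of the numbers are drawn from the sources cited here, and the function names are invented for the example.

```python
def total_energy_change(efficiency_gain: float, usage_growth: float) -> float:
    """Fractional change in total energy.

    efficiency_gain: fractional cut in energy per query (0.5 means halved).
    usage_growth: fractional rise in query volume (2.0 means tripled).
    """
    return (1 - efficiency_gain) * (1 + usage_growth) - 1


def break_even_growth(efficiency_gain: float) -> float:
    """Usage growth at which total energy is unchanged."""
    return 1 / (1 - efficiency_gain) - 1


# ASSUMPTION: hypothetical scenarios, for illustration only.
scenarios = [
    ("Halved energy per query, volume up 50%", 0.5, 0.5),
    ("Halved energy per query, volume doubled", 0.5, 1.0),
    ("Halved energy per query, volume tripled", 0.5, 2.0),
]

for label, gain, growth in scenarios:
    change = total_energy_change(gain, growth)
    print(f"{label}: total energy {change:+.0%}")

print(f"Break-even usage growth for a 50% gain: {break_even_growth(0.5):.0%}")
```

In these hypothetical cases, halving energy per query cuts total energy only while volume grows by less than 100 percent; once volume triples, total energy is up by half despite the efficiency gain.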