Gaming
Last reviewed
May 13, 2026
Sources
44 citations
Review status
Source-backed
Revision
v2 · 5,506 words
See also: Gaming ChatGPT Plugins
AI in gaming covers the artificial intelligence systems used inside games, the AI tools used to make games, and the AI agents that have learned to beat humans at games as research milestones. The field spans more than seventy years, starting with simple rule-based programs for Nim and checkers in the 1950s and reaching, by 2025, real-time generative world models that synthesise playable 3D environments from a text prompt.
This article uses "game AI" in two distinct senses that the games industry has historically kept separate. Inside a shipped game, AI usually means scripted opponents and companions built with finite state machines, behavior trees, or planners such as GOAP. Inside an AI lab, game AI usually means a reinforcement learning agent trained to win Go, StarCraft II, or Dota 2. Since 2022, generative AI has begun to merge these two strands by writing dialogue, generating art, and even rendering whole games end to end with a neural network.
Games have served as benchmarks for AI research since Arthur Samuel's checkers program at IBM in the 1950s, because they provide closed worlds with clear win conditions and easy ways to compare programs against humans. Each headline result of the past three decades (Deep Blue in 1997, AlphaGo in 2016, OpenAI Five in 2019, AlphaStar in 2019) used a game to demonstrate a broader capability: respectively, brute-force search, deep reinforcement learning, long-horizon team coordination, and real-time planning under partial observability.
Meanwhile the games industry itself has used AI techniques in shipped products since at least 1980, when Toru Iwatani's team at Namco gave each Pac-Man ghost a finite state machine that switched between chase, scatter, and frightened modes. The same basic toolkit (FSMs, A* pathfinding, navmeshes, and behavior trees) still drives most shipped non-player characters in 2025. What has changed is that generative AI now sits alongside that toolkit, generating concept art, environment props, voice barks, and entire training environments for agents.
The industry has not absorbed those changes without conflict. The SAG-AFTRA video game performers strike of 2024 to 2025 lasted just under eleven months and was driven almost entirely by concerns about AI replicas. Voice actors, environment artists, and writers have all raised the same question: how much of game development can be automated before the craft itself is hollowed out?
The first computer programs to play games were direct expressions of game rules. Alan Turing wrote a paper chess program by hand in 1948 that he and David Champernowne called Turochamp. Arthur Samuel started work on a checkers program at IBM in 1952 that learned a value function by self-play; the program is often cited as one of the earliest examples of machine learning. Chinook, developed at the University of Alberta by Jonathan Schaeffer and colleagues starting in 1989, lost a 1992 match to world champion Marion Tinsley, took the Man-Machine World Championship in 1994 when Tinsley withdrew from their rematch for health reasons, and was used in 2007 to weakly solve checkers, proving that perfect play by both sides ends in a draw.
The headline event of search-era AI was IBM's Deep Blue match against Garry Kasparov in May 1997, played in New York City. Kasparov had won their first match in 1996; Deep Blue won the six-game rematch 3.5 to 2.5, becoming the first computer to defeat a reigning world chess champion in a match under classical time controls. Its core was massively parallel alpha-beta search backed by custom chess chips, not learning.
Inside actual games, the technology stayed simpler. Pac-Man's four ghosts, designed in 1980, used a small FSM to switch between chase, scatter, and frightened states; each ghost had its own target-selection rule, giving the impression of distinct personalities. Real-time strategy games of the late 1990s and early 2000s, including LucasArts's Star Wars: Galactic Battlegrounds (2001), used rule-based scripting languages where developers wrote (defrule (fact)(action)) triggers and tuned strategic numbers to bias the AI toward economy or military builds.
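The ghost state machine described above can be sketched as a transition table plus a per-ghost targeting rule. This is an illustrative reconstruction, not the arcade code: the event names, the corner coordinates, and the choice to return to chase after the frightened state are all assumptions made for the example. The two targeting rules are loosely modelled on commonly described behavior (one ghost heads for Pac-Man's tile, another aims a few tiles ahead of him).

```python
from enum import Enum, auto


class Mode(Enum):
    SCATTER = auto()
    CHASE = auto()
    FRIGHTENED = auto()


# (current mode, event) -> next mode. Event names are hypothetical.
TRANSITIONS = {
    (Mode.SCATTER, "timer"): Mode.CHASE,
    (Mode.CHASE, "timer"): Mode.SCATTER,
    (Mode.SCATTER, "power_pellet"): Mode.FRIGHTENED,
    (Mode.CHASE, "power_pellet"): Mode.FRIGHTENED,
    (Mode.FRIGHTENED, "pellet_expired"): Mode.CHASE,  # simplification
}


class Ghost:
    def __init__(self, name, scatter_corner, chase_rule):
        self.name = name
        self.mode = Mode.SCATTER
        self.scatter_corner = scatter_corner
        self.chase_rule = chase_rule  # (pac_pos, pac_dir) -> target tile

    def handle(self, event):
        # Unknown (mode, event) pairs leave the mode unchanged.
        self.mode = TRANSITIONS.get((self.mode, event), self.mode)

    def target(self, pac_pos, pac_dir):
        if self.mode is Mode.CHASE:
            return self.chase_rule(pac_pos, pac_dir)
        if self.mode is Mode.SCATTER:
            return self.scatter_corner
        return None  # frightened: no target, the ghost wanders


def direct(pac_pos, pac_dir):
    """Head straight for Pac-Man's current tile."""
    return pac_pos


def ambush(pac_pos, pac_dir):
    """Aim four tiles ahead of Pac-Man along his heading."""
    return (pac_pos[0] + 4 * pac_dir[0], pac_pos[1] + 4 * pac_dir[1])


blinky = Ghost("Blinky", scatter_corner=(25, 0), chase_rule=direct)
pinky = Ghost("Pinky", scatter_corner=(2, 0), chase_rule=ambush)
```

Both ghosts share one state machine; only `chase_rule` differs, which is enough to make them approach the player from different directions.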
By the mid-2000s, FSM scripts had become hard to maintain. Two design patterns replaced them.
Damian Isla, working on Halo 2 at Bungie, introduced the modern game-industry behavior tree in 2004. He presented the system at the Game Developers Conference (GDC) in 2005, describing how the team replaced a brittle FSM with a hierarchy of selector, sequence, and decorator nodes that could be edited by designers and evaluated reactively each tick. Behavior trees spread quickly through AAA development and remain standard in 2025.
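A behavior tree of the kind described above can be sketched with three node types and a shared blackboard. This is a generic minimal version, not Bungie's implementation; the blackboard keys (`enemy_visible`, `health`), the action names, and the health threshold are hypothetical.

```python
SUCCESS, FAILURE, RUNNING = "success", "failure", "running"


class Action:
    """Leaf node: runs a function against the blackboard and reports a status."""
    def __init__(self, fn):
        self.fn = fn

    def tick(self, bb):
        return self.fn(bb)


class Sequence:
    """Runs children in order; stops at the first child that does not succeed."""
    def __init__(self, *children):
        self.children = children

    def tick(self, bb):
        for child in self.children:
            status = child.tick(bb)
            if status != SUCCESS:
                return status
        return SUCCESS


class Selector:
    """Tries children in priority order; stops at the first that does not fail."""
    def __init__(self, *children):
        self.children = children

    def tick(self, bb):
        for child in self.children:
            status = child.tick(bb)
            if status != FAILURE:
                return status
        return FAILURE


class Inverter:
    """Decorator node: swaps success and failure, passes RUNNING through."""
    def __init__(self, child):
        self.child = child

    def tick(self, bb):
        status = self.child.tick(bb)
        if status == SUCCESS:
            return FAILURE
        if status == FAILURE:
            return SUCCESS
        return status


def condition(pred):
    return Action(lambda bb: SUCCESS if pred(bb) else FAILURE)


def do(name):
    def run(bb):
        bb["action"] = name
        return SUCCESS
    return Action(run)


enemy_visible = condition(lambda bb: bb["enemy_visible"])
low_health = condition(lambda bb: bb["health"] < 25)

ROOT = Selector(
    Sequence(enemy_visible, Inverter(low_health), do("attack")),
    Sequence(low_health, do("take_cover")),
    do("patrol"),
)


def decide(blackboard):
    """Re-evaluate the whole tree from the root, as a reactive tick would."""
    ROOT.tick(blackboard)
    return blackboard.get("action")
```

Because the tree is re-evaluated from the root each tick, a change in the blackboard (for example, health dropping) redirects behavior without any explicit transition logic, which is the maintainability gain over a hand-wired FSM.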
Monolith's F.E.A.R. (2005) took a different route. AI lead Jeff Orkin used Goal-Oriented Action Planning (GOAP), an adaptation of the 1971 STRIPS planner, to give the game's Replica soldiers a small three-state FSM combined with A* search over actions. The result was enemies that flanked, suppressed, and retreated as emergent consequences of plans rather than scripted patterns. F.E.A.R.'s combat AI is still cited as one of the high-water marks of shipped game AI.
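The planning half of GOAP can be illustrated as A* over world states, where each action has STRIPS-style preconditions and effects. The sketch below is a simplified stand-in, not Orkin's code: the facts, actions, and costs are invented, and the heuristic (number of unmet goal facts) is only guaranteed admissible here because the toy goal has a single fact and every action costs at least 1.

```python
import heapq
from itertools import count
from typing import FrozenSet, NamedTuple


class Action(NamedTuple):
    name: str
    cost: int
    pre: FrozenSet[str]                  # facts that must hold before acting
    add: FrozenSet[str]                  # facts made true by acting
    delete: FrozenSet[str] = frozenset() # facts made false by acting


def plan(start, goal, actions):
    """A* over sets of facts. Returns a list of action names, or None."""
    start, goal = frozenset(start), frozenset(goal)

    def h(state):
        return len(goal - state)         # unmet goal facts

    tie = count()                        # tie-breaker so the heap never compares states
    frontier = [(h(start), next(tie), 0, start, [])]
    best_cost = {start: 0}
    while frontier:
        _, _, g, state, path = heapq.heappop(frontier)
        if goal <= state:
            return path
        if g > best_cost.get(state, float("inf")):
            continue                     # stale queue entry
        for a in actions:
            if a.pre <= state:
                nxt = (state - a.delete) | a.add
                ng = g + a.cost
                if ng < best_cost.get(nxt, float("inf")):
                    best_cost[nxt] = ng
                    heapq.heappush(
                        frontier, (ng + h(nxt), next(tie), ng, nxt, path + [a.name])
                    )
    return None


# Hypothetical soldier actions.
ACTIONS = [
    Action("draw_weapon", 1, frozenset({"has_weapon"}), frozenset({"weapon_ready"})),
    Action("flank", 3, frozenset(), frozenset({"enemy_in_sight"})),
    Action("shoot", 2, frozenset({"weapon_ready", "enemy_in_sight"}), frozenset({"enemy_dead"})),
    Action("melee", 6, frozenset({"enemy_in_sight"}), frozenset({"enemy_dead"})),
]

armed_plan = plan({"has_weapon"}, {"enemy_dead"}, ACTIONS)
unarmed_plan = plan(set(), {"enemy_dead"}, ACTIONS)
```

The same goal yields different plans depending on the starting facts: an armed soldier ends with `shoot`, while an unarmed one falls back to `flank` then `melee`. Nothing in the action list mentions flanking "in order to" shoot; the sequence emerges from the search.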
Monte Carlo tree search (MCTS) was introduced by Rémi Coulom in 2006 in his paper "Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search" and used in his Go program Crazy Stone. MCTS combines random simulation with selective tree expansion and works without a strong heuristic, which is why it became the dominant approach in computer Go between 2006 and 2015.
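The four MCTS phases (selection, expansion, simulation, backpropagation) can be shown on a game far smaller than Go. The sketch below applies the UCT selection rule, a common MCTS variant rather than the specific backup operators of Coulom's paper, to single-pile Nim, where players alternately take one to three stones and whoever takes the last stone wins. The exploration constant, iteration count, and seed are arbitrary choices for the example.

```python
import math
import random


class Node:
    def __init__(self, stones, parent=None, move=None):
        self.stones = stones                  # stones left for the player to move
        self.parent, self.move = parent, move
        self.children = []
        self.untried = [m for m in (1, 2, 3) if m <= stones]
        self.visits = 0
        self.wins = 0.0                       # wins for the player who moved INTO this node


def uct_child(node, c=1.4):
    """Pick the child with the best win rate plus exploration bonus."""
    return max(
        node.children,
        key=lambda ch: ch.wins / ch.visits
        + c * math.sqrt(math.log(node.visits) / ch.visits),
    )


def rollout(stones, rng):
    """Play random moves to the end. True if the player to move at the start wins."""
    turn = 0
    while stones > 0:
        stones -= rng.randint(1, min(3, stones))
        turn ^= 1
    return turn == 1  # the starting player made the last move


def mcts(stones, iterations=2000, seed=0):
    rng = random.Random(seed)
    root = Node(stones)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend through fully expanded nodes.
        while not node.untried and node.children:
            node = uct_child(node)
        # 2. Expansion: add one unexplored move.
        if node.untried:
            move = node.untried.pop()
            child = Node(node.stones - move, parent=node, move=move)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout from the new position.
        win = not rollout(node.stones, rng)   # from the viewpoint of who moved into node
        # 4. Backpropagation: alternate the viewpoint at each level.
        while node is not None:
            node.visits += 1
            node.wins += 1.0 if win else 0.0
            win = not win
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).move
```

No evaluation function appears anywhere: the only game knowledge is the move generator and the win condition. In this Nim variant the winning strategy is to leave the opponent a multiple of four stones, so from five stones a well-converged search should prefer taking one.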
Deep reinforcement learning entered games in February 2015 when DeepMind published its Nature paper showing a single deep Q-network (DQN) that learned to play 49 Atari 2600 games from raw pixels. The same architecture, with no game-specific tuning, beat human scores on the majority of the games and lost badly on a few that required long-horizon planning, foreshadowing later work on Montezuma's Revenge.
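The Q-learning update that DQN builds on can be shown in tabular form. The sketch below is not DQN: DQN replaces the table with a convolutional network over pixels and adds experience replay and a target network, none of which appears here. The corridor environment, reward, and hyperparameters are invented for illustration.

```python
import random


def train_corridor(length=5, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Learn to walk right along a corridor; reward 1.0 on reaching the last cell."""
    rng = random.Random(seed)
    goal = length - 1
    q = [[0.0, 0.0] for _ in range(length)]   # q[state][action]; 0 = left, 1 = right
    for _ in range(episodes):
        s = 0
        while s != goal:
            # Epsilon-greedy action choice, breaking ties at random.
            best = max(q[s])
            greedy = [a for a in (0, 1) if q[s][a] == best]
            a = rng.randrange(2) if rng.random() < epsilon else rng.choice(greedy)
            s2 = max(0, s - 1) if a == 0 else s + 1
            reward = 1.0 if s2 == goal else 0.0
            # Bootstrapped target: reward plus discounted best value of the next state.
            future = 0.0 if s2 == goal else gamma * max(q[s2])
            q[s][a] += alpha * (reward + future - q[s][a])
            s = s2
    return q


q_table = train_corridor()
```

After training, moving right has the higher value in every non-terminal cell, with values decaying by a factor of `gamma` per step away from the goal. A reward that arrives only after a long action sequence must be propagated backwards one update at a time, which gives some intuition for why sparse-reward games such as Montezuma's Revenge were hard for this family of methods.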
From there the field accelerated. The next sections cover the marquee game-playing systems in detail.
The table below collects the systems most often cited as proving AI superhuman at a major game. Dates refer to the publication or demonstration of the result, not the start of the project.
| System | Game | Year | Milestone | Lab |
|---|---|---|---|---|
| Chinook | Checkers | 1994, solved 2007 | Won the Man-Machine World Championship after Marion Tinsley withdrew; checkers weakly solved | University of Alberta |
| Deep Blue | Chess | May 1997 | Defeated Garry Kasparov 3.5 to 2.5 in New York | IBM |
| AlphaGo | Go | March 2016 | Beat Lee Sedol 4 to 1 in Seoul | DeepMind |
| AlphaGo Zero | Go | October 2017 | Reached superhuman play from self-play alone, no human games | DeepMind |
| AlphaZero | Chess, Shogi, Go | December 2017 preprint | One algorithm beat Stockfish 8 in chess, Elmo in shogi | DeepMind |
| OpenAI Five | Dota 2 | April 2019 | Beat OG (TI8 champions) 2 to 0 in San Francisco | OpenAI |
| Pluribus | Six-player no-limit Texas hold'em | July 2019 | Beat 13 poker pros over 10,000 hands | Meta AI, CMU |
| AlphaStar | StarCraft II | October 2019 | Reached Grandmaster (above 99.8% of ranked players) on Battle.net European server | DeepMind |
| MuZero | Atari, Go, chess, shogi | November 2019 preprint | Learned without being told the rules | DeepMind |
| Cicero | Diplomacy | November 2022 | Top 10% of human players on webDiplomacy.net, won an 8-game tournament | Meta AI |
DeepMind's AlphaGo played a five-game match against Lee Sedol in Seoul from March 9 to 15, 2016 and won 4 to 1. The match was watched live by more than 200 million viewers in Asia and is widely treated as the moment Go fell to machines. The system combined a deep policy network trained on human games, a value network trained on self-play, and Monte Carlo tree search. The Korea Baduk Association awarded AlphaGo an honorary 9 dan grandmaster certificate.
Move 37 of game two, a shoulder hit on the fifth line, was flagged by human commentators as a probable mistake and later identified as one of the moves that gave Lee the most trouble. Lee won game four with his own move 78, a creative wedge to which AlphaGo's networks had assigned very low probability. He retired from professional Go in 2019, citing the rise of AI as a factor.
AlphaGo Zero, published in Nature in October 2017, dropped human game data entirely and trained from random play through self-play reinforcement learning. It surpassed the strength of the version that beat Lee Sedol after three days of training. AlphaZero, released as a preprint on December 5, 2017 and published in Science on December 7, 2018, generalized the approach to chess and shogi and defeated Stockfish 8 in a 100-game match (28 wins, 72 draws, 0 losses).
MuZero, preprinted in November 2019, removed even the requirement that the AI know the rules of the game. It learned a latent dynamics model jointly with value and policy networks, matching AlphaZero on board games and reaching state-of-the-art Atari scores.
OpenAI Five was a team of five LSTM-based agents trained through self-play to play 5v5 Dota 2; it followed a 1v1 bot that OpenAI demonstrated in 2017. The agents were trained with Proximal Policy Optimization (PPO) at a scale of about 128,000 CPU cores and 256 GPUs for several months, accumulating roughly 180 years of game experience per day.
On April 13, 2019, OpenAI Five played The International 2018 champions OG in a best of three in San Francisco and won 2 to 0. It was the first AI system to defeat a world-champion esports team. After the demonstration OpenAI ran the bots online against the public from April 18 to 21, 2019; they played 7,257 games against 3,193 teams and won 99.4% of them. Restrictions still applied: a limited hero pool of 17 heroes, no illusions or summons, and instant rather than human-paced item swaps.
AlphaStar was DeepMind's StarCraft II agent, published in Nature on October 30, 2019. It was the first AI to reach Grandmaster level in a major esport while playing under conditions similar to human pros, including camera-based observation and capped action rates. AlphaStar reached Grandmaster on the European server for all three races, Protoss, Terran, and Zerg, ranking above 99.8% of active players.
The team trained AlphaStar using a population-based league of agents, in which main agents played against exploiters that found and abused weaknesses in their strategies, in effect an internal ladder of opponents. The resulting agents played a recognizably StarCraft style rather than purely abusing micromanagement, although their effective actions-per-minute were still extremely high in bursts.
Pluribus, developed by Noam Brown and Tuomas Sandholm at Meta AI and Carnegie Mellon, was published in Science on July 11, 2019. It beat thirteen elite poker professionals at six-player no-limit Texas hold'em, including World Series of Poker champion Chris "Jesus" Ferguson and World Poker Tour record holder Darren Elias. Pluribus computed its blueprint strategy in eight days on 12,400 core hours and ran live on just 28 cores, making it the most resource-efficient superhuman game AI to that point.
Cicero, also from Meta AI, was published in Science on November 22, 2022. Cicero played Diplomacy, a seven-player game in which open natural-language negotiation is required to coordinate moves. The system combined a 2.7-billion-parameter language model with a strategic reasoning module that selected dialogue actions consistent with its planned moves. In anonymous play on webDiplomacy.net, Cicero placed in the top 10% of players over forty games and won an eight-game tournament.
Cicero is interesting because Diplomacy requires what philosophers call theory of mind: modelling what other players believe and want. The Cicero paper makes clear that the system was designed to be honest in its commitments because lying in Diplomacy is usually a bad long-term strategy.
Procedural content generation (PCG) predates modern AI by decades. Rogue (1980) generated dungeon layouts at runtime, and Elite (1984) used a deterministic algorithm to pack eight galaxies of 256 star systems into 22 kilobytes of memory. The point of PCG was usually compression and replayability, not artificial intelligence.
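The compression idea behind deterministic generation is that content is never stored, only recomputed from a seed. The sketch below illustrates that principle with Python's seeded `random.Random`; it is not Elite's actual generator, and the syllable list, attribute names, and seeding formula are invented for the example.

```python
import random

# Hypothetical name fragments; a real game would tune these by hand.
SYLLABLES = ["la", "ve", "ti", "so", "ra", "qu", "en", "di", "or", "us"]


def star_system(galaxy, index):
    """Recompute a star system from its coordinates alone. Nothing is stored."""
    rng = random.Random(galaxy * 256 + index)   # seed depends only on (galaxy, index)
    name = "".join(rng.choice(SYLLABLES) for _ in range(rng.randint(2, 4)))
    return {
        "name": name.capitalize(),
        "economy": rng.randrange(8),
        "tech_level": rng.randrange(1, 16),
    }


# Visiting the same coordinates twice always yields the same system.
first_visit = star_system(galaxy=0, index=7)
second_visit = star_system(galaxy=0, index=7)
```

With eight galaxies of 256 systems there are 8 × 256 = 2,048 systems in total, all reachable from a generator function and a handful of constants rather than a stored table.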
Three more recent landmarks have brought PCG into mainstream attention.
Spelunky, released in 2008 by Derek Yu, generated each level by selecting and connecting prefabricated room templates, then guaranteeing a path to the exit by carving through unbreakable walls if needed. The system has been studied at GDC and copied widely. Spelunky won the IGF Excellence in Design award in 2012.
No Man's Sky, released by Hello Games on August 9, 2016 on PlayStation 4 in North America (August 10 in Europe, August 12 on PC), used deterministic procedural generation to populate roughly 18 quintillion planets, each with its own terrain, flora, and fauna. The launch was famous for the gap between the demos and the shipped product; the game has been substantially expanded with later updates.
Wave function collapse (WFC), introduced by Helsinki developer Maxim Gumin in a public GitHub repository in 2016, generates tilemaps and bitmaps from a single example image. The algorithm works by treating each tile as being in a superposition of possible values and "collapsing" each one to a definite value subject to local constraints, an analogy to the quantum-mechanics concept that gave the algorithm its name. WFC has shipped in Bad North, Townscaper, and Caves of Qud, and has been ported to most major engines.
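The collapse-and-propagate loop can be sketched on a small grid. This is a simplified "tiled" variant with hand-written adjacency rules, whereas Gumin's repository also includes a model that learns patterns from an example image. The three tiles and the `ALLOWED` table are hypothetical, and "lowest entropy" is approximated by "fewest remaining options".

```python
import random

TILES = ("sea", "coast", "land")
# Hypothetical symmetric adjacency rule: sea may never touch land directly.
ALLOWED = {
    "sea": {"sea", "coast"},
    "coast": {"sea", "coast", "land"},
    "land": {"coast", "land"},
}


def neighbors(x, y, w, h):
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < w and 0 <= ny < h:
            yield nx, ny


def propagate(grid, start, w, h):
    """Prune neighbor options that lost all support. False means contradiction."""
    stack = [start]
    while stack:
        x, y = stack.pop()
        for nx, ny in neighbors(x, y, w, h):
            supported = {
                t for t in grid[ny][nx] if any(t in ALLOWED[s] for s in grid[y][x])
            }
            if not supported:
                return False
            if supported != grid[ny][nx]:
                grid[ny][nx] = supported
                stack.append((nx, ny))
    return True


def collapse(w, h, seed=0):
    rng = random.Random(seed)
    while True:  # restart from scratch if a contradiction is reached
        grid = [[set(TILES) for _ in range(w)] for _ in range(h)]
        ok = True
        while ok:
            open_cells = [
                (len(grid[y][x]), x, y)
                for y in range(h) for x in range(w) if len(grid[y][x]) > 1
            ]
            if not open_cells:
                return [[next(iter(cell)) for cell in row] for row in grid]
            fewest = min(n for n, _, _ in open_cells)
            _, x, y = rng.choice([c for c in open_cells if c[0] == fewest])
            grid[y][x] = {rng.choice(sorted(grid[y][x]))}   # "collapse" one cell
            ok = propagate(grid, (x, y), w, h)


tilemap = collapse(6, 4, seed=1)
```

Each cell starts as the full set of tiles (the "superposition"), and every collapse narrows its neighbors. With this particular tile set a contradiction cannot occur because `coast` is compatible with everything, but richer tile sets can fail, which is why the restart loop is kept.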
Modern PCG research increasingly uses machine learning. Papers presented at the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE) regularly cover learned generators, mixed-initiative tools that suggest level edits to designers, and generators based on diffusion or GANs.
Between 2022 and 2025, the games industry built a layer of generative AI tools on top of older techniques. The most cited ones are listed below.
| Tool or company | Function | Notable users or partners |
|---|---|---|
| Promethean AI | AI assistant for environment artists, populates 3D scenes from natural-language prompts | Founded by a former art director at Naughty Dog |
| Scenario | Custom generative models for style-consistent 2D and 3D game art | Funded by Brendan Iribe, Justin Kan; used by Ubisoft for character variants |
| Inworld AI | Character engine for generative NPC dialogue, memory, and behavior | Integrated with Unreal Engine, Unity; partners include NetEase and Niantic |
| Convai | NPC conversation platform with speech in and out | Integrated with NVIDIA ACE |
| NVIDIA ACE | Avatar Cloud Engine: ASR, LLM, TTS, and Audio2Face microservices | Convai, Inworld, Charisma.AI, miHoYo, Ubisoft |
| Ubisoft Ghostwriter | In-house tool for generating first-draft NPC barks | Ubisoft scriptwriters |
| Ludo.ai | Game ideation, design document drafting, prototype generation | Indie developers |
| Stable Diffusion and forks | Concept art and texture generation | Widely used by indie studios, often without public credit |
Promethean AI is positioned as an assistant for environment artists rather than a replacement. The user describes a scene, for example "a sci-fi lab with a workbench and scattered tools", and the system proposes a layout drawn from the artist's own asset library. It markets a roughly tenfold speedup on dressing repetitive environments.
Scenario, founded in 2022, raised a $6 million seed round in January 2023 and lets a studio fine-tune a generative model on its own art to keep style consistent. Investors included Oculus co-founder Brendan Iribe and Twitch co-founder Justin Kan. Ubisoft used Scenario to produce more than ten thousand character variants for one project.
NVIDIA introduced the Avatar Cloud Engine (ACE) for Games at Computex 2023 and demoed live conversation with an NPC in a noodle shop using ASR (Riva), an LLM, TTS, and Audio2Face for lip sync. ACE is delivered as cloud microservices and pairs with third-party NPC platforms.
Inworld is the best-funded of those NPC platforms, having raised more than $50 million at a $500 million-plus valuation by 2023. Its Character Engine has three layers: a Character Brain that orchestrates a personality through ML models, a Contextual Mesh that constrains the LLM to the world's lore to limit hallucinations, and an integration layer for Unreal Engine and Unity. NetEase's Cygnus Enterprises and Niantic's Wol have shipped with Inworld characters. Convai is the closest competitor, also tightly integrated with NVIDIA ACE.
Ubisoft announced Ghostwriter at GDC 2023. The tool, built by Ubisoft's R&D group La Forge and integrated into the company's narrative tool Omen, drafts the short non-quest lines that NPCs say in passing, called barks. A writer types a context and the tool returns two phrasings; the writer picks one and Ghostwriter trains on the choice. Ubisoft was careful to limit it to barks and UI strings, not lore or cinematics, but the announcement still drew sharp criticism from working game writers who saw it as the leading edge of pressure to reduce headcount.
A world model is a neural network that learns to predict future frames of a game conditioned on a player's actions, in effect approximating the game engine itself. Three world models released between 2024 and 2025 changed the conversation about what AI could do for games.
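The interface of a world model, predicting the next frame from the current frame and an action and then feeding predictions back in, can be shown with a deliberately tiny stand-in. The sketch below swaps the neural network for a lookup table learned from logged play; frames are plain integers, and `engine_step` is a hypothetical ground-truth game used only to produce training data.

```python
from collections import Counter, defaultdict


class TabularWorldModel:
    """Toy stand-in for a learned world model: counts observed transitions."""

    def __init__(self):
        self.counts = defaultdict(Counter)   # (frame, action) -> Counter of next frames

    def observe(self, frame, action, next_frame):
        self.counts[(frame, action)][next_frame] += 1

    def predict(self, frame, action):
        seen = self.counts.get((frame, action))
        if not seen:
            return frame                     # never observed: assume nothing changes
        return seen.most_common(1)[0][0]

    def rollout(self, frame, actions):
        """Autoregressive generation: each prediction becomes the next input."""
        frames = [frame]
        for action in actions:
            frame = self.predict(frame, action)
            frames.append(frame)
        return frames


def engine_step(x, action):
    """Hypothetical 'real' game: a player on a five-cell track."""
    return max(0, x - 1) if action == "left" else min(4, x + 1)


# Train purely from logged (frame, action, next frame) triples.
model = TabularWorldModel()
for x in range(5):
    for action in ("left", "right"):
        model.observe(x, action, engine_step(x, action))

dreamed = model.rollout(0, ["right", "right", "left", "right"])
```

Once trained, `rollout` never calls `engine_step`: the model plays the game from its own predictions. Because each prediction becomes the next input, any early error carries through the rest of the rollout, and the fallback for unseen inputs shows how such a model behaves outside its training data.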
DeepMind's first Genie paper appeared on arXiv on February 23, 2024. The 11-billion-parameter model was trained on roughly 30,000 hours of filtered 2D platformer footage and learned a latent action space without any action labels, so users could navigate a generated world with eight discrete actions even though the model had never been told what those actions corresponded to.
Genie 2, announced December 4, 2024, extended the approach to 3D. From a single prompt image, often generated by Google's Imagen 3, Genie 2 produces a playable environment that holds up for ten to twenty seconds in most clips and up to a minute in good cases. DeepMind reported that Genie 2 remembers parts of the scene that leave the viewport and renders them consistently when they come back, an emergent property that the team did not explicitly train for.
Genie 3, announced August 5, 2025, is the first real-time interactive general-purpose world model. It runs at 24 frames per second at 720p and maintains physical and visual consistency for a few minutes of play. DeepMind framed Genie 3 as a stepping stone toward AGI because consistent, navigable simulation is exactly the substrate that embodied agents need for training. Genie 3 was kept in research preview rather than released to the public.
Oasis, a joint project between Israeli startup Decart and chip startup Etched, was released to the public on October 31, 2024. It is a generative Minecraft clone: there is no game engine and no Minecraft code in it at all. Instead, a transformer takes the player's keyboard and mouse inputs and predicts the next frame at about twenty frames per second, trained on millions of hours of Minecraft footage. The full 500-million-parameter weights were released alongside a hosted demo of a larger checkpoint.
Oasis has clear limitations. Without memory or explicit state, the world drifts when the player turns around or looks away; trees vanish, inventory icons change, and rooms rearrange. Even so, it is the first publicly playable demonstration that a foundation model can stand in for an entire game engine.
Microsoft Research announced Muse on February 19, 2025. Built in partnership with Ninja Theory, an Xbox Game Studios team, Muse is a World and Human Action Model (WHAM) trained on more than a billion images and controller actions from Bleeding Edge, Ninja Theory's 2020 multiplayer game, corresponding to about seven years of continuous human play. Microsoft positioned Muse as a gameplay ideation tool for developers, not a consumer product, and open-sourced the WHAM-1.6B weights, sample data, and a demonstrator. Microsoft also raised the possibility that Muse-style models could let older Xbox titles run on new hardware without traditional emulation.
Fei-Fei Li's World Labs came out of stealth in September 2024 with $230 million in funding and a public mission to build spatially intelligent world models. Its first product, Marble, opened in limited beta in September 2025 and launched broadly on November 12, 2025. Unlike Genie or Oasis, Marble outputs persistent, downloadable 3D scenes rather than generating frames on the fly. Users supply text, images, video, or coarse 3D layouts, and the model fills in the geometry and textures. Marble is positioned for game artists, VFX houses, and architects, not for real-time play.
AI Dungeon was built in 2019 by Nick Walton, then a deep learning researcher at Brigham Young University, during a hackathon. The first version ran on GPT-2 and was released in May 2019. AI Dungeon 2, released in December 2019 on Google Colaboratory, was the breakout: text-based open-ended fantasy adventures generated by a language model that would happily take any input the player typed.
Walton spun the project into a company, Latitude, in 2020. A premium tier called Dragon, which used OpenAI's GPT-3 API, launched in July 2020. Latitude raised more than $4 million in early funding and at one point reported more than a million monthly active users.
AI Dungeon's defining moment came in April 2021. OpenAI flagged that some content generated through the game depicted sexual abuse of minors, including content produced by the model itself in response to non-sexual prompts. Latitude introduced new automated filtering. The filtering system flagged not only explicit prompts but private stories that contained completely benign uses of words like "eight-year-old", which players discovered when they were locked out of their accounts. Soon after, players also learned that Latitude employees could view flagged stories, an issue that triggered a privacy backlash. Walton apologized and the team reworked the moderation system over the following months. The incident remains a much-cited case study in content moderation on generative platforms.
AI Dungeon also opened the broader category of interactive fiction powered by LLMs. Competitors and successors include NovelAI, KoboldAI, KoboldAI Lite, and a steady drip of self-hosted projects running open-weight models from Meta, Mistral, and others.
Machine learning has been quietly central to live ops for at least a decade. Most major multiplayer titles run server-side anomaly detection to flag aimbots, wallhacks, and macro bots, generally as a layer on top of binary signature checks. The visible names in this space are kernel-level anti-cheats and ML-based voice moderators.
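Server-side anomaly detection can be as simple as flagging statistical outliers for review. The sketch below uses the modified z-score (based on the median and the median absolute deviation) on a single made-up feature. Anti-cheat vendors do not publish their features or models, so the headshot-rate feature, the player data, and the threshold are all illustrative assumptions.

```python
from statistics import median


def flag_outliers(rates, threshold=3.5):
    """Return player ids whose rate is an extreme high outlier.

    Uses the modified z-score 0.6745 * (x - median) / MAD, which is less
    distorted by the outliers themselves than a mean/stddev z-score.
    """
    values = list(rates.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []            # no spread at all: nothing can be called an outlier
    return sorted(
        player for player, v in rates.items()
        if 0.6745 * (v - med) / mad > threshold
    )


# Hypothetical per-player headshot rates over a day of matches.
headshot_rates = {
    "p1": 0.18, "p2": 0.22, "p3": 0.20, "p4": 0.25,
    "p5": 0.19, "p6": 0.21, "p7": 0.92,
}
suspects = flag_outliers(headshot_rates)
```

A flag like this would be a reason to look closer, not proof of cheating: a single feature cannot separate an aimbot from an unusually skilled player, which is one reason such detection is layered on top of other checks.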
Riot Vanguard launched with the Valorant closed beta on April 7, 2020. It runs as a Windows kernel-mode driver that starts at boot and a user-mode client that runs while a Riot game is open. Vanguard has been criticized by security researchers for the surface area it presents, since a flawed kernel driver is a vector for privilege escalation, but Riot has defended the design as necessary to keep pace with cheats that themselves operate at kernel level. Vanguard was extended to League of Legends in 2024.
BattlEye, founded in 2004, takes a similar approach and ships in PUBG, Fortnite on PC, DayZ, and many other titles; its main competitor is Easy Anti-Cheat. Both vendors maintain ML-based behavioral detection in addition to signature-based scans.
For voice chat, Activision announced a partnership with Modulate on August 30, 2023 to deploy ToxMod across Call of Duty: Modern Warfare II and Call of Duty: Warzone, with a wider rollout aligned to the launch of Modern Warfare III on November 10, 2023. ToxMod uses ML to classify in-game voice in near real time, distinguishing harassment from competitive trash talk by reading conversational context and listener reactions rather than scanning for keywords. Activision reported a roughly 50% reduction in players exposed to severe disruptive voice chat after Modern Warfare III's launch.
On July 26, 2024, SAG-AFTRA declared a strike against the video game industry under its Interactive Media Agreement. The dispute had been building for more than eighteen months and centered almost entirely on AI. The union wanted consent and disclosure requirements before any performer's voice or motion-capture performance could be used to train an AI replica, plus the right to suspend that consent during a future strike.
The strike lasted just under eleven months. SAG-AFTRA reached a tentative agreement with the major studios on June 9, 2025 and suspended the strike on June 11. Members ratified the new 2025 Interactive Media Agreement on July 9, 2025 with 95.04% of the votes in favor.
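The ratified contract includes consent and disclosure requirements for the use of AI digital replicas, and lets performers suspend consent for the generation of new material during a strike.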
The strike was preceded by a separate and controversial side agreement. On January 9, 2024 SAG-AFTRA had signed a deal with Replica Studios at CES to license union voices for AI replica training. Many prominent voice actors, including names well known from Genshin Impact and other major titles, said publicly they were not consulted and felt blindsided. Critics argued the Replica deal normalized digital replicas in the industry before the main contract had been negotiated. SAG-AFTRA also signed a separate August 2024 deal with Narrativ for ad voice replicas; the strike against game publishers continued throughout.
A running list of generative AI use in shipped or announced game projects, drawn from public reporting.
| Company or game | Use of AI | Year |
|---|---|---|
| Microsoft Flight Simulator | Procedural world generation using Bing satellite data and ML upscaling | 2020 |
| Hello Games, No Man's Sky | Procedural planets, deterministic generator written before the deep learning era | 2016 |
| Ubisoft, multiple projects | Ghostwriter for bark drafting, Scenario for character variants | 2023 onward |
| NVIDIA ACE demos with Convai | Real-time conversational NPCs at Computex and CES | 2023 onward |
| Riot Games | Vanguard kernel anti-cheat, ML-based behavioral detection | 2020 onward |
| Activision Blizzard | ToxMod voice moderation in Call of Duty | 2023 onward |
| Square Enix | Public commitment to generative AI in Foamstars and beyond | 2024 |
| Krafton | AI-driven NPC "smart agent" demo in inZOI | 2024 |
| Embark Studios, The Finals | AI-generated voice lines for commentators | 2023 |
The Finals, released by Embark Studios in December 2023, used AI-generated voiceovers for its match commentators and was widely criticized. Embark argued that the alternative would have been no commentary at all due to budget; voice actors responded that this rationale would be used to justify every future cost cut. The argument has rerun several times in 2024 and 2025 as other studios shipped AI-driven voice content during the strike.
Generative AI in games has produced a steady stream of disputes that go beyond the SAG-AFTRA strike.
Voice cloning of named performers, especially of Genshin Impact characters and other anime-style voices, became a cottage industry on community sites in 2023 and 2024. Performers sometimes discovered their voices being used to generate explicit content or political messaging without consent. The industry response has been a mix of takedown campaigns, watermarking efforts, and contractual replica clauses, none of which fully solve the underlying technical problem.
Generative art on storefronts has triggered backlash on Steam, where Valve initially refused submissions that included AI-generated art over copyright concerns, then in January 2024 changed its policy to allow them with disclosure. Modders and artists have complained about being scraped without consent by general image-generation models like Stable Diffusion and Midjourney.
Layoffs at large game publishers during 2023 and 2024 were widely covered as a backdrop to AI announcements, especially after Microsoft's acquisition of Activision Blizzard. When Microsoft demoed Muse alongside the announcement that it had laid off Xbox staff, the optics were poor and the company spent significant communications energy clarifying that Muse was a research tool rather than a layoff vehicle. Whether AI is a cause, a symptom, or a coincidence of game industry layoffs is genuinely unclear; the layoffs themselves are not.
Kernel-level anti-cheat is the longest-running cultural controversy. Vanguard's always-on driver has been criticized by privacy advocates and security researchers as overreach, and at least three separate incidents have seen Vanguard interact badly with hardware like motherboard fan controllers or specific peripheral drivers. Riot has stood by the design and shipped patches when individual incidents have come up.
Finally, the entire "superhuman game AI" narrative has its critics. AlphaStar's effective actions-per-minute in burst were sometimes much higher than human pros could sustain, even after rate-limiting. OpenAI Five played a heavily constrained Dota 2 with a 17-hero pool and no illusions. Cicero's strong play on webDiplomacy.net came against humans who were typically not the top tier of competitive Diplomacy. None of these caveats overturn the milestones, but they all complicate the simple story of machine surpassing man.