Universe
Last reviewed
May 10, 2026
Sources
17 citations
Review status
Source-backed
Revision
v2 · 2,482 words
Universe is an open-source software platform developed by OpenAI for measuring and training the general intelligence of artificial intelligence (AI) agents, a capability also called strong AI. [1] [2] [3] It is a middleware layer built on OpenAI's Gym, a toolkit for the development and evaluation of reinforcement learning (RL) algorithms. [2] [3] Games and websites are used to train the AI. Any task that a person can complete on a computer is theoretically a viable training task: researchers can plug any application into Universe, giving AI agents a common way of interacting with it. [2] [4]
The software environments are instantiated in Docker containers, with AI agents interacting through a virtual keyboard and mouse using a Virtual Network Computing (VNC) remote desktop. The more interaction the agents have with the environment, the better they become at a specific task. [3]
OpenAI gave an example of an agent trained through RL on Slither.io, where the player avoids collision with other snakes. After six days of training, the agent scored "an average of 1,000 points, with a high score of 9,300 points. As a point of comparison, OpenAI machine-learning researcher Rafal Jozefowicz, with five hours of playing experience, averaged about 1,400 points, with a high score of 7,050." [3]
Universe has been compared to ImageNet, a hand-labeled image database used to test image recognition systems. In Universe, images are replaced by flash games, web browsers, photo editors, and CAD software. [5] On release the platform shipped with the Atari 2600 games from the Arcade Learning Environment, about 1,000 flash games, and 80 browser environments, described by OpenAI as the largest single library of reinforcement learning environments at the time. [1] [5]
The project has since been abandoned by OpenAI in favor of Gym Retro. The public GitHub repository was archived on April 6, 2018, with a deprecation notice pointing readers to the Retro library. Several upcoming developments described in the Universe launch blog post were never released, including environment integration tools so any user could contribute new integrations, and the public release of human demonstration data. [6] [9]
Universe was unveiled on December 5, 2016, the same week as the Conference on Neural Information Processing Systems (NeurIPS, then known as NIPS) in Barcelona. [1] [2] [10] OpenAI had launched Gym in April of the same year as a more limited RL toolkit, and Universe was positioned as the next step: a way to wrap arbitrary off-the-shelf software into Gym-compatible environments without requiring source code or internal APIs. [1] [10] The rollout was tied to the NIPS schedule, with the project discussed by co-founders Greg Brockman and Ilya Sutskever alongside the company's other reinforcement learning work. [10]
Progress slowed during 2017. In April 2018, OpenAI released a follow-up project called Gym Retro, which integrated emulated Sega Genesis and other classic console games into the Gym interface with direct access to emulator memory instead of a VNC connection. The same month, the Universe repository was archived on GitHub, with the README updated to recommend Retro for new work. [9] [11] In retrospective notes about Retro, OpenAI acknowledged that Universe "didn't work out for us" because its environments ran asynchronously, could only run in real time, and were often unreliable due to screen-based detection of game state. [11]
The table below sketches the project's main milestones.
| Date | Event |
|---|---|
| April 2016 | OpenAI releases Gym, the toolkit Universe later builds on. [10] |
| December 5, 2016 | Universe is announced on the OpenAI blog with around 1,000 environments at launch. [1] [2] |
| December 2016 | Universe is presented during NIPS 2016 in Barcelona. [10] |
| 2017 | OpenAI and Stanford researchers publish "World of Bits" at ICML 2017, expanding the browser benchmark seeded with Universe. [12] |
| April 6, 2018 | The openai/universe GitHub repository is archived; users are redirected to Gym Retro. [9] [11] |
According to OpenAI, the goal of the project was to "develop a single AI agent that can flexibly apply its past experience on Universe environments to quickly master unfamiliar, difficult environments, which would be a major step towards general intelligence." Ilya Sutskever, an OpenAI researcher, said "an AI should be able to solve any problem you throw at it." [1] [4]
By expanding the number of training environments, OpenAI expected the training of AI agents to accelerate. Before Universe, the largest reinforcement learning resource of comparable design was the Arcade Learning Environment, which included 55 Atari games. [2] [3] Universe set out to push that number into the thousands by absorbing whole categories of software previously considered too messy for RL: web browsers, photo editors, CAD tools, and games with no programmatic interface.
On release, Universe shipped with the largest library of games and resources ever assembled for RL, including 1,000 flash games distributed in a Docker image, games like Slither.io and StarCraft, browser-based tasks such as form filling, and applications like Foldit. [1] [7] According to OpenAI, flash games were a starting point for scaling because they are pervasive on the internet, usually with better graphics than Atari titles but still simple enough for early agents. [1] OpenAI also noted that when environments ran asynchronously in Docker containers on a local cloud network, games usually ran at 60 frames per second, while over the public internet this dropped to about 20 frames per second. [1]
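To put the two quoted frame rates in perspective, the per-frame time budget and the daily volume of experience follow from simple arithmetic. The sketch below uses only the 60 and 20 frames-per-second figures from the paragraph above; the function names are illustrative, and it assumes an agent that consumes every frame in real time.

```python
def frame_budget_ms(fps):
    """Wall-clock time available to the agent per frame, in milliseconds."""
    return 1000.0 / fps


def frames_per_day(fps):
    """Frames a real-time environment produces in 24 hours."""
    return fps * 60 * 60 * 24


for label, fps in [("local network", 60), ("public internet", 20)]:
    print(f"{label}: {frame_budget_ms(fps):.1f} ms per frame, "
          f"{frames_per_day(fps):,} frames per day")
```

Because the environments only ran in real time, these daily totals are a ceiling per environment instance; collecting more experience meant running more instances in parallel rather than speeding one up.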
Universe allows a user to train and evaluate AI agents, with the AI using a computer like a human would. This provides a wide range of real-time and complex environments. [1] The platform lets any program become an OpenAI Gym environment without needing special access to the program's internals, source code, or APIs. According to OpenAI's GitHub, it "does this by packaging the program into a Docker container, and presenting the AI with the same interface a human uses: sending keyboard and mouse events, and receiving screen pixels." [7] [8]
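As a rough illustration of what "keyboard and mouse events" look like to an agent, the project's README expressed actions as lists of VNC-style event tuples such as `('KeyEvent', 'ArrowUp', True)`. The sketch below builds such lists in plain Python without importing Universe. The helper names (`key_event`, `pointer_event`, `click`, `tap`) are hypothetical, and the pointer tuple layout `(x, y, buttonmask)` is an assumption modeled on the VNC pointer message rather than a verified copy of the library's API.

```python
# Plain-Python sketch of VNC-style input events; nothing here imports Universe.
LEFT_BUTTON = 1  # bit 0 of the VNC button mask is the left mouse button


def key_event(key, down):
    """A key press (down=True) or key release (down=False)."""
    return ("KeyEvent", key, bool(down))


def pointer_event(x, y, buttonmask=0):
    """Place the pointer at (x, y) with the given buttons held down."""
    return ("PointerEvent", int(x), int(y), int(buttonmask))


def click(x, y):
    """A left click is three events: move, press, release."""
    return [
        pointer_event(x, y, 0),
        pointer_event(x, y, LEFT_BUTTON),
        pointer_event(x, y, 0),
    ]


def tap(key):
    """A key tap is a press followed by a release."""
    return [key_event(key, True), key_event(key, False)]


# One agent action: click a button on screen, then tap the up arrow.
action = click(120, 85) + tap("ArrowUp")
for event in action:
    print(event)
```

Expressing actions as raw input events is what makes the interface application-agnostic: the same five events work whether the container is running a game, a browser, or a CAD tool.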
The AI agent explores environments visually, observing pixels on a screen and inputting keyboard and mouse commands. [1] [2] This interface is implemented using the VNC program for remote desktop access. [2] Internally a Universe session has two halves: a Python client (a VNCEnv instance) running inside the agent's process, and a remote (a Docker container running the actual environment dynamics). The two communicate over VNC for pixels and keyboard/mouse events, and over a separate WebSocket channel for rewards, episode boundaries, and diagnostics. [9]
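The split between a pixel channel and a reward channel can be illustrated with a toy model. The sketch below is not Universe code and opens no network connections: `ToyRemote` and `ToyClient` are hypothetical stand-ins, and two in-memory queues play the roles of the VNC and WebSocket channels. It assumes the client keeps only the newest frame but must sum every reward message that arrived since its previous step.

```python
from collections import deque


class ToyRemote:
    """Hypothetical stand-in for the containerized environment."""

    def __init__(self):
        self.tick = 0
        self.pixel_channel = deque(maxlen=1)  # only the newest frame matters
        self.reward_channel = deque()         # every message must be kept

    def advance(self, events):
        """Run one tick of environment dynamics for the given input events."""
        self.tick += 1
        self.pixel_channel.append(f"frame-{self.tick}")
        if ("KeyEvent", "ArrowUp", True) in events:
            self.reward_channel.append({"reward": 1.0, "done": False})
        if self.tick == 5:  # toy episode boundary
            self.reward_channel.append({"reward": 0.0, "done": True})


class ToyClient:
    """Hypothetical stand-in for the client inside the agent's process."""

    def __init__(self, remote):
        self.remote = remote

    def step(self, events, remote_ticks=1):
        # The remote keeps running while the agent thinks, so several
        # ticks may elapse between two consecutive agent steps.
        for _ in range(remote_ticks):
            self.remote.advance(events)
        observation = self.remote.pixel_channel[-1]
        reward, done = 0.0, False
        while self.remote.reward_channel:
            message = self.remote.reward_channel.popleft()
            reward += message["reward"]
            done = done or message["done"]
        return observation, reward, done


client = ToyClient(ToyRemote())
up = [("KeyEvent", "ArrowUp", True)]
first = client.step(up)                   # one remote tick elapsed
second = client.step(up, remote_ticks=2)  # a slow agent: two ticks elapsed
third = client.step([], remote_ticks=2)   # reaches the toy episode boundary
print(first, second, third)
```

The second step shows why the channels are treated differently: the intermediate frame is simply dropped, while both reward messages are accumulated so that no reward is lost when the agent falls behind.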
Games provide the feedback loop agents need to improve steadily: an agent gathers experience on small tasks and then solves new ones faster. [2] Ideally, the agent would move beyond knowledge specialized to a single environment toward more general intelligence. [7] [8] Reward functions are integral to RL: in many games, there is an on-screen score that can be used as a reward, and for some flash games, OpenAI shipped reward extractors that scraped pixel regions for that score. [8]
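The idea of scraping a score from a pixel region can be sketched in a few lines. The example below is a toy under stated assumptions, not OpenAI's extractor: the frame is a small binary grid, the score is drawn with made-up 3x5 digit glyphs at a known position, and `draw_score`, `read_score`, and `ScoreReward` are hypothetical names. The reward for each frame is the change in the decoded score, and an unreadable region yields no reward.

```python
GLYPH_W, GLYPH_H = 3, 5
GLYPHS = {  # made-up 3x5 bitmaps, one string per pixel row
    "0": ["111", "101", "101", "101", "111"],
    "1": ["010", "110", "010", "010", "111"],
    "2": ["111", "001", "111", "100", "111"],
    "3": ["111", "001", "111", "001", "111"],
    "4": ["101", "101", "111", "001", "001"],
    "5": ["111", "100", "111", "001", "111"],
    "6": ["111", "100", "111", "101", "111"],
    "7": ["111", "001", "001", "001", "001"],
    "8": ["111", "101", "111", "101", "111"],
    "9": ["111", "101", "111", "001", "111"],
}
LOOKUP = {tuple(rows): digit for digit, rows in GLYPHS.items()}


def draw_score(frame, score, left, top, digits=4):
    """Demo helper: paint a zero-padded score into the frame."""
    for i, ch in enumerate(str(score).zfill(digits)):
        for r, row in enumerate(GLYPHS[ch]):
            for c, bit in enumerate(row):
                frame[top + r][left + i * (GLYPH_W + 1) + c] = int(bit)


def read_score(frame, left, top, digits=4):
    """Crop the score region and decode it; return None if unreadable."""
    text = ""
    for i in range(digits):
        x0 = left + i * (GLYPH_W + 1)
        cell = tuple(
            "".join(str(frame[top + r][x0 + c]) for c in range(GLYPH_W))
            for r in range(GLYPH_H)
        )
        if cell not in LOOKUP:
            return None
        text += LOOKUP[cell]
    return int(text)


class ScoreReward:
    """Turns successive on-screen scores into per-frame rewards."""

    def __init__(self, left, top):
        self.left, self.top = left, top
        self.last = None

    def __call__(self, frame):
        score = read_score(frame, self.left, self.top)
        if score is None:  # unreadable frame: emit no reward, keep last score
            return 0.0
        delta = 0.0 if self.last is None else float(score - self.last)
        self.last = score
        return delta


frame = [[0] * 40 for _ in range(12)]
extractor = ScoreReward(left=2, top=3)
draw_score(frame, 120, 2, 3)
r1 = extractor(frame)          # first reading only sets the baseline
draw_score(frame, 150, 2, 3)
r2 = extractor(frame)          # score rose by 30
frame[3][2] = 1 - frame[3][2]  # corrupt one pixel of the leading digit
r3 = extractor(frame)          # region no longer matches any glyph
print(r1, r2, r3)
```

The last step hints at why screen-based rewards were fragile: a single unexpected pixel, such as an overlay or a compression artifact, is enough to make the score unreadable for that frame.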
Besides the game environments, Universe includes browser-based navigation where the agent can interact with the web like people do, learning how to use elements like buttons, lists, and sliders. [1] OpenAI developed a benchmark called Mini World of Bits to understand the challenges of browser interactions in a simplified setting. It consists of 80 environments that range from simple tasks like clicking a button to difficult ones like replying to a contact in a simulated email client; OpenAI believed "that mastering these environments provides valuable signal towards models and training techniques that will perform well on full websites and more complex tasks." [1] [5]
The Universe Python client was supported on Linux and macOS, with Python 2.7 and Python 3.5 as the supported interpreters. The build chain depended on Go 1.5 or newer, NumPy, libjpeg-turbo, and a working Docker installation. There was no official Windows support; Windows users were directed to run the client through a Linux virtual machine. [9]
During the implementation of this platform, OpenAI emphasized four design properties:
A notable feature of the launch was the list of game and software publishers that granted OpenAI permission for Universe agents to play their commercial titles. The headline partners were EA (Electronic Arts), Microsoft Studios, Valve, and Wolfram Research, with smaller indie studios contributing additional titles. [1] [13] [14] The table below lists representative titles named in the launch announcement and contemporary press coverage.
| Software | Publisher / origin | Type |
|---|---|---|
| Portal | Valve | First-person puzzle |
| Wing Commander III | EA | Space combat |
| Command & Conquer: Red Alert 2 | EA | Real-time strategy |
| Sid Meier's Alpha Centauri | EA | Turn-based strategy |
| Magic Carpet | EA (Bullfrog) | First-person shooter |
| Mirror's Edge | EA | First-person platformer |
| Syndicate (1993) | EA (Bullfrog) | Real-time tactics |
| Fable Anniversary | Microsoft Studios | Action role-playing |
| World of Goo | 2D Boy | Physics puzzle |
| RimWorld | Ludeon Studios | Colony simulation |
| Slime Rancher | Monomi Park | Life simulation |
| Shovel Knight | Yacht Club Games | 2D platformer |
| SpaceChem | Zachtronics | Puzzle |
| Wolfram Mathematica | Wolfram Research | Computer algebra |
| Slither.io | Steve Howse | Browser game |
| StarCraft | Blizzard Entertainment | Real-time strategy |
| Grand Theft Auto V | Rockstar Games | Open world (community integration) |
| Foldit | University of Washington | Protein folding game |
Grand Theft Auto V appeared throughout press coverage, although the actual integration was developed in parallel by Craig Quiter with NVIDIA and was not part of the initial release. [1] [9] [13] OpenAI also discussed plans to connect Universe with Microsoft Research's Project Malmo, a Minecraft-based AI sandbox, although that crossover did not become a maintained integration. [13]
Coverage of the launch was generally enthusiastic, treating Universe as a step beyond Atari benchmarks toward a more realistic test bed for general intelligence. The Register described it as a "universal training ground for computers," PCWorld and ITPro framed it as a way to teach AI to use software the way humans do, and SD Times noted that the platform was a clear continuation of Gym in a more open-ended direction. [2] [13] [14] [15] Michael Bowling of the University of Alberta, who had worked on the Arcade Learning Environment, told Futurism that the breadth of Universe was useful as long as researchers remembered that games are a means rather than an end. [16]
The enthusiasm did not last. By mid-2017, GitHub issues on the Universe repository were noting that pull requests were sitting unreviewed, that several integrations were broken on recent Docker releases, and that the rate of new content had slowed sharply. [9] In April 2018 OpenAI shipped Gym Retro and ran the Retro Contest, a transfer learning competition centered on Sonic the Hedgehog. The Retro launch posts argued that VNC-based remote desktops were not a good substrate for reinforcement learning: the agent could not run faster than wall-clock time, the screen-scraped state was noisy, and the integration overhead of every new title was high. [11]
Even though the Universe codebase itself was abandoned, the ideas it tested have had a long afterlife. The Mini World of Bits subset became the seed for the World of Bits paper presented at ICML 2017 by Tianlin Shi, Andrej Karpathy, Linxi Fan, Jonathan Hernandez, and Percy Liang. The paper introduced an open-domain platform for web-based agents, with crowdworkers writing natural language tasks and demonstrations on real websites, and HTTP traffic cached so the tasks could be replayed offline. [12] That benchmark was later cleaned up and extended by Stanford researchers as MiniWoB++, with more than 100 web interaction tasks, and by 2022 had become a standard reference for browser-based LLM agents. [12] [17]
More recent web-agent benchmarks acknowledge Universe and MiniWoB as predecessors. WebArena, released in 2023, builds a self-hosted set of realistic websites for agents to navigate, citing Mini World of Bits as a foundational simplified benchmark. The same lineage runs through Mind2Web, VisualWebArena, and various "computer use" agents built on large language models, all of which can be read as successors to the basic Universe idea: give an agent pixels and a keyboard, then ask it to operate real software.
Universe also influenced OpenAI's own internal direction. The lessons about asynchronous execution and pixel-based state estimation pushed the company toward emulator-backed environments in Gym Retro and simulators with controlled tick rates in projects like the OpenAI Five Dota 2 effort. The goal of an agent that can flexibly apply prior experience to unfamiliar software has since become more reachable through large multimodal models rather than the from-scratch reinforcement learning Universe was designed for, and recent agent products for browsing and computer use trace a clear genealogy back to this 2016 platform.