Universe

Updated Jun 23, 2026

Universe was an open-source software platform that OpenAI released on December 5, 2016 for measuring and training an artificial intelligence agent's general intelligence across, in OpenAI's words, "the world's supply of games, websites and other applications." ^[1] ^[7] Its initial release shipped with over 1,000 environments in which an agent could take actions and gather observations, and it let any program become an OpenAI Gym environment without access to that program's source code or internal APIs. ^[1] ^[7] OpenAI archived the project on April 6, 2018 and pointed users to Gym Retro instead, but Universe is widely regarded as an early ancestor of today's browser-using and "computer use" AI agents. ^[9] ^[11]

Universe is a middleware layer built on OpenAI's Gym, a toolkit for developing and evaluating reinforcement learning (RL) algorithms. ^[2] ^[3] Agents train on games and websites, but in principle any task a person can perform on a computer is a candidate: researchers plug an application into Universe, and AI agents get a common way of interacting with it. ^[2] ^[4]

The software environments are instantiated in Docker containers, with AI agents interacting through a virtual keyboard and mouse using a Virtual Network Computing (VNC) remote desktop. The more interaction the agents have with the environment, the better they become at a specific task. ^[3]

What is Universe used for?

Universe lets a user train and evaluate AI agents that use a computer the way a human does, which opens up a wide range of complex, real-time environments. ^[1] The platform turns any program into an OpenAI Gym environment without needing special access to the program's internals, source code, or APIs. According to OpenAI's GitHub README, Universe "does this by packaging the program into a Docker container, and presenting the AI with the same interface a human uses: sending keyboard and mouse events, and receiving screen pixels." ^[7] ^[8]

The AI agent explores environments visually, observing pixels on a screen and inputting keyboard and mouse commands. ^[1] ^[2] This interface is implemented using the VNC program for remote desktop access. ^[2] Internally a Universe session has two halves: a Python client (a VNCEnv instance) running inside the agent's process, and a remote (a Docker container running the actual environment dynamics). The two communicate over VNC for pixels and keyboard/mouse events, and over a separate WebSocket channel for rewards, episode boundaries, and diagnostics. ^[7] ^[9]
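The two-channel split can be illustrated with a toy stand-in that needs neither Docker nor VNC. The sketch below is not Universe code: `ToyRemote`, `ToyClient`, and `run_episode` are hypothetical names, and the only detail borrowed from the project is the shape of a key event, `("KeyEvent", "ArrowUp", True)`, as it appears in the README's example agent. It shows how a vectorized client might fan a list of events out to each remote and collect pixels, rewards, and episode boundaries in return.

```python
from typing import List, Tuple

# An event is a tuple shaped like the README's example action,
# e.g. ("KeyEvent", "ArrowUp", True). Everything else here is hypothetical.
Event = Tuple


class ToyRemote:
    """Stand-in for a remote: owns the dynamics, emits pixels and rewards."""

    def __init__(self, width: int = 4, height: int = 3, episode_length: int = 5):
        self.width, self.height = width, height
        self.episode_length = episode_length
        self.steps = 0

    def _frame(self) -> List[List[int]]:
        # A tiny "screen" whose every pixel holds the current step counter.
        return [[self.steps] * self.width for _ in range(self.height)]

    def reset(self) -> List[List[int]]:
        self.steps = 0
        return self._frame()

    def apply(self, events: List[Event]):
        # Pixel channel: the next frame.
        # Reward channel: the reward and the episode boundary.
        self.steps += 1
        pressed_up = any(
            e[0] == "KeyEvent" and e[1] == "ArrowUp" and e[2] for e in events
        )
        reward = 1.0 if pressed_up else 0.0
        done = self.steps >= self.episode_length
        return self._frame(), reward, done


class ToyClient:
    """Stand-in for the client half: one list entry per remote."""

    def __init__(self, remotes: int = 1):
        self.remotes = [ToyRemote() for _ in range(remotes)]

    def reset(self):
        return [remote.reset() for remote in self.remotes]

    def step(self, action_n):
        results = [r.apply(events) for r, events in zip(self.remotes, action_n)]
        observation_n = [obs for obs, _, _ in results]
        reward_n = [rew for _, rew, _ in results]
        done_n = [done for _, _, done in results]
        return observation_n, reward_n, done_n, {"n": [{} for _ in results]}


def run_episode(client: ToyClient) -> float:
    """Hold ArrowUp on every remote until all episodes end; return total reward."""
    observation_n = client.reset()
    total = 0.0
    while True:
        action_n = [[("KeyEvent", "ArrowUp", True)] for _ in observation_n]
        observation_n, reward_n, done_n, _info = client.step(action_n)
        total += sum(reward_n)
        if all(done_n):
            return total
```

With two remotes and the default five-step episode, `run_episode(ToyClient(remotes=2))` returns `10.0`: one unit of reward per remote per step.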

Games provide the feedback loop that RL needs: an agent accumulates experience on small tasks and, ideally, learns to solve new ones faster. ^[2] The longer-term aim was for the agent to move beyond specialized knowledge of a single environment toward more general intelligence. ^[7] ^[8] Reward functions are integral to RL: in many games, there is an on-screen score that can be used as a reward, and OpenAI shipped a convolutional-neural-network OCR model, running inside the Docker container, that read those scores from the pixel buffer and relayed them to the agent as rewards. ^[1] ^[8]
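One plausible way to turn a running on-screen score into a per-step reward is to take the difference between consecutive readings. The sketch below assumes that scheme; it is not taken from the Universe source. `rewards_from_scores` is a hypothetical helper, and `None` stands for a frame on which the score reader produced nothing.

```python
from typing import Iterable, List, Optional


def rewards_from_scores(scores: Iterable[Optional[int]]) -> List[float]:
    """Convert per-frame score readings into per-frame rewards.

    Assumption: reward is the change in score since the last successful
    reading. Frames with no reading (None) yield zero reward and leave the
    last known score unchanged.
    """
    rewards: List[float] = []
    last: Optional[int] = None
    for score in scores:
        if score is None or last is None:
            rewards.append(0.0)
        else:
            rewards.append(float(score - last))
        if score is not None:
            last = score
    return rewards
```

For the readings `[0, 10, None, 25, 25]` this returns `[0.0, 10.0, 0.0, 15.0, 0.0]`. The missed frame contributes nothing, and the next successful reading carries the whole jump from 10 to 25.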

Besides the game environments, Universe includes browser-based navigation where the agent can interact with the web like people do, learning how to use elements like buttons, lists, and sliders. ^[1] OpenAI developed a benchmark called Mini World of Bits to understand the challenges of browser interactions in a simplified setting. It consists of 80 environments that range from simple tasks like clicking a button to difficult ones like replying to a contact in a simulated email client; OpenAI believed "that mastering these environments provides valuable signal towards models and training techniques that will perform well on full websites and more complex tasks." ^[1] ^[5]
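The simplest of these tasks, clicking a button, reduces to a short burst of pointer events. The sketch below builds that burst as plain tuples of the form `("PointerEvent", x, y, button_mask)`, following the VNC convention in which bit 0 of the button mask is the left mouse button. The `click` helper is hypothetical, and the exact tuple layout Universe accepted should be treated as an assumption.

```python
from typing import List, Tuple

PointerEvent = Tuple[str, int, int, int]


def click(x: int, y: int, button_mask: int = 1) -> List[PointerEvent]:
    """Return move, press, and release events for one click at (x, y)."""
    return [
        ("PointerEvent", x, y, 0),            # move the pointer, no buttons held
        ("PointerEvent", x, y, button_mask),  # press (mask 1 = left button)
        ("PointerEvent", x, y, 0),            # release
    ]
```

An agent working on a button-clicking task would emit something like `click(80, 120)` as its action for a single step, with the coordinates chosen from what it sees in the pixel observation.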

What were the system requirements?

The Universe Python client was supported on Linux and macOS, with Python 2.7 and Python 3.5 as the supported interpreters. The build chain depended on Go 1.5 or newer, NumPy, libjpeg-turbo, and a working Docker installation. There was no official Windows support; Windows users were directed to run the client through a Linux virtual machine. ^[7] ^[9]
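The requirements above can be expressed as a small preflight check. This is an illustrative sketch, not part of the Universe codebase: `check_requirements` is a hypothetical helper, it only tests whether `docker` and `go` are on the `PATH`, and it does not verify the Go version, NumPy, or libjpeg-turbo. Because it uses `shutil.which`, it needs a Python 3 interpreter, so on a modern machine it will always flag the interpreter version as unsupported.

```python
import shutil
import sys
from typing import Callable, List, Optional, Tuple

SUPPORTED_PLATFORMS = ("linux", "darwin")   # Linux and macOS
SUPPORTED_PYTHONS = ((2, 7), (3, 5))        # interpreters named in the article


def check_requirements(
    platform: str = sys.platform,
    python: Tuple[int, int] = sys.version_info[:2],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems: List[str] = []
    if not platform.startswith(SUPPORTED_PLATFORMS):
        problems.append("unsupported platform: %s" % platform)
    if tuple(python) not in SUPPORTED_PYTHONS:
        problems.append("unsupported Python: %d.%d" % tuple(python))
    for tool in ("docker", "go"):
        if which(tool) is None:
            problems.append("%s not found on PATH" % tool)
    return problems
```

Passing `platform`, `python`, and `which` as parameters keeps the check testable without touching the real machine, for example `check_requirements("win32", (3, 11), lambda tool: None)` reports four problems.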

How many environments did Universe ship with?

Universe has been compared to ImageNet, a hand-labeled image database used to test image recognition systems. In the analogy, Universe replaces images with flash games, web browsers, photo editors, and CAD software. ^[5] On release the platform shipped with about 2,500 Atari games, 1,000 flash games, and 80 browser environments, described by OpenAI as the largest single library of reinforcement learning environments at the time. ^[1] ^[5] Of the 1,000 flash games, only about 100 came with built-in reward functions at launch. ^[1]

For scale, the table below compares Universe's launch library with the dominant RL benchmark that preceded it, the Arcade Learning Environment.

Environment family	Approximate count at launch	Notes
Atari games	~2,500	Drawn from the Atari lineage that the Arcade Learning Environment popularized ^[1] ^[5]
Flash games	1,000 (about 100 with rewards)	Distributed inside a Docker image; chosen as a starting point for scaling ^[1]
Browser environments	80	The Mini World of Bits benchmark of web tasks ^[1] ^[5]
Arcade Learning Environment (prior art)	55	The largest comparable RL resource before Universe ^[2] ^[3]

According to OpenAI, flash games were a starting point for scaling because they are pervasive on the internet and usually have better graphics than Atari titles while remaining simple enough for early agents. ^[1] Because environments ran asynchronously and in real time, frame rate depended on where the remote was hosted: OpenAI reported that environments on a local network or in the cloud usually ran at 60 frames per second, while over the public internet this dropped to about 20 frames per second. ^[1]
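Since these environments could not run faster than real time, frame rate translates directly into wall-clock training time. The arithmetic below makes that concrete. The 10-million-frame budget is a hypothetical figure chosen for illustration, not a number from OpenAI.

```python
def wall_clock_hours(frames: int, frames_per_second: float) -> float:
    """Hours needed to experience `frames` frames in a real-time environment."""
    if frames_per_second <= 0:
        raise ValueError("frames_per_second must be positive")
    return frames / frames_per_second / 3600.0


FRAME_BUDGET = 10_000_000  # hypothetical training budget for one environment

local_hours = wall_clock_hours(FRAME_BUDGET, 60)   # about 46.3 hours
public_hours = wall_clock_hours(FRAME_BUDGET, 20)  # about 138.9 hours
```

At 60 frames per second the budget takes roughly two days of wall-clock time per environment instance; at 20 frames per second it takes three times as long.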

What was the goal of Universe?

According to OpenAI, the goal of the project was to "develop a single AI agent that can flexibly apply its past experience on Universe environments to quickly master unfamiliar, difficult environments, which would be a major step towards general intelligence." Ilya Sutskever, an OpenAI researcher, said "an AI should be able to solve any problem you throw at it." ^[1] ^[4]

By expanding the pool of training environments, OpenAI expected agent training to accelerate. Before Universe, the largest reinforcement learning resource of comparable design was the Arcade Learning Environment, which included 55 Atari games. ^[2] ^[3] Universe set out to push that number into the thousands by absorbing whole categories of software previously considered too messy for RL: web browsers, photo editors, CAD tools, and games with no programmatic interface.

On release, Universe shipped with the largest library of games and resources ever assembled for RL, including 1,000 flash games distributed in a Docker image, games like Slither.io and StarCraft, browser-based tasks, and applications like form filling and Foldit. ^[1] ^[7] As a worked example, OpenAI trained an agent through RL on Slither.io, where the player avoids collision with other snakes. After about six days of training, the agent scored "an average of 1,000 points, with a high score of 9,300 points. As a point of comparison, OpenAI machine-learning researcher Rafal Jozefowicz, with five hours of playing experience, averaged about 1,400 points, with a high score of 7,050." ^[3]

When was Universe released?

Universe was unveiled on December 5, 2016, the same week as the Conference on Neural Information Processing Systems (NeurIPS, then known as NIPS) in Barcelona. ^[1] ^[2] ^[10] OpenAI had launched Gym in April of the same year as a more limited RL toolkit, and Universe was positioned as the next step: a way to wrap arbitrary off-the-shelf software into Gym-compatible environments without requiring source code or internal APIs. ^[1] ^[10] The rollout was tied to the NIPS schedule, with the project discussed by co-founders Greg Brockman and Ilya Sutskever alongside the company's other reinforcement learning work. ^[10]

Progress slowed during 2017. In April 2018, OpenAI released a follow-up project called Gym Retro, which integrated emulated Sega Genesis and other classic console games into the Gym interface using direct memory access rather than VNC. The same month, the Universe repository was archived on GitHub, with the README updated to recommend Retro for new work. ^[9] ^[11] In retrospective notes about Retro, OpenAI acknowledged that it could not get good results from Universe because its environments "ran asynchronously, could only run in real time, and were often unreliable due to screen-based detection of game state." ^[11]

The table below sketches the project's main milestones.

Date	Event
April 2016	OpenAI releases Gym, the toolkit Universe later builds on. ^[10]
December 5, 2016	Universe is announced on the OpenAI blog with over 1,000 environments at launch. ^[1] ^[2]
December 2016	Universe is presented during NIPS 2016 in Barcelona. ^[10]
2017	OpenAI and Stanford researchers publish "World of Bits" at ICML 2017, expanding the browser benchmark seeded with Universe. ^[12]
April 6, 2018	The openai/universe GitHub repository is archived; users are redirected to Gym Retro. ^[9] ^[11]

What design properties did Universe emphasize?

During the implementation of this platform, OpenAI emphasized four design properties:

Generalization, in which an AI agent uses the human interface to interact with programs, allowing it to browse the web, play games, interact with a terminal, edit spreadsheets, or operate a photo editing program;
Familiarity, since agents interact with environments the way humans already do;
VNC as standard, since many VNC implementations are available online, letting humans give demonstrations without installing new software;
Ease of debugging, since VNC traffic can be saved for analysis and an agent can be observed during training or evaluation. ^[1]

Which companies let Universe use their software?

A notable feature of the launch was the list of game and software publishers that granted OpenAI permission for Universe agents to play their commercial titles. The headline partners were EA (Electronic Arts), Microsoft Studios, Valve, and Wolfram Research, with smaller indie studios contributing additional titles. ^[1] ^[13] ^[14] The table below lists representative titles named in the launch announcement and contemporary press coverage.

Software	Publisher / origin	Type
Portal	Valve	First-person puzzle
Wing Commander III	EA	Space combat
Command & Conquer: Red Alert 2	EA	Real-time strategy
Sid Meier's Alpha Centauri	EA	Turn-based strategy
Magic Carpet	EA (Bullfrog)	First-person shooter
Mirror's Edge	EA	First-person platformer
Syndicate (1993)	EA (Bullfrog)	Real-time tactics
Fable Anniversary	Microsoft Studios	Action role-playing
World of Goo	2D Boy	Physics puzzle
RimWorld	Ludeon Studios	Colony simulation
Slime Rancher	Monomi Park	Life simulation
Shovel Knight	Yacht Club Games	2D platformer
SpaceChem	Zachtronics	Puzzle
Wolfram Mathematica	Wolfram Research	Computer algebra
Slither.io	Steve Howse	Browser game
StarCraft	Blizzard Entertainment	Real-time strategy
Grand Theft Auto V	Rockstar Games	Open world (community integration)
Foldit	University of Washington	Protein folding game

Grand Theft Auto V appeared throughout press coverage, although the actual integration was developed in parallel by Craig Quiter with NVIDIA and was not part of the initial release. ^[1] ^[9] ^[13] OpenAI also discussed plans to connect Universe with Microsoft Research's Project Malmo, a Minecraft-based AI sandbox, although that crossover did not become a maintained integration. ^[13]

How was Universe received?

Coverage of the launch was generally enthusiastic, treating Universe as a step beyond Atari benchmarks toward a more realistic test bed for general intelligence. The Register described it as a "universal training ground for computers," PCWorld and ITPro framed it as a way to teach AI to use software the way humans do, and SD Times noted that the platform was a clear continuation of Gym in a more open-ended direction. ^[2] ^[3] ^[13] ^[14] ^[15] Michael Bowling of the University of Alberta, who had worked on the Arcade Learning Environment, told Futurism that the breadth of Universe was useful as long as researchers remembered that games are a means rather than an end. ^[16]

The enthusiasm did not last. By mid-2017, GitHub issues on the Universe repository were noting that pull requests were sitting unreviewed, that several integrations were broken on recent Docker releases, and that the rate of new content had slowed sharply. ^[9] In April 2018 OpenAI shipped Gym Retro and ran the Retro Contest, a transfer learning competition centered on Sonic the Hedgehog. The Retro launch posts argued that VNC-based remote desktops were not a good substrate for reinforcement learning: the agent could not run faster than wall-clock time, the screen-scraped state was noisy, and the integration overhead of every new title was high. ^[11]

Why was Universe abandoned?

The project has since been abandoned by OpenAI in favor of Gym Retro. The public GitHub repository was archived on April 6, 2018, with a deprecation notice stating that the "repository has been deprecated in favor of the Retro library," and an issue titled "This project is ABANDONED" confirmed that maintenance had stopped. ^[6] ^[9] Several upcoming developments described in the Universe launch blog post were never released, including environment integration tools so any user could contribute new integrations, and the public release of human demonstration data. ^[6] ^[9]

The core technical reasons were architectural. Because Universe drove real software over VNC rather than emulating it, every environment ran in real time and could not be sped up, the agent's view of game state came from noisy screen pixels rather than memory, and the overhead of integrating each new title was high. ^[11] Gym Retro addressed these by reading game state directly from emulator memory, which let environments run faster than real time and report exact rewards. ^[11]

How did Universe influence later work?

Even though the Universe codebase itself was abandoned, the ideas it tested have had a long afterlife. The Mini World of Bits subset became the seed for the World of Bits paper presented at ICML 2017 (pages 3135-3144) by Tianlin Shi, Andrej Karpathy, Linxi Fan, Jonathan Hernandez, and Percy Liang. The paper introduced an open-domain platform for web-based agents, with crowdworkers writing natural language tasks and demonstrations on real websites, and HTTP traffic cached so the tasks could be replayed offline. ^[12] Announcing the work, Karpathy described it as a "Mini World of Bits project (agents learn to use the web) at OpenAI and how to use it with Universe." ^[18] That benchmark was later cleaned up and extended by Stanford researchers as MiniWoB++, with more than 100 web interaction tasks, and by 2022 had become a standard reference for browser-based LLM agents. ^[12] ^[17]

More recent web-agent benchmarks acknowledge Universe and MiniWoB as predecessors. WebArena, released in 2023, builds a self-hosted set of realistic websites for agents to navigate, citing Mini World of Bits as a foundational simplified benchmark. The same lineage runs through Mind2Web, VisualWebArena, and various "computer use" agents built on large language models, all of which can be read as successors to the basic Universe idea: give an agent pixels and a keyboard, then ask it to operate real software.

Universe also influenced OpenAI's own internal direction. The lessons about asynchronous execution and pixel-based state estimation pushed the company toward emulator-backed environments in Gym Retro and simulators with controlled tick rates in projects like the OpenAI Five Dota 2 effort. The goal of an agent that can flexibly apply prior experience to unfamiliar software has since become more reachable through large multimodal models rather than the from-scratch reinforcement learning Universe was designed for, and recent agent products for browsing and computer use trace a clear genealogy back to this 2016 platform.

References

1. OpenAI, "Universe," OpenAI blog, December 5, 2016. https://openai.com/index/universe/
2. SD Times, "OpenAI opens new AI universe," December 6, 2016. https://sdtimes.com/ai/openai-opens-new-ai-universe/
3. The Register, "Elon Musk-backed OpenAI reveals Universe, a universal training ground for computers," December 5, 2016. https://www.theregister.com/2016/12/05/openai_universe_reinforcement_learning/
4. Futurism, "A Whole New Universe: OpenAI Just Opened a School for AI," December 5, 2016. https://futurism.com/a-whole-new-universe-openai-just-opened-a-school-for-ai
5. OpenAI, "Mini World of Bits benchmark," referenced in the Universe launch post, December 2016. https://openai.com/index/universe/
6. OpenAI Universe issue tracker, "This project is ABANDONED #235" and "is the project abandoned? #218," GitHub. https://github.com/openai/universe/issues/235
7. OpenAI Universe README, GitHub repository (archived). https://github.com/openai/universe
8. OpenAI Gym documentation. https://gym.openai.com/
9. openai/universe GitHub repository, archived April 6, 2018. https://github.com/openai/universe
10. Wikipedia, "OpenAI," history section. https://en.wikipedia.org/wiki/OpenAI
11. OpenAI, "Gym Retro," OpenAI blog, April 5, 2018. https://openai.com/index/gym-retro/
12. Tianlin Shi, Andrej Karpathy, Linxi Fan, Jonathan Hernandez, and Percy Liang, "World of Bits: An Open-Domain Platform for Web-Based Agents," ICML 2017. https://proceedings.mlr.press/v70/shi17a/shi17a.pdf
13. PCWorld, "OpenAI releases Universe, a platform for training AIs to play games, use apps," December 5, 2016. https://www.pcworld.com/article/411207/openai-releases-universe-a-platform-for-training-ais-to-play-games-use-apps.html
14. Teslarati, "Musk's OpenAI will train artificial intelligence through video game 'Universe'," December 2016. https://www.teslarati.com/openai-debuts-universe-training-environment/
15. ITPro, "OpenAI teaches artificial intelligence with Universe software," December 2016. https://www.itpro.com/strategy/27718/openai-teaches-artificial-intelligence-with-universe-software
16. Futurism interview with Michael Bowling, December 2016. https://futurism.com/a-whole-new-universe-openai-just-opened-a-school-for-ai
17. Farama Foundation, MiniWoB++ repository documentation. https://github.com/Farama-Foundation/miniwob-plusplus
18. Andrej Karpathy, post on X (formerly Twitter), December 2016. https://x.com/karpathy/status/809889202120884224
