Spinning Up
Last reviewed
Jun 3, 2026
Sources
9 citations
Review status
Source-backed
Revision
v1 · 1,120 words
Spinning Up in Deep RL is a free, open-source educational resource produced by OpenAI to make deep reinforcement learning (deep RL) easier to learn. Released on November 8, 2018, and written primarily by Joshua Achiam, it combines a written introduction to RL theory, advice for aspiring researchers, a curated reading list, well-documented standalone implementations of several core algorithms, and a set of exercises. The material is hosted at spinningup.openai.com, and the accompanying code lives in the GitHub repository openai/spinningup under an MIT license.[1][2][3]
OpenAI created Spinning Up to address a perceived gap: although introductory resources for deep learning were plentiful, deep reinforcement learning remained comparatively difficult to enter. The project's stated goal is to help people learn to use deep RL techniques and to develop intuitions about how and why they work.[1][3]
The resource grew directly out of OpenAI's talent-development efforts. The organization observed through its OpenAI Scholars and OpenAI Fellows programs that people with little or no prior machine learning experience could become competent practitioners relatively quickly given the right guidance, and Spinning Up packaged that guidance for a wider audience. The material was subsequently incorporated into the curriculum for the 2019 cohorts of Scholars and Fellows.[1]
The lead author is Joshua (Josh) Achiam, who at the time of release was a research scientist on OpenAI's safety team and a PhD student at the University of California, Berkeley, advised by Pieter Abbeel. His research focused on safety in deep reinforcement learning, including work on safe exploration.[2][4]
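Spinning Up is organized into written documentation paired with runnable code. Its principal components are a written introduction to RL theory, advice for aspiring researchers, a curated reading list, well-documented standalone implementations of core algorithms, and a set of exercises.[3][5]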
The repository also ships utilities for logging, plotting results, and running experiments, including support for parallelized execution.[5]
Spinning Up provides implementations of six core deep RL algorithms, grouped into on-policy and off-policy families. At launch all were written using TensorFlow v1. On January 30, 2020, release 0.2 added PyTorch implementations for every algorithm except Trust Region Policy Optimization, which remains available only in the TensorFlow version. Both code bases are maintained side by side.[5][8]
| Algorithm | Abbreviation | Family | Frameworks |
|---|---|---|---|
| Vanilla Policy Gradient | VPG | On-policy | TensorFlow, PyTorch |
| Trust Region Policy Optimization | TRPO | On-policy | TensorFlow |
| Proximal Policy Optimization | PPO | On-policy | TensorFlow, PyTorch |
| Deep Deterministic Policy Gradient | DDPG | Off-policy | TensorFlow, PyTorch |
| Twin Delayed DDPG | TD3 | Off-policy | TensorFlow, PyTorch |
| Soft Actor-Critic | SAC | Off-policy | TensorFlow, PyTorch |
The documentation frames VPG, TRPO, and PPO as a progression of on-policy methods that successively improve stability and sample efficiency. DDPG, TD3, and SAC are off-policy methods that reuse past data; TD3 and SAC are presented as descendants of DDPG that add techniques to address its stability problems.[5]
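The practical difference between the two families shows up in how collected experience is stored. The following is a minimal sketch, not Spinning Up's own code: the class names `OnPolicyBuffer` and `ReplayBuffer`, the transition layout, and the toy values are all assumptions made for illustration. It shows an on-policy buffer that hands over its data once and then empties itself, next to a replay buffer that keeps past transitions available for repeated sampling.

```python
import random
from collections import deque


class OnPolicyBuffer:
    """Hypothetical on-policy storage: data is used for one update, then dropped."""

    def __init__(self):
        self.storage = []

    def store(self, transition):
        self.storage.append(transition)

    def get(self):
        # Return everything collected so far and start over with an empty buffer.
        batch, self.storage = self.storage, []
        return batch


class ReplayBuffer:
    """Hypothetical off-policy storage: past transitions remain available."""

    def __init__(self, capacity):
        # A bounded deque evicts the oldest transition once capacity is reached.
        self.storage = deque(maxlen=capacity)

    def store(self, transition):
        self.storage.append(transition)

    def sample(self, batch_size, rng):
        # Uniform sampling without replacement; nothing is removed from storage.
        return rng.sample(list(self.storage), batch_size)


rng = random.Random(0)
fresh = OnPolicyBuffer()
replay = ReplayBuffer(capacity=100)

for t in range(10):
    # Toy transition: (obs, action, reward, next_obs, done). Values are made up.
    transition = (t, 0, 1.0, t + 1, False)
    fresh.store(transition)
    replay.store(transition)

first_batch = fresh.get()    # all 10 transitions
second_batch = fresh.get()   # empty: on-policy data is not reused
minibatch_a = replay.sample(4, rng)
minibatch_b = replay.sample(4, rng)  # drawn from the same 10 stored transitions
```

In this sketch the second call to `fresh.get()` returns an empty list, while the replay buffer can be sampled any number of times. That reuse is the property the off-policy family relies on.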
To support the launch, OpenAI offered a period of high-bandwidth support in November 2018, during which the author and others answered questions from learners.[1] OpenAI also hosted a Spinning Up Workshop at its San Francisco office on February 2, 2019. Roughly 90 people attended in person, with close to 300 more following via livestream; attendees came from backgrounds spanning academia, software engineering, data science, machine learning engineering, medicine, and education. The morning featured talks by Joshua Achiam on the conceptual foundations and taxonomy of RL, Matthias Plappert on OpenAI's dexterous-robot-hand manipulation work, and Dario Amodei, then leader of OpenAI's safety team, on problems in AI safety. Afternoon breakout sessions, staffed by volunteer instructors, helped participants with implementation and research ideas.[9]
Spinning Up has been widely cited as a reference point for newcomers to the field and has inspired community ports and clones. The GitHub repository is in maintenance status, meaning it receives bug fixes and minor updates rather than new features.[3]
Spinning Up is distinct in purpose from OpenAI's two other major RL software releases, though they are often used together.[1][5]
| Project | Purpose | Audience |
|---|---|---|
| OpenAI Gym | A standardized toolkit of environments for developing and benchmarking RL algorithms | All RL practitioners |
| OpenAI Baselines | A set of high-quality reference implementations of RL algorithms, optimized for performance | Researchers reproducing or building on state-of-the-art results |
| Spinning Up | Educational, readable implementations and written material for learning deep RL | Learners and aspiring researchers |
Where Baselines emphasizes performance and feature completeness, Spinning Up deliberately favors short, transparent code meant to be read and understood, and it relies on Gym environments (and compatible robotics environments) for training and evaluation. In its researcher essay, OpenAI explicitly recommends reading the Baselines code to learn engineering best practices once a learner has grasped the fundamentals.[5][6]
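To make the dependence on Gym environments concrete, here is a small sketch of the classic Gym-style interaction loop, in which `reset()` returns an observation and `step(action)` returns an observation, a reward, a done flag, and an info dictionary. The `CorridorEnv` class and the `run_episode` helper are hypothetical stand-ins written for this illustration; they do not come from Gym or Spinning Up, and newer Gym-compatible libraries may use a different return signature.

```python
class CorridorEnv:
    """Hypothetical toy environment following the classic Gym reset/step convention."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position

    def step(self, action):
        # Action 1 moves right; any other action moves left, clamped at position 0.
        move = 1 if action == 1 else -1
        self.position = max(0, self.position + move)
        done = self.position >= self.length
        reward = 1.0 if done else 0.0  # reward only on reaching the end
        return self.position, reward, done, {}


def run_episode(env, policy, max_steps=100):
    """Roll out one episode and return (total_reward, steps_taken)."""
    obs = env.reset()
    total_reward = 0.0
    steps = 0
    while steps < max_steps:
        obs, reward, done, _info = env.step(policy(obs))
        total_reward += reward
        steps += 1
        if done:
            break
    return total_reward, steps


def always_right(obs):
    # Trivial fixed policy used only to exercise the loop.
    return 1


result = run_episode(CorridorEnv(length=5), always_right)
```

Any algorithm written against this loop can be pointed at a different environment that exposes the same two methods, which is why a shared environment interface is useful for both training and evaluation.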