MuJoCo (short for Multi-Joint dynamics with Contact) is an open-source physics simulator designed for fast and accurate simulation of articulated mechanical systems with rich contact interactions. It is widely regarded as one of the most influential physics engines in modern robotics and reinforcement learning research, powering everything from classic continuous-control benchmarks to the simulation-based training pipelines used to teach humanoid robots like the Boston Dynamics Atlas and Unitree G1 to walk, run, and recover from falls.
First developed by Emanuel "Emo" Todorov at the University of Washington and described in a landmark 2012 paper, MuJoCo was commercialized by Roboti LLC and grew into the de facto standard simulator for academic robotics and RL research throughout the 2010s. In October 2021, DeepMind acquired the engine and made the binaries free of charge, and in May 2022 it released the full source code under the permissive Apache 2.0 license. Since then, MuJoCo has continued to evolve under joint stewardship from DeepMind and the broader open-source community, including a JAX-based GPU implementation called MJX, an NVIDIA-collaborative GPU rewrite called MuJoCo Warp, and the MuJoCo Playground framework for sim-to-real robot learning.
MuJoCo is a general-purpose physics engine built specifically for the needs of model-based optimization, control, and machine learning. Unlike game-oriented physics engines that prioritize visual plausibility, MuJoCo prioritizes numerical accuracy, smoothness of dynamics, and computational throughput. Its core innovation is a soft, convex contact model that yields well-defined dynamics and inverse dynamics, making it possible to differentiate through the simulator, run long-horizon optimal control problems, and train deep RL policies that transfer reasonably well to real robots.
The engine is written in modern C and C++ for portability and speed. It exposes a low-level C API along with first-party Python bindings, and integrates with a large ecosystem of higher-level libraries including the DeepMind Control Suite, OpenAI Gym (now Gymnasium), Stable Baselines, RLlib, the Isaac ecosystem (via URDF interop), and many others. A single CPU core can step humanoid-scale models tens of thousands of times per second, and the GPU-accelerated MJX and Warp variants push aggregate throughput into the millions of steps per second when running thousands of parallel environments.
Key characteristics that distinguish MuJoCo from competing engines include:

- Generalized-coordinate dynamics computed with recursive Newton-Euler and composite-rigid-body algorithms.
- A soft, convex contact model with well-defined forward and inverse dynamics.
- A single unified actuator abstraction covering motors, servos, muscles, and more.
- High CPU throughput, with GPU-accelerated variants (MJX, MuJoCo Warp) for massively parallel simulation.
- A permissive Apache 2.0 license and first-party Python bindings.
MuJoCo grew out of work in the Movement Control Laboratory at the University of Washington, led by professor Emanuel Todorov. Todorov, a computational neuroscientist with a long-standing interest in biological motor control, needed a simulator that could be used inside the inner loop of optimal control algorithms. Existing tools at the time, such as the Open Dynamics Engine (ODE), were too slow, too noisy, or too brittle around contact events to support the kind of model-predictive control and trajectory optimization research his group was pursuing.
Development of an early prototype began around 2008 and continued for several years. The engine's central idea, a contact formulation expressed as a convex optimization problem rather than the more common linear or nonlinear complementarity problem, was both an academic novelty and a practical breakthrough. The convex formulation made it possible to solve contact dynamics with fast Newton-style methods, gave the engine well-defined inverse dynamics, and allowed derivatives of the dynamics to be computed by finite differences in a numerically clean way.
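In primal form (a simplified sketch following the formulation in the MuJoCo documentation; regularization details omitted), the engine solves

$$
\min_{\dot v}\; \tfrac{1}{2}\,(\dot v - \dot v_0)^\top M\, (\dot v - \dot v_0) \;+\; s\!\left(J\dot v - a^{\text{ref}}\right)
$$

where $M$ is the joint-space inertia matrix, $\dot v_0 = M^{-1}(\tau - c)$ is the acceleration the system would have with no constraints, $J$ is the constraint Jacobian, $a^{\text{ref}}$ encodes the soft, spring-damper-like reference behavior of each constraint, and $s(\cdot)$ is a convex penalty that vanishes when constraints are satisfied. Because $M$ is positive definite and $s$ is convex, the minimizer is unique, which is exactly what gives the engine well-defined forward and inverse dynamics.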
The simulator was first publicly described in the 2012 IROS paper *MuJoCo: A physics engine for model-based control* by Emanuel Todorov, Tom Erez, and Yuval Tassa. That paper has since been cited thousands of times and is widely considered foundational to the modern sim-to-real reinforcement learning literature.
In 2015, Todorov founded Roboti LLC to commercialize MuJoCo as a closed-source product. Licenses were sold to academic groups at modest prices and to industrial users at higher rates, with a free trial available for evaluation. During this period, MuJoCo became the simulator of choice for several influential research efforts:

- The OpenAI Gym continuous-control environments (2016), which became standard RL benchmarks.
- OpenAI's Dactyl project, in which a Shadow Hand learned in-hand manipulation with domain randomization.
- DeepMind's continuous-control research, later consolidated into the DeepMind Control Suite.
- Trajectory-optimization and model-predictive-control work in Todorov's own group.
Despite its scientific success, MuJoCo's closed-source license model and the practical hassle of dealing with mujoco-py (an unofficial Python wrapper that frequently broke) became sources of frustration in the community. Calls for an open-source release grew louder year over year.
In October 2021, DeepMind announced that it had acquired MuJoCo from Roboti LLC and would make the binaries available for free immediately, with full source code to follow. The acquisition was widely celebrated, especially in academic circles where licensing friction had long been a barrier for new researchers and students.
In May 2022, DeepMind released the complete MuJoCo source code on GitHub under the Apache 2.0 license. Emo Todorov continued to be involved as a consultant and the original codebase architect, while DeepMind took over as primary maintainer with a small dedicated team. The first-party Python bindings, built with pybind11, replaced the older mujoco-py package and became the recommended way to use MuJoCo from Python.
Since open-sourcing, MuJoCo has evolved rapidly. Major milestones include:

- MJX (2023), a JAX-based implementation that runs on GPUs and TPUs.
- MuJoCo 3.0 (October 2023), adding SDF collisions, the deformable flex element, muscle actuators, and many usability improvements.
- MuJoCo 3.2 (2024), which made native convex collision detection the default.
- MuJoCo Playground (January 2025), a framework for GPU-accelerated sim-to-real robot learning.
- MuJoCo Warp (2025), an NVIDIA-collaborative GPU rewrite built on the Warp framework.

A MuJoCo simulation is built around two main C structs:

- mjModel holds the static description of the system: bodies, joints, geoms, actuators, sensors, solver options, and every other quantity that stays constant during simulation.
- mjData holds everything that changes over time: positions, velocities, contact lists, intermediate computation buffers, and the simulation clock.
This split between model and data is a deliberate design choice that supports parallel simulation by allowing many mjData instances to share a single mjModel. It also makes the engine easy to use as a callable function in optimization and learning loops.
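A minimal sketch of this pattern with the official Python bindings (the falling-ball model here is purely illustrative):

```python
import mujoco

# Load the static model description once; mjModel never changes during simulation.
model = mujoco.MjModel.from_xml_string("""
<mujoco>
  <worldbody>
    <body name="ball" pos="0 0 1">
      <freejoint/>
      <geom type="sphere" size="0.1"/>
    </body>
  </worldbody>
</mujoco>
""")

# Each mjData holds the time-varying state; many instances share one mjModel.
datas = [mujoco.MjData(model) for _ in range(4)]
for i, data in enumerate(datas):
    data.qvel[0] = 0.1 * i           # give each copy a different x velocity
    for _ in range(1000):
        mujoco.mj_step(model, data)  # step each instance independently

print([d.qpos[0] for d in datas])    # final x positions differ per instance
```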
MuJoCo represents the configuration of articulated systems in generalized coordinates (joint angles, free-body positions and orientations) rather than maximal coordinates (positions of every body in 3D space). This formulation, computed via the recursive Newton-Euler and composite-rigid-body algorithms, has several practical advantages:

- Joint constraints are satisfied exactly by construction, so articulations never drift apart.
- The state has the minimal possible dimension, which benefits optimization and learning.
- Core dynamics quantities are computed with efficient recursive algorithms that scale well with the number of joints.
- The representation matches the conventions of robotics control, where commands are expressed per joint.
MuJoCo's contact model is one of its defining features. Instead of treating contacts as hard, instantaneous, complementarity-based events (as ODE, Bullet, and most game engines do), MuJoCo formulates contact dynamics as a convex optimization problem based on a regularized variant of the Gauss principle of least constraint. The result is a model that is:

- Soft: contacts may deform slightly, and regularization smooths the dynamics around impact events.
- Well-posed: the convex problem has a unique solution, so forward dynamics are deterministic and vary continuously with the inputs.
- Invertible: inverse dynamics are uniquely defined even in the presence of contact, something hard-contact engines cannot offer.
- Efficient: the optimization can be solved with fast Newton-type methods.
The engine offers a choice of three solvers for the contact optimization:

- Newton (the default), which uses exact second-order information and typically converges in a handful of iterations.
- CG (conjugate gradient), a first-order method that scales well to large scenes.
- PGS (projected Gauss-Seidel), the classic sweeping method, retained mainly for compatibility.
MuJoCo's native XML model format is called MJCF (MuJoCo Configuration Format). It is hierarchical, supports defaults and includes, and is significantly more expressive than the more common URDF format used by ROS and many other simulators. MJCF describes:

- The kinematic tree of bodies, joints, and geoms.
- Actuators, tendons, sensors, and equality constraints.
- Contact parameters, solver settings, and simulation options.
- Visual assets such as meshes, textures, and materials.
- Default classes and file includes that keep large models maintainable.
A URDF importer is included for compatibility with the broader ROS ecosystem, and many models in the MuJoCo Menagerie are derived from publicly available URDF descriptions and refined for MuJoCo-specific needs.
MuJoCo supports a single underlying general actuator model that can be configured to behave as a torque motor, position servo, velocity servo, integrated-velocity controller, viscous damper, hydraulic cylinder, biological muscle (with Hill-type force-length-velocity dynamics), or adhesion gripper. This unified model makes it straightforward to mix and match actuator types within a single robot.
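A short illustrative MJCF fragment, loaded through the Python bindings, mixing two configurations of the general actuator in one model (all names here are hypothetical):

```python
import mujoco

# <motor> and <position> are shorthand for the same underlying general
# actuator with different gain/bias settings.
XML = """
<mujoco>
  <worldbody>
    <body pos="0 0 1">
      <joint name="shoulder" type="hinge" axis="0 1 0"/>
      <geom type="capsule" size="0.05" fromto="0 0 0 0.5 0 0"/>
      <body pos="0.5 0 0">
        <joint name="elbow" type="hinge" axis="0 1 0"/>
        <geom type="capsule" size="0.04" fromto="0 0 0 0.4 0 0"/>
      </body>
    </body>
  </worldbody>
  <actuator>
    <motor joint="shoulder" gear="10"/>   <!-- direct torque motor -->
    <position joint="elbow" kp="50"/>     <!-- position servo -->
  </actuator>
</mujoco>
"""

model = mujoco.MjModel.from_xml_string(XML)
data = mujoco.MjData(model)
data.ctrl[:] = [0.2, 1.0]   # motor torque command, servo target angle (rad)
mujoco.mj_step(model, data)
```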
The sensor system covers most of what robotics applications need: joint encoders, IMU components (accelerometers, gyroscopes, magnetometers), force-torque sensors, tactile arrays, range finders, and rendered RGB and depth cameras. Custom sensors can be added through the engine plugin system.
MuJoCo is famously fast. On a single CPU thread, a model of the OpenAI Gym Humanoid (23 degrees of freedom) typically runs at tens of thousands of physics steps per second when solver iteration counts are kept low and contact complexity is modest. Aggregated across the cores of a single multi-core machine, throughput in the millions of steps per second is achievable for typical RL workloads.
GPU-accelerated variants push this much further. MJX runs on NVIDIA and AMD GPUs, Apple Silicon, and Google TPUs, and works best when simulating thousands or tens of thousands of identical scenes in parallel. MuJoCo Warp is optimized specifically for NVIDIA GPUs and reports 70x to 313x speedups over MJX for typical locomotion and manipulation workloads, making single-GPU training of complex humanoid policies practical.
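A minimal sketch of the batched-stepping pattern MJX is designed for (assuming the mujoco and mujoco-mjx packages and a toy model; absolute throughput depends on hardware):

```python
import jax
import mujoco
from mujoco import mjx

XML = """
<mujoco>
  <worldbody>
    <body pos="0 0 1">
      <freejoint/>
      <geom type="sphere" size="0.1"/>
    </body>
  </worldbody>
</mujoco>
"""

model = mujoco.MjModel.from_xml_string(XML)
mjx_model = mjx.put_model(model)        # copy model constants to the device

# Build a batch of 4096 independent states with randomized initial velocities.
def make_state(rng):
    data = mjx.make_data(mjx_model)
    return data.replace(qvel=jax.random.normal(rng, (mjx_model.nv,)))

rngs = jax.random.split(jax.random.PRNGKey(0), 4096)
batch = jax.vmap(make_state)(rngs)

# One jitted, vmapped physics step advances all 4096 environments at once.
step = jax.jit(jax.vmap(mjx.step, in_axes=(None, 0)))
batch = step(mjx_model, batch)
print(batch.qpos.shape)                 # (4096, 7): batched free-body poses
```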
The official mujoco Python package (available on PyPI) provides direct, near-zero-overhead bindings to the C API. It includes:
- A named-access API (e.g. model.body('torso').id, data.joint('hip').qpos) that eliminates the need to manually look up indices.
- An interactive viewer (mujoco.viewer) that lets users load a model and play with it from a script or notebook.

A companion mujoco-mjx package provides the JAX-based GPU implementation, with APIs that mirror the CPU bindings as closely as possible while being fully jit-, vmap-, and grad-compatible.
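A short session with the CPU bindings illustrating the named-access API (the model and names are illustrative):

```python
import mujoco

XML = """
<mujoco>
  <worldbody>
    <body name="torso" pos="0 0 1">
      <freejoint/>
      <geom type="box" size="0.2 0.1 0.1"/>
      <body name="leg" pos="0 0 -0.2">
        <joint name="hip" type="hinge" axis="0 1 0"/>
        <geom type="capsule" size="0.04" fromto="0 0 0 0 0 -0.4"/>
      </body>
    </body>
  </worldbody>
</mujoco>
"""

model = mujoco.MjModel.from_xml_string(XML)
data = mujoco.MjData(model)

# Named access: no manual index bookkeeping.
print(model.body('torso').id)    # integer body id
print(data.joint('hip').qpos)    # that joint's slice of the qpos vector

# Interactive viewer (opens a window; requires a display):
# import mujoco.viewer
# mujoco.viewer.launch(model, data)
```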
Around the core, a rich ecosystem of tools has emerged, including the DeepMind Control Suite (dm_control), Robosuite, ManiSkill, Gymnasium-Robotics, RoboHive, and many specialized RL training libraries.
The original OpenAI Gym MuJoCo environments, written in 2016, have become an unofficial standard for evaluating continuous-control RL algorithms. They include the environments below; a minimal usage sketch follows the table.
| Environment | Description | Typical use |
|---|---|---|
| Hopper | A planar one-legged robot that must balance and hop forward. | Testing exploration and balance. |
| Walker2d | A planar bipedal walker. | Bipedal locomotion benchmarks. |
| HalfCheetah | A 2D cheetah-shaped runner with 9 links and 8 joints. | High-speed locomotion, reward shaping. |
| Ant | A 3D quadruped resembling an insect. | 3D locomotion, rough terrain. |
| Humanoid | A 23-DoF anthropomorphic figure (17 actuated joints plus a floating base). | High-dimensional control, whole-body coordination. |
| Swimmer | A planar 3-link snake-like swimmer. | Periodic motion, low contact. |
| Reacher | A 2-link arm that reaches to a goal. | Quick benchmarks, debugging. |
| Pusher | A 7-DoF arm that pushes an object to a target. | Manipulation. |
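These environments are now maintained in Gymnasium; a typical training-loop skeleton looks like this (random actions stand in for a policy):

```python
import gymnasium as gym

# MuJoCo-backed continuous-control benchmark;
# requires: pip install "gymnasium[mujoco]"
env = gym.make("HalfCheetah-v4")
obs, info = env.reset(seed=0)

total_reward = 0.0
for _ in range(1000):
    action = env.action_space.sample()   # a trained policy would go here
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    if terminated or truncated:
        obs, info = env.reset()

env.close()
print(f"return from random actions: {total_reward:.1f}")
```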
Nearly every major deep RL paper of the past decade has reported scores on these environments. The original PPO paper, the SAC paper, the TD3 paper, the DDPG paper, the TRPO paper, the GAE paper, and many others all use MuJoCo Gym environments as primary benchmarks.
Released in 2018, the DeepMind Control Suite (dm_control) is a curated collection of continuous control tasks built on MuJoCo with a more standardized structure than Gym. Each task has a consistent observation and action interface, normalized rewards in the [0, 1] range, and clearly documented physical assumptions. Categories include locomotion (cartpole, cheetah, hopper, walker, humanoid, fish, swimmer), manipulation (manipulator, finger, stacker), and motor learning (reacher, ball-in-cup, pendulum).
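Loading a task follows a uniform pattern across the suite; a short sketch with random actions:

```python
import numpy as np
from dm_control import suite

# Every task is addressed by (domain, task); rewards are normalized to [0, 1].
env = suite.load(domain_name="cheetah", task_name="run")
time_step = env.reset()

while not time_step.last():
    # Sample a random action within the documented action spec.
    spec = env.action_spec()
    action = np.random.uniform(spec.minimum, spec.maximum, size=spec.shape)
    time_step = env.step(action)
    # time_step.reward is None at reset, then a float in [0, 1] on each step.
```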
The dm_control package also ships with PyMJCF, a Python DOM for MJCF, and Composer, a higher-level scene composition system used to build more complex environments such as the Locomotion suite and the Manipulation suite with a robot arm and snap-together bricks.
The table below highlights several seminal RL algorithms that were validated on MuJoCo benchmarks.
| Algorithm | Year | Type | MuJoCo benchmarks used |
|---|---|---|---|
| TRPO | 2015 | On-policy policy gradient | HalfCheetah, Walker2d, Hopper |
| DDPG | 2015 | Off-policy actor-critic | Pendulum, HalfCheetah, Reacher |
| PPO | 2017 | On-policy policy gradient | HalfCheetah, Hopper, Walker2d, Humanoid, Ant |
| SAC | 2018 | Off-policy maximum-entropy | Humanoid, Ant, HalfCheetah, Hopper, Walker2d |
| TD3 | 2018 | Off-policy actor-critic | Same as SAC |
| D4PG | 2018 | Distributional off-policy | dm_control suite |
| MPO | 2018 | EM-style policy iteration | dm_control suite |
On the MuJoCo Ant-v4 environment, comparative studies typically report TD3 and SAC as the top performers, with average episode returns on the order of 3,000 to 4,000. PPO is more sensitive to hyperparameters but remains the workhorse for large-scale, parallelized training.
Sim-to-real transfer is the practice of training a control policy entirely in simulation and then deploying it directly on a physical robot. MuJoCo has played a central role in this paradigm because of two properties: it is fast enough to generate the millions or billions of simulated experiences modern RL needs, and its dynamics are smooth and well-defined enough that policies trained against it generalize reasonably well to reality, especially when domain randomization is applied to simulator parameters such as friction, mass, motor gains, and sensor noise.
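A minimal sketch of that domain-randomization recipe with the CPU bindings: rebuild the model with perturbed physical parameters before each training episode (the file name and the randomization ranges are arbitrary illustrations):

```python
import numpy as np
import mujoco

def randomized_episode_model(rng: np.random.Generator) -> mujoco.MjModel:
    # Reload the nominal model, then perturb parameters in place.
    model = mujoco.MjModel.from_xml_path("robot.xml")  # hypothetical model file
    # Sliding friction, body masses, and actuator gains around nominal values.
    model.geom_friction[:, 0] *= rng.uniform(0.7, 1.3, size=model.ngeom)
    model.body_mass[:] *= rng.uniform(0.9, 1.1, size=model.nbody)
    model.actuator_gainprm[:, 0] *= rng.uniform(0.9, 1.1, size=model.nu)
    return model

rng = np.random.default_rng(0)
model = randomized_episode_model(rng)
data = mujoco.MjData(model)
# ... roll out the policy; sensor noise would be injected into observations here.
```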
A 2021 academic study comparing MuJoCo, PyBullet, and ODE on transfer experiments found that policies trained in MuJoCo were better at generalizing to other engines (and presumably to reality) than policies trained in the alternatives, attributing the result to MuJoCo's smoother contact handling.
Boston Dynamics and the Robotics & AI Institute (RAI) have publicly described training pipelines for the Atlas and Spot robots that rely on RL in massively parallel simulation. The published RAI/Boston Dynamics work cites the use of over 150 million simulation runs per maneuver, with policies deployed zero-shot onto hardware. While Boston Dynamics has historically used a mix of internal and external simulators, MuJoCo and its GPU-accelerated descendants are widely used in this space, and MuJoCo Playground explicitly includes Boston Dynamics Spot as one of its supported quadrupeds.
Introduced by DeepMind in January 2025, MuJoCo Playground is an open-source framework for GPU-accelerated robot learning and sim-to-real transfer. Built on top of MJX (and now also MuJoCo Warp), Playground bundles a curated set of robots, training environments, and reward functions, and installs with a single command: pip install playground. Researchers can train a usable locomotion policy in minutes on a single GPU.
Supported robots in Playground include:

- Quadrupeds such as the Unitree Go1 and the Boston Dynamics Spot.
- Humanoids such as the Unitree H1 and G1 and the Berkeley Humanoid.
- Manipulation platforms including the Franka Emika Panda, ALOHA 2, and the LEAP Hand.
Playground demonstrates zero-shot sim-to-real transfer using both state-based and pixel-based observations, including joystick locomotion, fall recovery, and even handstand policies on the Unitree Go1.
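A sketch of the intended workflow, based on Playground's published examples (the environment name is one of the published Go1 tasks; API details may drift across versions):

```python
import jax
from mujoco_playground import registry

# Load a batched MJX environment by name and roll it out.
env = registry.load("Go1JoystickFlatTerrain")
jit_reset = jax.jit(env.reset)
jit_step = jax.jit(env.step)

state = jit_reset(jax.random.PRNGKey(0))
for _ in range(100):
    action = jax.numpy.zeros(env.action_size)  # a trained policy would go here
    state = jit_step(state, action)
print(state.obs)
```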
Humanoid robots like the Tesla Optimus, Figure 02, 1X Neo, Sanctuary Phoenix, and Apptronik Apollo all rely on simulation-trained policies to handle locomotion and manipulation. While each company uses its own internal stack, the broader pattern is identical: build a high-fidelity model of the robot in MJCF (or import from URDF), train an RL policy using PPO or a related algorithm in massively parallel simulation, apply domain randomization, and deploy the policy zero-shot or with brief on-robot fine-tuning. MuJoCo and MJX are major backbones of these pipelines, alongside NVIDIA Isaac Sim and Isaac Lab.
| Simulator | Developer | License | Strengths | Weaknesses | Typical use |
|---|---|---|---|---|---|
| MuJoCo | DeepMind (originally Roboti LLC) | Apache 2.0 | Fast, accurate, smooth contact, analytical inverse dynamics, excellent Python bindings, MJX/Warp GPU variants. | Single-environment GPU performance is weak compared with Isaac, contact stiffness sometimes requires tuning for legged robots. | RL benchmarks, sim-to-real, biomechanics, model-based control. |
| NVIDIA Isaac Sim / Isaac Lab | NVIDIA | Proprietary (free for many uses) | Massive GPU parallelism, photorealistic rendering, ROS 2 integration, USD/OpenUSD scene format. | Requires recent NVIDIA GPUs, heavyweight install, single-environment overhead 10-20x higher than MuJoCo. | Industrial robotics, large-scale RL with sensor-rich observations. |
| PyBullet (Bullet) | Erwin Coumans (originally) | zlib | Free, Python-friendly, decent speed, good for prototyping. | Less accurate than MuJoCo, contact tuning can be fiddly, no first-class GPU version. | Academic prototyping, soft-body experiments, education. |
| Gazebo / Ignition | Open Robotics | Apache 2.0 | Deep ROS integration, plugin ecosystem, built for full system simulation including sensors. | Slow for RL training, complex setup, multiple physics backends with varying quality. | Whole-system robotics integration, ROS-based development. |
| Drake | Toyota Research Institute / MIT | BSD-3 | Rigorous numerics, designed for control and planning research, hydroelastic contact model. | Steeper learning curve, smaller community, slower than MuJoCo for RL workloads. | Model-based control, manipulation research, formal verification. |
| RaiSim | Jemin Hwangbo (ETH Zurich) | Free for academic, paid for commercial | Very fast for legged robots, accurate contact model. | Restrictive license, smaller ecosystem, no first-class GPU version. | Legged robot research, ANYmal-style locomotion. |
| Genesis | Embodied AI collective | Apache 2.0 | Unifies rigid, MPM, SPH, FEM, PBD solvers; claims very high GPU throughput. | Newer and less battle-tested, accuracy claims still being validated by the community. | Multi-physics RL, generative scenes, embodied AI research. |
| MuJoCo MJX | DeepMind | Apache 2.0 | Same MuJoCo dynamics, runs on JAX (GPU/TPU), excellent for batched RL. | 10x slower than CPU MuJoCo for single-environment runs, JIT compile times of 1-3 minutes. | Massively parallel RL, sim-to-real with PPO at scale. |
| MuJoCo Warp | DeepMind + NVIDIA | Apache 2.0 | 70x to 313x faster than MJX on RTX-class GPUs (workload-dependent), same MuJoCo semantics. | Beta as of 2025, still feature-incomplete relative to CPU MuJoCo. | Cutting-edge sim-to-real, fastest current MuJoCo-compatible GPU option. |
The following table summarizes major MuJoCo releases and ecosystem milestones.
| Year | Release / event | Notes |
|---|---|---|
| 2008-2011 | Internal prototypes | Developed in Todorov's Movement Control Lab, University of Washington. |
| 2012 | Public IROS paper | Todorov, Erez, and Tassa publish *MuJoCo: A physics engine for model-based control*. |
| 2015 | Roboti LLC founded | MuJoCo becomes a commercial product with academic and industrial licenses. |
| 2016 | OpenAI Gym MuJoCo envs | HalfCheetah, Hopper, Walker2d, Ant, Humanoid become RL standards. |
| 2018 | MuJoCo 2.0 | Major release; widely adopted. DeepMind Control Suite released. |
| 2021 | Activation key removed (2.1.0) | License-free use without an activation key for the first time. |
| Oct 2021 | DeepMind acquisition | Binaries become free, source release announced. |
| May 2022 | Apache 2.0 source release | Full source published on GitHub; first-party Python bindings replace mujoco-py. |
| 2022 | MuJoCo Menagerie | Curated collection of high-quality robot models published. |
| 2023 | MJX | JAX-based GPU/TPU implementation released. |
| Oct 2023 | MuJoCo 3.0 | SDF collisions, deformable flex elements, native muscle actuators. |
| 2024 | MuJoCo 3.2.x | Native convex collision detection becomes default. |
| Jan 2025 | MuJoCo Playground | DeepMind framework for sim-to-real RL on MJX. |
| 2025 | MuJoCo Warp | Beta release of NVIDIA-collaborative GPU rewrite using Warp. |
| Jun 2025 | RSS 2025 demo | MuJoCo Playground wins Outstanding Demo Paper Award. |
| 2025-2026 | Ongoing | Continued integration with NVIDIA's Newton physics engine and broader robotics stacks. |
The MuJoCo Menagerie is an official DeepMind-curated collection of high-quality MJCF models, intended to provide reliable starting points for research. As of 2026, the collection includes dozens of robots across multiple categories:
| Category | Examples |
|---|---|
| Quadrupeds | Unitree A1, Go1, Go2; Boston Dynamics Spot; ANYmal B and C; Google Barkour v0 and vB. |
| Humanoids | Unitree H1 and G1; Berkeley Humanoid; Booster T1; Apptronik Apollo; Robotis OP3. |
| Robotic arms | Franka Emika Panda; Universal Robots UR5e and UR10e; Kinova Gen3; Kuka iiwa14. |
| Dexterous hands | Shadow Hand; LEAP Hand; Allegro Hand. |
| Bimanual systems | ALOHA 2; YuMi. |
| Mobile manipulators | Stretch RE1; Tiago; Hello Robot Stretch 3. |
| Others | RealSense cameras as standalone models, gripper attachments, terrain assets. |
A growing subset of these models has been validated as MJX-compatible, meaning they can be loaded and simulated in vectorized batches on GPU without modification.
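Menagerie models are distributed as MJCF files in the GitHub repository, so using one is a matter of cloning and pointing the loader at a scene file (the path below assumes a local clone of google-deepmind/mujoco_menagerie):

```python
import mujoco

# After: git clone https://github.com/google-deepmind/mujoco_menagerie
model = mujoco.MjModel.from_xml_path(
    "mujoco_menagerie/unitree_go1/scene.xml")  # quadruped plus ground plane
data = mujoco.MjData(model)
mujoco.mj_step(model, data)
print(model.nq, model.nu)  # configuration and actuator dimensions
```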
| Benchmark / project | Year | Description |
|---|---|---|
| OpenAI Gym MuJoCo | 2016 | Standard continuous-control RL benchmarks (HalfCheetah, Hopper, Walker2d, Ant, Humanoid, etc.). |
| OpenAI Dactyl | 2018-2019 | Shadow Hand learns to manipulate a Rubik's cube; trained primarily in MuJoCo with domain randomization. |
| DeepMind Control Suite | 2018 | Curated standardized RL benchmarks across locomotion and manipulation. |
| MetaWorld | 2019 | 50-task multi-task and meta-learning manipulation benchmark. |
| Robosuite | 2020 | Standardized robot learning environment built on MuJoCo for manipulation research. |
| D4RL | 2020 | Offline RL benchmarks including MuJoCo locomotion datasets. |
| SimBenchmark (legged) | 2020 | ETH Zurich comparison of MuJoCo, RaiSim, Bullet, ODE, DartSim on legged tasks. |
| dm_control Locomotion | 2020+ | Humanoid and dog-like locomotion suites for RL research. |
| Gymnasium-Robotics | 2023+ | Maintained successor to legacy Gym Robotics environments. |
| MuJoCo Playground | 2025 | DeepMind's GPU-accelerated sim-to-real framework with quadrupeds, humanoids, and arms. |
MuJoCo's support for deformable and multi-physics simulation is limited to what the flex element provides; multi-physics workflows generally require other tools or hybrid pipelines.

Since DeepMind's open-sourcing in 2022, MuJoCo development has happened in the open on GitHub, with a small core team of DeepMind engineers acting as primary maintainers and a much larger community of contributors. Issues and discussions in the google-deepmind/mujoco repository are actively triaged, and major architectural decisions are typically explained in long-form discussion threads. Emo Todorov has continued to advise on physics and modeling decisions, while DeepMind contributes the engineering effort needed to maintain Python bindings, MJX, the Menagerie, and Playground.
In 2025, DeepMind and NVIDIA jointly announced Newton, an open-source physics engine that builds on Warp and MuJoCo Warp, with collaboration from Disney Research. Newton is positioned as the long-term unified physics backbone for both MuJoCo Warp and NVIDIA's Isaac stack, suggesting that the two ecosystems are converging at the engine level even as their higher-level tooling remains distinct.
MuJoCo is cited in over 3,500 academic papers and is the simulation backbone of countless RL libraries, including Stable Baselines, RLlib, ACME, JaxRL, CleanRL, and Tianshou. It is taught in robotics and RL courses at most major universities and is the default simulator behind dozens of competitions and benchmarks in the robot learning community.
In industry, MuJoCo's footprint has grown substantially since the open-source release. Humanoid startups, autonomous-driving companies, and robotics teams at large technology firms now routinely use MuJoCo and MJX as part of their internal training stacks, often alongside (rather than instead of) Isaac Sim. The combination of free licensing, strong Python tooling, and a track record of successful sim-to-real transfer makes MuJoCo a natural choice for both early-stage prototyping and large-scale production training pipelines.