Sim-to-real transfer (also written as sim2real) is the process of taking a policy, controller, or perception model trained inside a physics simulator and deploying it on a physical robot in the real world. Because collecting training data on real hardware is slow, expensive, and risky, researchers first train agents in simulation, where millions of trials can run in parallel and failures cost nothing. The core difficulty is the reality gap: differences between the simulated and physical environments that cause a policy that works perfectly in simulation to fail, sometimes catastrophically, on actual hardware.
Sim-to-real transfer sits at the intersection of reinforcement learning, robot learning, computer vision, and control theory. It has become one of the most active research areas in robotics and embodied AI, with applications ranging from quadruped locomotion and dexterous manipulation to autonomous driving and drone navigation.
Training robots directly in the real world presents several practical problems. Physical hardware wears out, collisions can damage expensive equipment, and each trial takes real time. A single reinforcement learning run might require millions of episodes, which could take years of continuous wall-clock time on a real robot. Simulation removes all of these constraints. Modern GPU-accelerated simulators can run thousands of parallel environments at speeds far exceeding real time.
The idea of training in simulation and transferring to reality dates back to early work in neural networks and control systems in the 1990s. However, the field accelerated rapidly after 2015 due to three converging trends: advances in deep reinforcement learning (deep RL), the availability of fast GPU-based physics engines, and the growing interest in deep learning-based perception for robotics.
The reality gap refers to the mismatch between what a simulator models and what actually happens in the physical world. This mismatch arises from several sources:
| Source of gap | Description | Example |
|---|---|---|
| Physics modeling errors | Simulators approximate physical phenomena with simplified equations. Contact dynamics, friction, and deformation are especially hard to model accurately. | A simulated gripper slides smoothly along a surface, but the real gripper sticks due to unmodeled surface roughness. |
| Actuator dynamics | Real motors have delays, backlash, nonlinear torque curves, and temperature-dependent behavior that simulators often ignore or approximate. | A simulated joint responds instantly to a torque command, but the real servo has a 10 ms delay and a deadband region. |
| Sensor noise and bias | Real cameras, IMUs, and force sensors produce noisy, sometimes biased readings. Simulated sensors are often idealized. | A depth camera produces accurate point clouds in simulation but returns noisy data with missing pixels on real hardware. |
| Visual appearance | Rendered images differ from real photographs in lighting, texture, reflections, and color. | A policy trained on procedurally generated textures in simulation fails when it encounters glossy or transparent real-world objects. |
| Unmodeled phenomena | Many real-world effects, such as cable tangling, air currents, or compliant contacts, are absent from simulation entirely. | A drone policy ignores ground effect because the simulator does not model it. |
Studies have documented performance drops of 24 to 30 percent when policies are transferred without any mitigation, with some tasks failing entirely. The research community has developed multiple strategies to close or work around this gap.
Domain randomization is the most widely used technique for sim-to-real transfer. Rather than trying to make the simulator perfectly match reality, domain randomization varies the simulation parameters during training so that the policy learns to handle a wide range of conditions. The hope is that the real world falls within the distribution of randomized environments.
Josh Tobin and colleagues at OpenAI introduced the term in 2017, applying it to object detection for robotic grasping. The original paper randomized visual properties: object positions, textures, lighting, and camera angles. When trained across enough visual variation, a neural network learned features robust enough to work on real camera images without fine-tuning.
Domain randomization falls into two main categories:
Visual randomization varies the appearance of the simulated scene. Parameters include object colors and textures, lighting direction and intensity, camera position and field of view, background imagery, and image noise patterns. This approach is most relevant for vision-based policies that take RGB or depth images as input.
Dynamics randomization varies the physical properties of the simulation. Parameters include masses and inertias of objects and robot links, friction coefficients and restitution (bounciness), actuator gains and delays, joint damping and stiffness, and observation noise levels. Xue Bin Peng and colleagues demonstrated dynamics randomization for locomotion transfer in 2018, showing that a policy trained with randomized physics could transfer zero-shot to a real robot arm for an object-pushing task.
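In practice, per-episode dynamics randomization can be as simple as resampling a dictionary of physics parameters at every environment reset. The following sketch illustrates the idea; the parameter names, ranges, and the `sim.set_params` call are illustrative assumptions, not any particular simulator's API:

```python
import random

# Illustrative parameter ranges for dynamics randomization; the names and
# bounds are made up for this sketch, not taken from a specific simulator.
DYNAMICS_RANGES = {
    "link_mass_scale": (0.8, 1.2),      # multiplier on nominal link masses
    "friction_coeff": (0.5, 1.1),       # ground-contact friction
    "restitution": (0.0, 0.2),          # bounciness of contacts
    "actuator_gain_scale": (0.9, 1.1),  # multiplier on motor gains
    "action_delay_steps": (0, 3),       # integer control latency in sim steps
}

def sample_dynamics(rng: random.Random) -> dict:
    """Draw one set of physics parameters for a new training episode."""
    params = {}
    for name, (lo, hi) in DYNAMICS_RANGES.items():
        if isinstance(lo, int) and isinstance(hi, int):
            params[name] = rng.randint(lo, hi)   # discrete parameters
        else:
            params[name] = rng.uniform(lo, hi)   # continuous parameters
    return params

# At the start of every episode the simulator would be reconfigured, e.g.:
#   sim.set_params(sample_dynamics(rng)); obs = sim.reset(); ...
```

Because the policy never sees the same dynamics twice, it cannot overfit to one (inevitably wrong) set of physical constants.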
Setting randomization ranges by hand is tedious and error-prone. If ranges are too narrow, the real world falls outside the training distribution. If they are too wide, the learning problem becomes needlessly difficult. Automatic Domain Randomization, developed by OpenAI in 2019, addresses this by progressively expanding randomization ranges during training. When the agent achieves a performance threshold at the current difficulty level, ADR widens the parameter distributions. This creates a curriculum of increasing difficulty and was a central innovation behind OpenAI's Rubik's Cube result.
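The core ADR loop can be sketched as follows. This is a deliberate simplification of OpenAI's method, which evaluates each range boundary with dedicated rollouts; the class name, step size, and threshold here are illustrative:

```python
class AutoDomainRandomizer:
    """Minimal sketch of Automatic Domain Randomization (ADR).

    Each parameter starts at a point range around its nominal value; whenever
    recent performance meets a threshold, both boundaries of that parameter's
    range are pushed outward by a fixed step, creating a widening curriculum.
    """

    def __init__(self, nominal, step=0.05, threshold=0.8, limits=(0.0, 10.0)):
        self.low = dict(nominal)    # lower bound per parameter
        self.high = dict(nominal)   # upper bound per parameter
        self.step = step            # how much to widen per successful update
        self.threshold = threshold  # success rate required to widen
        self.limits = limits        # hard global bounds on any parameter

    def update(self, name, success_rate):
        """Widen the range for `name` if the agent performs well enough."""
        if success_rate >= self.threshold:
            lo_min, hi_max = self.limits
            self.low[name] = max(lo_min, self.low[name] - self.step)
            self.high[name] = min(hi_max, self.high[name] + self.step)

    def range_of(self, name):
        return self.low[name], self.high[name]
```

Training samples each episode's parameters uniformly from the current ranges, so difficulty grows only as fast as the agent can keep up.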
Bayesian Domain Randomization (BayRn) uses Bayesian optimization to search the space of randomization distribution parameters. Instead of uniform sampling, BayRn adapts the source domain distribution by collecting data from the real target domain and finding parameter settings that maximize real-world performance.
Domain adaptation takes a different approach: instead of randomizing the source domain, it transforms simulated data to look more like real data (or vice versa). The goal is to learn representations that are invariant to the domain shift.
GAN-based adaptation uses generative adversarial networks to translate simulated images into realistic-looking images. CycleGAN, which performs unpaired image-to-image translation, has been widely applied. The cycle-consistency loss ensures that an image translated from simulation to reality and back again matches the original, preserving the underlying structure. More recent variants include RL-CycleGAN, which jointly trains the image translator with a reinforcement learning agent, and RetinaGAN, which adds an object detection consistency loss to preserve semantic content during translation.
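The cycle-consistency loss that anchors this translation can be written as follows, with $G$ mapping simulated images toward real appearance and $F$ mapping back:

```latex
% Cycle-consistency loss (CycleGAN), G: sim -> real, F: real -> sim
\mathcal{L}_{\text{cyc}}(G, F) =
  \mathbb{E}_{x \sim p_{\text{sim}}}\!\left[\lVert F(G(x)) - x \rVert_1\right]
+ \mathbb{E}_{y \sim p_{\text{real}}}\!\left[\lVert G(F(y)) - y \rVert_1\right]
```

Minimizing this loss alongside the usual adversarial losses forces the translators to change appearance without destroying scene content, which is what makes the translated images usable as policy training data.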
Feature-level adaptation learns to map simulated and real observations into a shared latent space where they are indistinguishable. This can be done with adversarial training (a discriminator tries to tell simulated from real features) or with explicit feature matching losses. Language-based pretraining has proven effective here: using natural language to guide image encoders toward learning domain-invariant visual features while ignoring domain-specific details such as texture or lighting. Zero-shot performance improvements of 25 to 40 percent have been reported for object manipulation tasks using this approach.
System identification (SysID) takes the opposite approach from domain randomization. Instead of training the policy to be robust to uncertainty, SysID tries to make the simulator as accurate as possible by measuring the real system's physical parameters.
Traditional SysID involves carefully measuring masses, moments of inertia, friction coefficients, actuator transfer functions, and sensor characteristics, then configuring the simulator to match. When done well, this can produce highly accurate simulations. The downside is that it requires significant manual effort and specialized equipment, and even careful measurements cannot capture all real-world effects.
Modern approaches automate this process. Iterative Residual Tuning (IRT) is a deep learning-based SysID method that adjusts simulator parameters to better match real-world observations using minimal data. A 2024 approach uses in-context learning to dynamically adjust simulation environment parameters online, leveraging past interaction histories as context to adapt simulation dynamics to real-world dynamics without requiring gradient updates.
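A toy illustration of trajectory-matching SysID: fit a single friction coefficient by searching for the value whose simulated rollout best reproduces a recorded velocity trace. The sliding-block model and grid search below are deliberate simplifications; real SysID pipelines fit many coupled parameters with more sophisticated optimizers:

```python
import numpy as np

def simulate_velocity(v0, mu, g=9.81, dt=0.01, steps=50):
    """Forward-simulate a block sliding to rest under Coulomb friction."""
    v = np.empty(steps)
    cur = v0
    for i in range(steps):
        cur = max(0.0, cur - mu * g * dt)  # friction decelerates the block
        v[i] = cur
    return v

def identify_friction(real_velocities, v0, candidates):
    """Pick the friction coefficient whose simulated trajectory best
    matches the recorded real one (grid search over candidates)."""
    errors = [np.mean((simulate_velocity(v0, mu) - real_velocities) ** 2)
              for mu in candidates]
    return candidates[int(np.argmin(errors))]

# Here, synthetic data with mu = 0.42 stands in for measurements that
# would come from the physical robot.
real = simulate_velocity(v0=1.0, mu=0.42)
mu_hat = identify_friction(real, v0=1.0, candidates=np.linspace(0.1, 0.9, 81))
```

Once `mu_hat` is recovered, the simulator is reconfigured with it, shrinking the gap for every policy trained afterwards.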
A common sim-to-real pipeline involves training two policies. The teacher policy trains in simulation with access to privileged information: perfect state knowledge, ground-truth object poses, and exact physical parameters that would be unavailable on a real robot. With this privileged information, the teacher can learn a high-quality policy relatively quickly.
The student policy then learns to imitate the teacher's behavior using only observations available from real-world sensors (cameras, joint encoders, IMUs). The distillation process uses behavior cloning, minimizing the difference between teacher and student actions. This approach was used in ETH Zurich's work on ANYmal quadruped locomotion and has become standard practice in legged robotics.
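A minimal sketch of the distillation step, using linear policies so the mechanics are visible: the teacher acts on the full (privileged) state, while the student regresses onto the teacher's actions from the observable dimensions only. All names, dimensions, and learning rates here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a teacher trained with privileged state: a fixed linear
# map from the full 6-D state (including quantities real sensors cannot see).
W_teacher = rng.normal(size=(2, 6))

# The student only observes the first 4 state dimensions ("sensor" data).
W_student = np.zeros((2, 4))

def distill_step(full_states, lr=0.05):
    """One behavior-cloning step: fit student actions to teacher actions,
    using only the observable part of the state. Returns the BC loss
    measured before the update."""
    global W_student
    obs = full_states[:, :4]             # what real sensors would provide
    targets = full_states @ W_teacher.T  # teacher action labels
    preds = obs @ W_student.T
    grad = 2 * (preds - targets).T @ obs / len(obs)
    W_student -= lr * grad
    return np.mean((preds - targets) ** 2)

states = rng.normal(size=(256, 6))
losses = [distill_step(states) for _ in range(200)]
```

The student's loss cannot reach zero because part of the teacher's input is unobservable; the residual is exactly the price of giving up privileged information.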
The TWIST framework (Teacher-Student World Model Distillation) extends this to model-based RL by distilling not just the policy but an entire world model from privileged state observations to image observations.
Curriculum learning structures the training process from simple to complex scenarios. For sim-to-real transfer, this means the agent first learns basic skills in easy environments and gradually encounters harder, more realistic conditions. This approach helps avoid the problem of learning degenerate strategies in overly randomized environments.
Reward shaping provides additional reward signals that guide the agent toward behaviors that transfer well. For locomotion, this might include penalties for jerky motions or high joint velocities, which tend to exploit simulation artifacts. Combining domain randomization with curriculum learning and careful reward shaping has produced some of the most reliable sim-to-real results.
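A reward-shaping term of this kind might look like the following sketch, where the weights are illustrative and would be tuned per robot:

```python
import numpy as np

def shaped_reward(task_reward, joint_vel, prev_action, action,
                  w_vel=0.01, w_smooth=0.1):
    """Task reward plus transfer-oriented penalties (illustrative weights):
    high joint velocities and abrupt action changes tend to exploit
    simulator artifacts and to stress real actuators, so both are
    quadratically penalized."""
    vel_penalty = w_vel * float(np.sum(np.square(joint_vel)))
    smooth_penalty = w_smooth * float(np.sum(np.square(action - prev_action)))
    return task_reward - vel_penalty - smooth_penalty
```

Because the penalties are differentiable and dense, they steer the policy toward smooth, hardware-friendly motions at every step rather than only at task completion.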
A growing trend is the real-to-sim-to-real pipeline, where the simulator is constructed from real-world data rather than hand-authored. This can involve 3D scanning of the environment, calibrating physics parameters from recorded robot trajectories, and continuously updating the simulation to match reality.
MIT's RialTo system creates digital twins on the fly using computer vision, allowing robots to train in environments that closely match their actual deployment setting. The Real-is-Sim framework from 2025 uses an Embodied Gaussian simulator that synchronizes with the real world at 60 Hz, allowing policies to seamlessly switch between running on real hardware and running in simulation. This dynamic digital twin approach has shown promise for evaluation and rapid iteration.
The choice of simulator affects both the quality and efficiency of sim-to-real transfer. The following table compares the major platforms used in the field as of 2025.
| Simulator | Developer | Physics engine | GPU acceleration | Open source | Key strengths |
|---|---|---|---|---|---|
| Isaac Sim / Isaac Lab | NVIDIA | PhysX 5 | Yes | Yes (since 2025) | Photorealistic rendering, thousands of parallel environments, tight integration with NVIDIA hardware |
| MuJoCo | Google DeepMind (originally Todorov et al.) | Custom | Yes (via MJX/JAX) | Yes (Apache 2.0, since 2022) | Fast, accurate contact dynamics, lightweight, widely used in RL research |
| PyBullet | Erwin Coumans | Bullet | No | Yes | Easy to use, large community, good documentation, widely used for benchmarks |
| SAPIEN / ManiSkill | UC San Diego / Hillbot | PhysX 5 + Warp | Yes | Yes | Articulated object manipulation, heterogeneous GPU simulation, tactile sensing |
| Genesis | Genesis-Embodied-AI | Custom (differentiable) | Yes | Yes | Extremely fast (claims 10-80x over Isaac Gym), differentiable physics, generative capabilities |
| Newton | NVIDIA, Google DeepMind, Disney Research | Built on NVIDIA Warp | Yes | Yes (Linux Foundation, 2025) | Contact-rich simulation, open governance, built specifically for robot learning |
MuJoCo (Multi-Joint dynamics with Contact) was originally developed by Emanuel Todorov, Tom Erez, and Yuval Tassa at the University of Washington and described in a 2012 paper. It was commercialized under Roboti LLC in 2015 and became the de facto standard for RL research. Google DeepMind acquired MuJoCo in October 2021 and released it as open source under the Apache 2.0 license in May 2022. MuJoCo's strengths are fast, stable contact simulation and a lightweight C codebase. MJX, a JAX-based reimplementation, enables GPU-accelerated parallel simulation.
NVIDIA's Isaac platform provides a full simulation and training pipeline for robotics. Isaac Sim offers photorealistic rendering through ray tracing and can simulate complex scenes with deformable objects, fluids, and cloth. Isaac Lab is a lightweight, GPU-accelerated application built on Isaac Sim that is optimized for running thousands of parallel robot learning environments. Isaac Sim 5.0 was released as open source in 2025.
Newton is an open-source, GPU-accelerated physics engine co-developed by NVIDIA, Google DeepMind, and Disney Research. Announced at GTC 2025 by Jensen Huang, Newton was contributed to the Linux Foundation in September 2025. Built on NVIDIA Warp and OpenUSD, it is designed for contact-rich robot behaviors such as walking on varied terrain and manipulating delicate objects. Disney uses Newton to power its next-generation entertainment robots, including the Star Wars-inspired BDX droids.
Genesis is an open-source physics simulation platform designed for general-purpose robotics and embodied AI. It integrates multiple physics solvers (rigid body, soft body, cloth, fluid) into a unified framework. Genesis claims extremely high simulation speeds, citing 43 million FPS for a manipulation scene with a Franka arm, which would be 430,000 times faster than real time. Its differentiable physics engine supports gradient-based optimization. Genesis also includes a generative data engine that can produce training data from natural language descriptions.
SAPIEN focuses on articulated object manipulation, providing GPU-parallelized simulation of robots interacting with drawers, faucets, and other jointed objects. The ManiSkill framework, built on SAPIEN, is one of the fastest GPU-parallelized robotics simulators for contact-rich manipulation tasks, supporting RGBD data collection at 30,000+ FPS on a single RTX 4090. ManiSkill3 is unique in supporting heterogeneous GPU simulation, meaning different parallel environments can contain different object geometries and articulation structures.
One of the most widely publicized sim-to-real demonstrations was OpenAI's use of a Shadow Dexterous Hand to solve a Rubik's Cube using a single robotic hand. The entire control policy was trained in simulation using approximately 13,000 simulated years of experience. The breakthrough relied on Automatic Domain Randomization, which progressively expanded the range of physical parameters during training. The system used a PhaseSpace motion capture setup and RGB cameras for state estimation.
The robot solved the Rubik's Cube about 60 percent of the time overall and 20 percent of the time for maximally difficult scrambles. The trained policy was robust to significant perturbations: the robot could still solve the cube while wearing a rubber glove, with several fingers taped together, or while being poked with a stuffed toy. This demonstrated that aggressive domain randomization could produce policies with genuine robustness rather than narrow simulation-specific skills.
Joonho Lee and colleagues at ETH Zurich's Robotic Systems Lab demonstrated sim-to-real transfer for the ANYmal quadruped robot over challenging natural terrain. Their approach used a two-stage teacher-student framework: a teacher policy trained in simulation with privileged terrain information, and a student policy that used only proprioceptive feedback (joint positions, velocities, and IMU data). The student policy was deployed zero-shot on the real robot.
The ANYmal robot demonstrated locomotion over terrain never encountered during training, including mud, snow, rubble, thick vegetation, and flowing water. The work was published in Science Robotics and has become a benchmark for sim-to-real locomotion research. Subsequent work has extended this approach to parkour-style agile locomotion for quadrupeds.
Google Brain's QT-Opt demonstrated a different philosophy: using both simulated and real data for large-scale robotic grasping. The system trained a deep neural network Q-function on over 580,000 real-world grasp attempts collected from seven robots, supplemented with simulated data. The resulting policy achieved a 96 percent grasp success rate on previously unseen objects.
While QT-Opt was not purely sim-to-real (it used substantial real data), it demonstrated how simulation could augment real-world data collection and showed the potential of scalable robot learning pipelines.
The Unitree A1, Go2, and other quadrupeds have become popular platforms for sim-to-real locomotion research. The Unified Locomotion Transformer (ULT), published in 2025, uses a transformer architecture for simultaneous optimization of teacher and student policies, significantly reducing the data needed for sim-to-real transfer. The policy was validated on a Unitree A1 with a Jetson AGX Orin. Other work has demonstrated loco-manipulation (locomotion combined with manipulation) on the Unitree B1 quadruped with a Z1 arm, using Isaac Gym for training and deploying through a hardware abstraction layer.
The emergence of humanoid robots has created strong demand for sim-to-real methods. NVIDIA announced Isaac GR00T N1 in March 2025 as the first open, fully customizable foundation model for humanoid robot reasoning and skills. The GR00T N1.6 update integrates multimodal vision-language-action policies with world models such as NVIDIA Cosmos Reason, enabling end-to-end loco-manipulation and reasoning tasks.
The sim-to-real workflow for GR00T leverages whole-body reinforcement learning in Isaac Lab and synthetic data-driven navigation. NVIDIA reported generating 780,000 synthetic trajectories (equivalent to 6,500 hours of human demonstration data) in just 11 hours. Combining this synthetic data with real data improved the GR00T N1 performance by 40 percent compared to using only real data, demonstrating the value of simulation-generated training data even when real demonstrations are available.
The convergence of large language models, vision-language models, and robotics has introduced new approaches to sim-to-real transfer. Vision-Language-Action (VLA) models like Google's RT-2 and Gemini Robotics (2025) process multimodal data (text, images, video, and audio) and output robot actions directly. These models can leverage foundation model representations that provide consistent semantic features across simulation and reality, potentially reducing the visual domain gap.
Simulation plays a growing role in training these models. Skild AI, for example, reported training on 100,000 different robot embodiments generated in simulation, aiming to build policies that generalize across robot body types. The use of large-scale simulation data to pretrain or augment VLA models represents a new frontier in sim-to-real research.
The autonomous vehicle industry relies heavily on simulation. Companies such as Waymo conduct billions of virtual kilometers of driving before deploying on real roads. The simulation provides a way to encounter rare but dangerous scenarios (pedestrian jaywalking, sensor failures, unusual weather) that would be impractical to collect in real driving data. Domain randomization is used to vary weather conditions, traffic patterns, and sensor characteristics.
Sim-to-real transfer for unmanned aerial vehicles uses platforms such as AirSim (built on Unreal Engine by Microsoft) and custom simulators. Challenges specific to aerial robots include aerodynamic effects (ground effect, turbulence), wind disturbance, and the need for very low-latency control. Domain randomization over wind direction, magnitude, and flight conditions has been shown to help agents learn general policies that transfer to physical drones. RL-based flight controllers trained in simulation have been successfully deployed on fixed-wing aircraft and demonstrated superior performance compared to commercial flight controllers in some tests.
Sim-to-real transfer has been applied to robot-assisted surgery, where physical experiments on real tissue are limited by ethical and practical constraints. Researchers have trained visual reinforcement learning policies for tasks such as suture knot-tying in simulation and transferred them to real surgical robots. The challenge of simulating deformable tissue contact dynamics makes this domain particularly difficult.
Factory environments benefit from sim-to-real because they involve repetitive tasks where a small improvement in automation yields large economic gains. NVIDIA's AutoMate system demonstrated a mean success rate of 84.5 percent for real-world assembly tasks, with policies trained primarily in simulation. The R2D2 project combines simulation with language models to improve robotic manipulation capabilities.
Despite significant progress, several fundamental challenges remain.
Contact dynamics fidelity. Contact simulation remains one of the weakest points of current physics engines. Real-world contact involves complex phenomena (microslip, elastic deformation, surface roughness) that are computationally expensive to simulate accurately. This is especially problematic for contact-rich tasks like assembly, tool use, and manipulation of deformable or fragile objects.
Deformable and soft objects. Manipulating cloth, rope, food, and other deformable materials remains extremely difficult to transfer from simulation to reality. The computational demands of soft-body simulation limit the scale of parallel training, and the parameter space of deformable object properties is vast.
Visual fidelity. While rendering quality has improved substantially, traditional simulators still struggle to reproduce real-world lighting, reflections on glossy surfaces, transparency, and fine textures. Photorealistic rendering (ray tracing) helps but is computationally expensive and slows training.
Physics exploitation. Agents sometimes learn to exploit artifacts of the physics engine, discovering "cheats" that work in simulation but have no real-world equivalent. For example, a locomotion policy might learn to gain energy from numerical integration errors, or a manipulation policy might exploit unrealistic friction models. These failure modes are difficult to detect until deployment.
Scalability of system identification. While automated SysID methods are improving, they still require real-world data collection, which limits scalability. The trade-off between making the simulator more accurate (SysID) and making the policy more robust (domain randomization) remains an active area of research.
Long-horizon tasks. Most successful sim-to-real demonstrations involve relatively short tasks (grasping, stepping, pushing). Longer task horizons compound small errors at each step, making transfer progressively harder. Hierarchical approaches and task decomposition show promise but introduce their own transfer challenges.
Benchmark standardization. The field lacks standardized benchmarks for measuring sim-to-real transfer quality. Performance is typically reported on specific robot-task combinations, making it difficult to compare methods across labs. Recent efforts such as the RoboVerse platform, which provides a unified API across multiple simulators, aim to address this.
Researchers assess sim-to-real transfer along several dimensions:
| Metric | What it measures |
|---|---|
| Zero-shot success rate | Task completion on real hardware without any real-world fine-tuning |
| Generalization error | Performance difference between simulation and reality (often measured by MSE or success rate drop) |
| Robustness | Ability to maintain performance under varied real-world conditions (different objects, lighting, disturbances) |
| Sample efficiency | Amount of simulation data and real data needed to achieve a given performance level |
| Computational cost | Training time, GPU hours, and simulation throughput required |
| Training stability | Consistency of results across random seeds and randomization distributions |
The following timeline summarizes key milestones in sim-to-real research:

| Year | Development |
|---|---|
| 2012 | Todorov, Erez, and Tassa publish the MuJoCo physics engine paper |
| 2016 | Sadeghi and Levine demonstrate collision-free flight by training only in simulation (CAD2RL) |
| 2017 | Tobin et al. introduce domain randomization for object detection transfer (IROS) |
| 2018 | Peng et al. demonstrate dynamics randomization for robotic control transfer (ICRA); Google's QT-Opt scales RL grasping to 580K real attempts |
| 2018 | OpenAI trains a Shadow Hand to rotate a block using sim-to-real transfer |
| 2019 | OpenAI solves the Rubik's Cube with a robot hand using Automatic Domain Randomization |
| 2020 | Lee et al. (ETH Zurich) demonstrate ANYmal locomotion over challenging natural terrain via teacher-student distillation |
| 2021 | Google DeepMind acquires MuJoCo; Hofer et al. publish "Sim2Real in Robotics and Automation" survey |
| 2022 | MuJoCo released as open source; parkour-style locomotion demonstrated on quadrupeds |
| 2024 | TRANSIC introduces human-in-the-loop corrections for sim-to-real furniture assembly |
| 2025 | NVIDIA releases Isaac GR00T N1 and Isaac Sim 5.0 as open source; Newton physics engine announced at GTC; Genesis claims 10-80x speedup over Isaac Gym |
| 2025 | Real-is-Sim introduces dynamic digital twins with 60 Hz real-world synchronization; Unified Locomotion Transformer reduces sim-to-real data requirements for quadrupeds |