NVIDIA Isaac GR00T N1

Updated Jun 27, 2026

NVIDIA Isaac GR00T N1 is an open foundation model for humanoid robots developed by NVIDIA and unveiled by Jensen Huang on March 18, 2025 at the company's annual GTC conference in San Jose, California. NVIDIA describes it as "the world's first open, fully customizable foundation model for generalized humanoid reasoning and skills," and the first of a family of pretrained checkpoints that the company will release to the wider robotics community ^[1]^[2]. The base GR00T N1 model has roughly 2 billion parameters and uses a dual-system architecture inspired by Daniel Kahneman's split between fast and slow thinking, pairing a vision-language model (VLM) backbone (System 2) with a flow-matching diffusion transformer (System 1) that produces continuous-value motor actions ^[3]^[4]^[20].

GR00T N1 was released alongside its training data and benchmark tasks on Hugging Face and GitHub on March 17 to 18, 2025, and the project has since received three major updates: N1.5 in June 2025, N1.6 in September 2025, and N1.7 in April 2026. The successive versions changed the vision-language model backbone twice (from Eagle 2 to Eagle 2.5 to NVIDIA's own Cosmos-Reason variants), doubled the size of the action transformer, replaced absolute joint targets with state-relative action chunks, and expanded training data from a few thousand hours of teleoperation to more than 20,000 hours of human egocentric video ^[5]^[6]^[7]. The model weights are distributed under the NVIDIA Open Model License Agreement, which permits commercial use with attribution, and the surrounding code is licensed under Apache 2.0 ^[8].

Quick facts

Attribute	Detail
Full name	NVIDIA Isaac GR00T N1 (GR00T = Generalist Robot 00 Technology)
Developer	NVIDIA (GEAR research lab)
Announced	March 18, 2025, at GTC by Jensen Huang ^[1]
Type	Open Vision-Language-Action (VLA) foundation model for humanoid robots ^[3]
Base size	~2 billion parameters ^[20]
Architecture	Dual-system: System 2 vision-language reasoning + System 1 flow-matching diffusion transformer ^[3]^[4]
Cross-embodiment	One checkpoint fine-tunable across humanoids and robot arms (Fourier GR-1, Franka Panda, others) ^[3]^[8]
Training data	Real teleoperation, human egocentric video, and Isaac Lab synthetic data ^[3]
License	NVIDIA Open Model License (weights); Apache 2.0 (code) ^[8]
Latest version	GR00T N1.7 (April 2026) ^[7]

What is GR00T N1?

GR00T N1 is NVIDIA's open, downloadable brain for humanoid robots: a single pretrained model that turns camera images and plain-language instructions into continuous motor commands, and that can be fine-tuned to drive many different robot bodies rather than one. NVIDIA frames it as a generalist policy that can pick objects up, put them down, transfer items between hands, follow language instructions, and chain those primitives into longer multi-step tasks across use cases such as material handling, packaging, and inspection ^[1]^[2].

The name GR00T is an acronym for Generalist Robot 00 Technology, and the "N1" denotes the first numbered model release in the line. NVIDIA's stated purpose for releasing the weights openly is to let the many humanoid hardware makers focus on robots while sharing a common foundation model and simulation tooling. The company pitched the release against a backdrop of what it called a global labor shortage "estimated at more than 50 million people" ^[1].

ELI5

Imagine a robot that needs to learn how to do chores. Normally you would have to program every single move by hand. GR00T N1 is like a starter brain you can download for free: it has already practiced watching people and doing tasks in a video-game-like simulator, so it already knows the basics of seeing things, understanding what you ask, and moving its hands. A company that builds a robot body can take this starter brain, show it a few dozen examples of a new chore, and the robot picks it up quickly instead of starting from zero.

What did NVIDIA and Jensen Huang say about GR00T N1?

At GTC 2025, Jensen Huang tied the launch to a broader claim that general-purpose robotics had arrived. "The age of generalist robotics is here," Huang said in the announcement. "With NVIDIA Isaac GR00T N1 and new data-generation and robot-learning frameworks, robotics developers everywhere will open the next frontier in the age of AI" ^[1].

NVIDIA's official framing of the model is equally direct: it calls GR00T N1 "the world's first open, fully customizable foundation model for generalized humanoid reasoning and skills" and "the first of a family of fully customizable models that NVIDIA will pretrain and release to worldwide robotics developers" ^[1]. NVIDIA has consistently positioned itself as a horizontal platform supplier to the humanoid industry rather than a robot maker, a stance Huang has repeated across subsequent keynotes.

Background: how did Project GR00T lead to N1?

NVIDIA's interest in humanoid robotics predates GR00T N1 by exactly one year. At GTC 2024 on March 18, 2024, Jensen Huang devoted a portion of his keynote to a project called GR00T, an acronym for Generalist Robot 00 Technology. The initial framing was that humanoids were the most exciting open problem in AI and that NVIDIA would attempt to play the role of a horizontal platform supplier across competing humanoid programs rather than building its own robot. The 2024 announcement named 1X Neo maker 1X Technologies, Agility Robotics, Apptronik, Boston Dynamics, Figure AI, Fourier Intelligence, Sanctuary AI, Unitree Robotics, and XPENG Robotics as early collaborators on the project ^[9].

The 2024 announcement was vague on what GR00T actually contained, and IEEE Spectrum's coverage by Evan Ackerman noted that the demonstrations were mostly aspirational and that fundamental questions, including whether the foundation model was trained on real robot data or on simulation, had not been answered ^[10]. Over the following year NVIDIA filled in the gap with a string of smaller releases on the Isaac platform: Isaac Lab for parallel reinforcement learning, OSMO for compute orchestration, Isaac Manipulator and Isaac Perceptor for robot arms and mobile robots, and the GR00T-Mimic synthetic data blueprint. By the time Huang took the stage at GTC 2025 in March, the company had a real model to put behind the GR00T name.

The broader strategic context is also worth keeping in mind. By early 2025 the humanoid race had crowded considerably. Tesla had begun deploying Optimus prototypes inside its factories, Figure 02 and Apptronik Apollo were running pilot deployments at BMW and Mercedes, and Chinese makers such as Unitree, XPENG, and Fourier had pushed prices for entry-level humanoids below 20,000 USD. NVIDIA's pitch with GR00T N1 was that none of these companies wanted to build a generalist brain in-house and that a shared foundation model, trained on data from many embodiments and shipped with simulation tooling, would let them focus on hardware and on the last mile of task fine-tuning ^[2].

How does the dual-system architecture work?

GR00T N1 is structured as a Vision-Language-Action model with two coupled networks that are jointly trained end-to-end. The vision-language module, labeled System 2 in NVIDIA's documentation, interprets camera images and natural-language instructions and produces tokens that summarize scene understanding and high-level intent. The diffusion module, labeled System 1, conditions on those tokens and on the robot's current proprioceptive state, and denoises a noisy action chunk into continuous motor commands. The labels are a deliberate reference to Daniel Kahneman's Thinking, Fast and Slow, with System 2 doing deliberate reasoning and System 1 producing reflexive motion ^[3]. In NVIDIA's words, "the vision-language module (System 2) interprets the environment through vision and language instructions" while "the subsequent diffusion transformer module (System 1) generates fluid motor actions in real time" ^[3].

Vision-language backbone (System 2)

In the original N1 release, the vision-language module is built on Eagle 2, an open vision-language model that uses a SigLIP-2 image encoder and a small language encoder; NVIDIA's developer materials describe the System 2 backbone as NVIDIA-Eagle paired with a roughly 1.7-billion-parameter SmolLM-class language model ^[3]^[4]^[20]. Images and language are encoded into a shared token sequence and run through the VLM transformer to produce per-token embeddings that act as conditioning signals for the action head ^[3]^[4]. The proprioceptive state of the robot, including joint positions, joint velocities, and end-effector poses, is encoded by a separate multilayer perceptron whose weights are indexed by an embodiment tag, so one model can serve different robot bodies with their own joint counts and limb layouts.
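The embodiment-tag idea can be illustrated with a minimal sketch in plain Python. The tag names, state sizes, token width, and the single-layer projection are all illustrative assumptions, not NVIDIA's implementation; the point is only that each robot body gets its own weights while every body maps into the same token width.

```python
import random

EMBED_DIM = 8  # shared token width; an assumed, illustrative value


def make_linear(in_dim, out_dim, rng):
    """Create a random weight matrix (out_dim x in_dim) and a zero bias."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias


def linear_relu(x, layer):
    """Apply y = relu(Wx + b) to a plain list of floats."""
    weights, bias = layer
    return [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


class EmbodimentStateEncoder:
    """One projection per embodiment tag, all mapping into the same width."""

    def __init__(self, state_dims, seed=0):
        rng = random.Random(seed)
        # Hypothetical tags -> separate weights sized for each robot's state.
        self.layers = {tag: make_linear(dim, EMBED_DIM, rng)
                       for tag, dim in state_dims.items()}

    def encode(self, tag, state):
        """Look up the weights for this embodiment and project its state."""
        return linear_relu(state, self.layers[tag])


encoder = EmbodimentStateEncoder({"humanoid_example": 29, "arm_example": 7})
humanoid_token = encoder.encode("humanoid_example", [0.0] * 29)
arm_token = encoder.encode("arm_example", [0.1] * 7)
```

Both calls return a vector of length `EMBED_DIM` even though the input state sizes differ, which is what lets a shared downstream transformer consume states from different bodies.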

Action head (System 1)

The action head is a diffusion transformer trained with flow matching rather than the more common DDPM objective. During training, ground-truth action chunks are corrupted by random interpolation between the clean action and Gaussian noise, and the network learns to predict the velocity that carries the corrupted sample back toward the clean action; at inference, the model starts from Gaussian noise and repeatedly applies its predicted velocity to move that sample toward a clean trajectory. The transformer interleaves self-attention over proprioception and action tokens with cross-attention to the vision-language embeddings, and diffusion step conditioning is handled through adaptive layer normalization, a common pattern in DiT-style image generators ^[4]. The output is a sequence of action vectors mapped to the relevant robot's degrees of freedom, decoded by another per-embodiment MLP.
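The flow-matching recipe can be sketched in a few lines of plain Python. This is a toy under stated assumptions: the interpolation convention (noise at tau=0, clean action at tau=1), the Euler integrator, the step count of 4, and the `oracle_velocity` stand-in for the trained network are all illustrative choices, not details taken from NVIDIA's code.

```python
import random


def corrupt(action, noise, tau):
    """Interpolate between noise (tau=0) and the clean action chunk (tau=1)."""
    return [(1.0 - tau) * n + tau * a for a, n in zip(action, noise)]


def target_velocity(action, noise):
    """Regression target for the velocity under the interpolation above."""
    return [a - n for a, n in zip(action, noise)]


def oracle_velocity(x, tau, action):
    """Stand-in for the trained network: exact velocity toward one known action."""
    return [(a - xi) / (1.0 - tau) for a, xi in zip(action, x)]


def sample(action_dim, velocity_fn, num_steps=4, seed=0):
    """Euler-integrate the velocity field from Gaussian noise (tau=0) to tau=1."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(action_dim)]
    for k in range(num_steps):
        tau = k / num_steps
        v = velocity_fn(x, tau)
        x = [xi + vi / num_steps for xi, vi in zip(x, v)]
    return x


clean_action = [0.2, -0.5, 1.0]  # one flattened action chunk (made-up values)
noise = [1.0, 1.0, 1.0]
halfway = corrupt(clean_action, noise, tau=0.5)
recovered = sample(3, lambda x, tau: oracle_velocity(x, tau, clean_action))
```

With the oracle in place of a learned network, `recovered` lands on `clean_action` up to floating-point error. A real action head would replace `oracle_velocity` with a transformer conditioned on the vision-language embeddings and proprioceptive state, trained by regressing onto `target_velocity`.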

The design borrows several ideas from earlier VLA models, most notably Physical Intelligence's π0 and ALOHA-style imitation learning systems built around action chunking. What is distinct about GR00T N1 is that the entire stack is open and that NVIDIA explicitly designed it for cross-embodiment use: a single checkpoint can be fine-tuned on a Fourier GR-1 humanoid, a Franka Panda arm, a Galaxea R1 mobile manipulator, or a WidowX research arm without changing the core model code ^[3]^[8].

How was GR00T N1 trained?

The data strategy for GR00T N1 is what NVIDIA calls a heterogeneous mixture. The model is trained on three classes of trajectories: real robot teleoperation collected on Fourier GR-1 humanoids and other partner platforms; egocentric video of humans performing manipulation tasks; and synthetic robot data generated in Isaac Lab with the GR00T-Mimic and GR00T-Dreams blueprints ^[3]. NVIDIA reports that more than 750,000 synthetic trajectories were produced in just 11 hours of GPU time using GR00T-Mimic, which the company says is equivalent to about 6,500 hours, or nine continuous months, of human demonstration data, and that combining this synthetic data with real data delivered a 40 percent performance boost for GR00T N1 compared with using real data alone ^[1]^[2]^[20].
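The quoted synthetic-data figures can be sanity-checked with simple arithmetic. The sketch below uses only the numbers stated above; the 30-day month is an assumption made for the conversion, and 750,000 is treated as a lower bound.

```python
TRAJECTORIES = 750_000        # lower bound quoted above ("more than 750,000")
GENERATION_HOURS = 11         # reported generation time
EQUIVALENT_DEMO_HOURS = 6_500 # reported human-demonstration equivalent

trajectories_per_hour = TRAJECTORIES / GENERATION_HOURS
equivalent_days = EQUIVALENT_DEMO_HOURS / 24
equivalent_months = equivalent_days / 30  # assumes 30-day months
seconds_per_trajectory = EQUIVALENT_DEMO_HOURS * 3600 / TRAJECTORIES
hours_ratio = EQUIVALENT_DEMO_HOURS / GENERATION_HOURS

print(f"{trajectories_per_hour:,.0f} trajectories generated per hour")
print(f"{equivalent_days:.1f} days, or about {equivalent_months:.1f} months, of demonstrations")
print(f"{seconds_per_trajectory:.1f} s of demonstration per trajectory")
print(f"{hours_ratio:.0f}x more demonstration-hours than generation-hours")
```

The conversion gives roughly 271 days, which matches the "nine continuous months" claim, and implies an average of about 31 seconds of demonstration per trajectory.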

GR00T-Mimic, released alongside N1 in March 2025, is a teleoperation amplifier. A human operator wearing an Apple Vision Pro headset teleoperates a simulated robot in NVIDIA Isaac Lab, and GR00T-Mimic takes a small number of those demonstrations and procedurally generates many more variants by randomizing scene geometry, lighting, friction, and object placements. GR00T-Dreams, announced two months later at Computex 2025, goes further: it uses the NVIDIA Cosmos family of world foundation models, specifically Cosmos Predict and Cosmos Reason, to generate completely novel manipulation trajectories from a single image and language prompt rather than augmenting an existing demonstration ^[11]^[12].
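The amplification step described for GR00T-Mimic resembles ordinary domain randomization, which can be sketched as follows. The `SceneConfig` fields, the randomization ranges, and the `amplify` helper are hypothetical names invented for illustration; this is not the GR00T-Mimic API, and a real pipeline would also have to regenerate and validate the robot trajectory for each variant in simulation.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SceneConfig:
    """Hypothetical scene parameters attached to one teleoperated demo."""
    object_x: float
    object_y: float
    friction: float
    light_intensity: float


def amplify(seed_scene, num_variants, seed=0):
    """Return randomized copies of a seed scene; all ranges are illustrative."""
    rng = random.Random(seed)
    variants = []
    for _ in range(num_variants):
        variants.append(replace(
            seed_scene,
            # Jitter the object placement by up to 5 cm in each axis.
            object_x=seed_scene.object_x + rng.uniform(-0.05, 0.05),
            object_y=seed_scene.object_y + rng.uniform(-0.05, 0.05),
            # Resample surface friction and lighting from fixed ranges.
            friction=rng.uniform(0.4, 1.0),
            light_intensity=rng.uniform(0.5, 1.5),
        ))
    return variants


demo_scene = SceneConfig(object_x=0.30, object_y=0.10, friction=0.7, light_intensity=1.0)
variants = amplify(demo_scene, num_variants=100)
```

One recorded demonstration becomes 100 scene variants here; the fixed seed keeps the generated set reproducible.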

The full pretraining run for the first N1 checkpoint was carried out on H100 GPUs. NVIDIA has not published the exact compute budget for the original N1 release, but the more detailed N1.5 model card lists 250,000 steps on roughly 1,000 H100s with a global batch size of 16,384, which gives a rough sense of the scale at which subsequent versions have been trained ^[5]. Embodiment-specific post-training is typically much cheaper, on the order of 10,000 to 30,000 steps with a smaller batch, and NVIDIA recommends starting from as few as 20 to 40 demonstrations of a new task.

What are the different GR00T N1 versions?

GR00T N1 has gone through four numbered releases since launch. Each one swapped or upgraded major architectural components while keeping the embodiment tags and dataset format stable so existing post-training pipelines kept working.

Version	Release date	Headline changes
GR00T N1	March 17 to 18, 2025 (GTC)	Initial 2B-parameter VLA on Eagle 2 VLM and a 16-layer flow-matching DiT, trained on GR-1 teleop, human egocentric video, and Isaac Lab synthetic data ^[1]^[3]
GR00T N1.5	June 11, 2025	VLM upgraded to Eagle 2.5 (2.1B parameters) and frozen during training, simplified MLP adapter with layer norm, new Future Latent Representation Alignment (FLARE) objective added to flow matching, large jumps in language following on real GR-1 (46.6 to 93.3 percent) ^[5]^[13]
GR00T N1.6	September 29, 2025	VLM replaced with an internal Cosmos-Reason-2B variant supporting native aspect ratios, DiT doubled in size to 32 layers, action space switched to state-relative action chunks for most embodiments, top 4 VLM layers unfrozen during pretraining ^[6]^[14]
GR00T N1.7	April 17, 2026 (Early Access)	VLM upgraded to Cosmos-Reason2-2B, EgoScale pretraining on 20,854 hours of human egocentric video across 20+ task categories, relative end-effector action space shared between humans and robots, support for 22 degree-of-freedom dexterous hands ^[7]^[8]

The N1.5 update is interesting because of how it was made rather than what it added. NVIDIA Research reported that the entire N1.5 development cycle, including new data generation, took about 36 hours of wall-clock time using the GR00T-Dreams blueprint. The same work, the team argued, would have taken close to three months with manual teleoperation collection. That figure became one of the central talking points at Computex 2025 and has been repeated by Huang in several subsequent keynotes as evidence that synthetic data has reached a tipping point in robotics ^[11]^[12].

N1.6 is the first version that does not use any third-party VLM at the front of the stack. NVIDIA's Cosmos-Reason model family was originally released for video reasoning and scene description, but a 2-billion parameter variant of Cosmos-Reason was repurposed as the System 2 component in N1.6, which let the team support flexible image resolutions natively and unfreeze the top layers of the VLM during pretraining without destabilizing training ^[6]. The action head was doubled in depth and switched to predicting state-relative chunks rather than absolute joint targets, which the team says produces less jittery motion and adapts better to imperfect starting positions.
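The difference between absolute joint targets and state-relative chunks can be shown with a small sketch. It assumes "state-relative" means each step in the chunk is stored as an offset from the joint state at the start of the chunk; the text above does not give the exact formula, so treat this as one plausible reading with made-up numbers.

```python
def to_state_relative(absolute_chunk, current_state):
    """Express each absolute joint target as an offset from the current state."""
    return [[target - q for target, q in zip(step, current_state)]
            for step in absolute_chunk]


def to_absolute(relative_chunk, current_state):
    """Recover joint targets by adding the offsets back onto the current state."""
    return [[delta + q for delta, q in zip(step, current_state)]
            for step in relative_chunk]


state = [0.10, -0.20]  # current joint positions in radians (made-up values)
absolute = [[0.12, -0.18], [0.15, -0.15], [0.20, -0.10]]
relative = to_state_relative(absolute, state)

# Replaying the same relative chunk from a slightly different start shifts
# the whole trajectory instead of jumping back to the old absolute targets.
shifted = to_absolute(relative, [0.13, -0.20])
```

With absolute targets, a robot starting at 0.13 rad would first be commanded back to 0.12 rad; with the relative chunk, its first target becomes 0.15 rad, which is consistent with the claim that relative chunks tolerate imperfect starting positions.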

N1.7 then pushed in two directions: deeper reasoning by upgrading the VLM to Cosmos-Reason2-2B, and dramatically more pretraining data by adding 20,854 hours of human egocentric video. NVIDIA calls this EgoScale pretraining, and the accompanying technical blog reports what the team describes as the first scaling law for robot dexterity: average task completion rates rise approximately linearly when the egocentric video budget goes from one thousand to twenty thousand hours, with a roughly 2x improvement across that range ^[7]. N1.7 also introduces a relative end-effector action space that is shared between human videos and robot bodies, which is what makes large-scale egocentric video usable as training data in the first place.
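Why a relative end-effector representation can be shared across bodies is easier to see with a toy example. The sketch below uses planar poses (x, y, heading) as a deliberate simplification of full 3D poses, and the `relative_motion` helper is a hypothetical name; it is not taken from the N1.7 code.

```python
import math


def relative_motion(pose_a, pose_b):
    """Motion from pose_a to pose_b, expressed in pose_a's own frame.

    Poses are planar (x, y, heading in radians); headings are not wrapped.
    """
    dx, dy = pose_b[0] - pose_a[0], pose_b[1] - pose_a[1]
    cos_h, sin_h = math.cos(pose_a[2]), math.sin(pose_a[2])
    forward = cos_h * dx + sin_h * dy   # displacement along the heading
    left = -sin_h * dx + cos_h * dy     # displacement to the left of the heading
    return (forward, left, pose_b[2] - pose_a[2])


# A human wrist and a robot gripper in different places and orientations,
# each moving 10 cm straight ahead along its own heading.
human = relative_motion((0.0, 0.0, 0.0), (0.10, 0.0, 0.0))
robot = relative_motion((2.0, 1.0, math.pi / 2), (2.0, 1.10, math.pi / 2))
```

Although the absolute poses differ, both motions reduce to approximately (0.10, 0, 0) in the relative representation, so a "move forward 10 cm" label learned from human video describes the same action for the robot.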

What can GR00T N1 do, and how well does it benchmark?

NVIDIA positions GR00T N1 and its successors as generalist robot models that can pick objects up, put them down, transfer items between hands, follow language instructions, and chain those primitives into longer multi-step tasks. The reference scenarios in the launch materials are warehouse-style: bin picking, sorting, packaging, kitting, and inspection. The internal benchmarks the company reports against include RoboCasa (24 simulated mobile manipulation tasks), Digital Cousin GR-1 (24 GR-1 humanoid manipulation tasks), Language Table, DexMG (dexterous manipulation), and DreamGen (12 new manipulation verbs introduced specifically to stress generalization) ^[3]^[5].

The original N1 paper reported that GR00T N1 outperformed several state-of-the-art imitation learning baselines, including ACT, Diffusion Policy, and OpenVLA, on these benchmarks when fine-tuned with comparable amounts of data, and that the model showed reasonable zero-shot transfer to embodiments it had seen during pretraining ^[3]. With N1.5, NVIDIA reported that the success rate on Language Table climbed from 52.8 percent to 93.2 percent, that real GR-1 language following went from 46.6 percent to 93.3 percent, and that RoboCasa improved from 17.4 percent to 47.5 percent with 30 demonstrations. The DreamGen benchmark, which is designed to test new verbs, went from 13.1 percent to 38.3 percent ^[5]^[13].

Benchmark (N1 vs N1.5)	GR00T N1	GR00T N1.5
Real GR-1 language following	46.6%	93.3%
Language Table (simulated)	52.8%	93.2%
RoboCasa (30 demos)	17.4%	47.5%
DreamGen (new verbs)	13.1%	38.3%
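The table can be turned into absolute and relative gains with a few lines of Python. The figures are copied from the rows above; the only choices made here are the rounding precision and the variable names.

```python
results = {
    "Real GR-1 language following": (46.6, 93.3),
    "Language Table (simulated)": (52.8, 93.2),
    "RoboCasa (30 demos)": (17.4, 47.5),
    "DreamGen (new verbs)": (13.1, 38.3),
}

gains = {}
for name, (n1, n1_5) in results.items():
    gains[name] = {
        "points": round(n1_5 - n1, 1),  # absolute gain in percentage points
        "ratio": round(n1_5 / n1, 2),   # relative improvement factor
    }

for name, gain in gains.items():
    print(f"{name}: +{gain['points']} pts, {gain['ratio']}x")
```

The largest absolute gain is on real GR-1 language following (about 46.7 points), while the largest relative gains are on the two benchmarks that started lowest, RoboCasa and DreamGen, at close to 3x each.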

Live demos at GTC and Computex tended to feature 1X Neo and Fourier GR-1 humanoids running short manipulation sequences in lightly cluttered environments rather than fully autonomous open-ended tasks. The most-cited public demo from GTC 2025 used a Disney BDX-style robot, similar in form factor to the Star Wars droids that the BDX series is modeled on, walking onto the stage with Huang as a deliberate reference to the project name. The on-stage segment was a tele-controlled demonstration rather than full autonomy, a fact that NVIDIA was upfront about in its press materials ^[1]^[10].

Who are the partners and where is it deployed?

The public partner list has shifted over the four releases. The 2024 Project GR00T launch named nine humanoid programs as collaborators. The March 2025 GR00T N1 announcement listed 1X Technologies, Agility Robotics, Boston Dynamics, Mentee Robotics, and NEURA Robotics as the early-access partners with hands-on integration of the N1 weights ^[1]. By Computex 2025, the list of companies adopting Isaac and GR00T technologies had expanded to include Fourier, Foxlink, Galbot, General Robotics, Skild AI, XPENG Robotics, AeiRobot, and Lightwheel in addition to the original partners. By GTC fall 2025, when N1.6 was announced, NVIDIA was naming Figure AI, Franka Robotics, Hexagon, Solomon, and Techman Robot as additional adopters of Isaac Lab and Cosmos tooling ^[11]^[14].

It is worth distinguishing two kinds of partnership. The first is companies that explicitly use GR00T N1 weights as a starting point for their robots' policies, which from public statements appears to be a smaller and more academic group, currently led by Fourier Intelligence (whose GR-1 humanoid is NVIDIA's main internal test platform), 1X Technologies, Agility Robotics, and Mentee Robotics. The second is companies that use the broader NVIDIA Isaac stack, including Isaac Sim for simulation, Isaac Lab for reinforcement learning training, Newton for physics, and Jetson Thor as the on-robot inference computer, without necessarily building on top of GR00T weights. The second group is much larger and includes most of the major humanoid programs, including Figure AI, whose own Helix model is a competing in-house VLA system ^[1]^[11]^[14].

Robot platform	Maker	Relationship to GR00T
Fourier GR-1	Fourier Intelligence	Primary internal test platform; benchmarks for every release run on GR-1 ^[1]^[5]
1X Neo	1X Technologies	Early-access partner since N1; collaborates on Cosmos and Isaac Lab integration ^[1]^[11]
Agility Digit	Agility Robotics	Early-access partner; commercially deployed at GXO warehouses on a separate policy stack ^[1]^[9]
Boston Dynamics Atlas Electric	Boston Dynamics	Project GR00T collaborator since 2024; uses Isaac Sim and Cosmos ^[9]^[14]
Mentee MenteeBot	Mentee Robotics	Early-access partner for N1 ^[1]
NEURA 4NE-1	NEURA Robotics	Early-access partner for N1 ^[1]
Galaxea R1 Pro	Galaxea	Included in N1.6 training mix via BEHAVIOR suite ^[6]
Unitree G1	Unitree Robotics	Included in N1.6 and N1.7 training mix ^[6]^[7]
AGIBot Genie 1	AGIBot	Included in N1.6 and N1.7 training mix ^[6]^[7]
Bimanual YAM	Open hardware	Included in N1.6 and N1.7 training mix ^[6]^[7]
Franka Panda	Franka Robotics	Supported via LIBERO benchmark and DROID dataset checkpoints ^[8]

Figure AI is named as an Isaac Lab adopter but its own VLA work runs on a separate stack. Tesla has not been publicly involved with GR00T at any point, and Apple, Samsung, and the major Korean and Japanese robotics firms have not been listed as partners either, although NVIDIA Jetson hardware shows up in many of those programs.

How does GR00T N1 fit into NVIDIA's robotics stack?

GR00T N1 is one piece of a wider NVIDIA stack for what the company has started calling physical AI. The most important sibling components are Isaac Sim, Isaac Lab, Cosmos, Newton, GR00T-Mimic, GR00T-Dreams, and Jetson Thor.

Isaac Sim is the underlying simulation environment, built on Omniverse and capable of running thousands of simulated robots in parallel on a single GPU node. Isaac Lab is the higher-level reinforcement learning and imitation learning framework that uses Isaac Sim as its physics backend, and Isaac Lab 2.3 added a dexterous grasping workflow specifically aimed at humanoid hands and a policy evaluation framework called Isaac Lab Arena ^[14]. Newton, announced jointly by NVIDIA, Google DeepMind, and Disney Research at GTC 2025, is an open-source GPU-accelerated physics engine designed to be more accurate than the older MuJoCo and PhysX simulators on contact-rich manipulation. NVIDIA also released MuJoCo-Warp around the same time, claiming a 70x speedup on robotics machine learning workloads compared with the reference CPU MuJoCo implementation ^[1]^[14].

GR00T-Mimic and GR00T-Dreams are the two synthetic data blueprints discussed earlier. GR00T-Mimic amplifies a small number of human teleoperation demonstrations into a much larger synthetic set inside Isaac Lab; GR00T-Dreams generates entirely new trajectories from a single image and a text prompt using Cosmos world models. Cosmos itself is a family of NVIDIA foundation models for physical AI, separate from GR00T, that includes Cosmos Predict (a world model that predicts future video frames from past frames), Cosmos Reason (a reasoning VLM for scene description and synthetic data curation), and Cosmos Transfer (a sim-to-real photorealism model). Cosmos Predict 2.5 and Cosmos Transfer 2.5 were released alongside N1.6, with claims of longer 30-second video horizons and a 3.5x smaller Transfer model ^[12]^[14].

On the inference side, Jetson AGX Thor is the on-robot computer that NVIDIA sells for running the complete GR00T stack at runtime. Thor is a Blackwell-class chip in a small form factor, and NVIDIA recommends running the System 1 diffusion head locally on Thor while optionally offloading System 2 reasoning to a nearby cloud GPU when latency budgets allow. The base GR00T N1.7 model supports NVIDIA Ampere, Hopper, Lovelace, Blackwell, and Jetson hardware, which covers nearly the full range of GPU generations that NVIDIA currently sells ^[7]^[11].
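The split between a fast local action head and a slower, possibly remote, reasoning module can be sketched as a control loop that reuses cached System 2 output between refreshes. Everything here is a hypothetical stand-in: the stub functions, the refresh interval of 5 ticks, and the dummy command are illustrative, and the sketch says nothing about the rates NVIDIA actually uses.

```python
def system2_reason(image, instruction):
    """Placeholder for the slow vision-language module (hypothetical stub)."""
    return {"instruction": instruction, "frame": image}


def system1_act(context, proprio):
    """Placeholder for the fast action head (hypothetical stub)."""
    return [p * 0.9 for p in proprio]  # dummy command derived from the state


def control_loop(frames, instruction, proprio, refresh_every=5):
    """Run System 1 every tick; refresh the System 2 context every few ticks."""
    context, refreshes, commands = None, 0, []
    for tick, frame in enumerate(frames):
        if context is None or tick % refresh_every == 0:
            # In a deployed system this call could go to an off-robot GPU.
            context = system2_reason(frame, instruction)
            refreshes += 1
        # Toy dynamics: treat the command as the next proprioceptive state.
        proprio = system1_act(context, proprio)
        commands.append(proprio)
    return commands, refreshes


commands, refreshes = control_loop(range(20), "pick up the cup", [1.0, -1.0])
```

Over 20 ticks the action head runs 20 times while the reasoning stub is called only 4 times, which is the property that makes offloading the slower module tolerable when latency budgets allow.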

Is GR00T N1 open source?

GR00T N1 was the first major robot foundation model to be released under what is effectively a commercial open license. The model weights ship under the NVIDIA Open Model License Agreement, which permits commercial use with attribution and a small set of restrictions around model identification and acceptable use. The surrounding training, inference, and fine-tuning code in the NVIDIA/Isaac-GR00T GitHub repository is licensed under Apache 2.0 ^[8].

The one wrinkle is that the original GR00T-N1-2B model card on Hugging Face was initially published under NVIDIA's older non-commercial license before being relicensed to the Open Model License Agreement. Later checkpoints, including N1.5, N1.6, and N1.7, were published under the Open Model License from day one. NVIDIA also publishes evaluation datasets and synthetic training corpora, most prominently the PhysicalAI-Robotics-GR00T-X-Embodiment-Sim dataset on Hugging Face, under release-specific data licenses that are generally permissive for research and commercial use ^[4]^[8].

The licensing posture is the obvious contrast with the competition. Tesla's Optimus stack is closed; Figure's Helix is closed; Physical Intelligence released checkpoints for some of its earlier models including π0 and π0.5, but its newer policies are gated; and most academic VLAs, including OpenVLA, are open under research licenses that complicate commercial deployment. GR00T is the one mainstream commercial humanoid VLA that anyone can download, fine-tune, and ship in a product, which is much of the reason it has been so widely adopted as a starting point in the field ^[2]^[8].

How was GR00T N1 received?

Reaction to GR00T N1 has been broadly positive but not uncritical. The initial 2024 Project GR00T announcement was widely seen as more aspirational than substantive; IEEE Spectrum and several other outlets pointed out that the demos were largely tele-operated and that the underlying model was not yet public ^[10]. The March 2025 N1 release, with weights, training data, and benchmarks on Hugging Face and GitHub, changed that conversation substantially. The Robot Report, Hackster.io, and most of the major robotics newsletters treated the release as a serious technical contribution and the first concrete evidence that NVIDIA was willing to commit to the platform role it had described a year earlier ^[2]^[12].

Within the academic robot learning community, the response has been more nuanced. The dual-system VLM plus diffusion-head architecture is not new (Physical Intelligence's π0 had used a similar split a few months earlier), and several researchers noted that GR00T N1's benchmark wins were narrow or depended on data mixes that overlapped with the test sets. The successive updates have addressed some of those criticisms. N1.7's EgoScale pretraining and the dexterity scaling law in particular have been treated as one of the more interesting empirical results in robot learning in 2026, since they suggest that adding more human video produces predictable improvements rather than diminishing returns ^[7]^[15].

The deeper structural reaction has been about NVIDIA's positioning. The company is simultaneously the dominant supplier of training hardware (H100 and Blackwell GPUs), the dominant supplier of inference hardware for robots (Jetson Thor), the publisher of the leading open simulation stack (Isaac), and now the publisher of the leading open foundation model. That vertical position worries some observers, who note that even competitors who would prefer not to standardize on NVIDIA tooling have very limited alternatives, especially for the simulation and synthetic data half of the stack. Others view it as a useful counterweight to closed efforts by Tesla, Figure, and Physical Intelligence, especially given that GR00T weights are genuinely downloadable rather than just "open" in the more diluted sense some other large companies use ^[2]^[11].

The model has also become a common starting point in research papers. By early 2026 dozens of arXiv submissions cited GR00T N1 or N1.5 as a base, and several follow-on systems including SmolVLA and various open-source clones from university labs were explicitly framed as smaller or specialized variants. Whether GR00T N1 ends up being remembered as the BERT moment for humanoid robotics or just as a useful intermediate step depends mainly on whether the deployment claims attached to it, particularly around warehouse and manufacturing labor, hold up in production over the next several years.

References

1. NVIDIA Newsroom. "NVIDIA Announces Isaac GR00T N1, the World's First Open Humanoid Robot Foundation Model, and Simulation Frameworks to Speed Robot Development." March 18, 2025. https://nvidianews.nvidia.com/news/nvidia-isaac-gr00t-n1-open-humanoid-robot-foundation-model-simulation-frameworks
2. Edge AI and Vision Alliance. "NVIDIA Announces Isaac GR00T N1." March 2025. https://www.edge-ai-vision.com/2025/03/nvidia-announces-isaac-gr00t-n1-the-worlds-first-open-humanoid-robot-foundation-model-and-simulation-frameworks-to-speed-robot-development/
3. NVIDIA GEAR Team. "GR00T N1: An Open Foundation Model for Generalist Humanoid Robots." arXiv:2503.14734, March 18, 2025. https://arxiv.org/abs/2503.14734
4. NVIDIA. "GR00T-N1-2B model card." Hugging Face. https://huggingface.co/nvidia/GR00T-N1-2B
5. NVIDIA Research GEAR Lab. "GR00T N1.5." Research website, June 11, 2025. https://research.nvidia.com/labs/gear/gr00t-n1_5/
6. NVIDIA Research GEAR Lab. "GR00T N1.6." Research website, December 2025. https://research.nvidia.com/labs/gear/gr00t-n1_6/
7. NVIDIA. "NVIDIA Isaac GR00T N1.7: Open Reasoning VLA Model for Humanoid Robots." Hugging Face blog, April 17, 2026. https://huggingface.co/blog/nvidia/gr00t-n1-7
8. NVIDIA. "Isaac-GR00T GitHub repository." https://github.com/NVIDIA/Isaac-GR00T
9. NVIDIA Newsroom. "NVIDIA Announces Project GR00T Foundation Model for Humanoid Robots and Major Isaac Robotics Platform Update." March 18, 2024. https://nvidianews.nvidia.com/news/foundation-model-isaac-robotics-platform
10. Evan Ackerman. "Nvidia Announces GR00T, a Foundation Model for Humanoids." IEEE Spectrum, March 18, 2024. https://spectrum.ieee.org/nvidia-gr00t-ros
11. NVIDIA Newsroom. "NVIDIA Powers Humanoid Robot Industry With Cloud-to-Robot Computing Platforms for Physical AI." Computex 2025, May 18, 2025. https://nvidianews.nvidia.com/news/nvidia-powers-humanoid-robot-industry-with-cloud-to-robot-computing-platforms-for-physical-ai
12. NVIDIA Developer Blog. "Enhance Robot Learning with Synthetic Trajectory Data Generated by World Foundation Models." 2025. https://developer.nvidia.com/blog/enhance-robot-learning-with-synthetic-trajectory-data-generated-by-world-foundation-models/
13. NVIDIA. "GR00T-N1.5-3B model card." Hugging Face, June 2025. https://huggingface.co/nvidia/GR00T-N1.5-3B
14. NVIDIA Newsroom. "NVIDIA Accelerates Robotics Research and Development With New Open Models and Simulation Libraries." September 29, 2025. https://nvidianews.nvidia.com/news/nvidia-accelerates-robotics-research-and-development-with-new-open-models-and-simulation-libraries
15. The Robot Report. "NVIDIA Releases Cloud-to-Robot Computing Platforms for Physical AI, Humanoid Development." May 2025. https://www.therobotreport.com/nvidia-cloud-robot-computing-platforms-physical-ai-humanoid-development/
16. NVIDIA Developer Blog. "Building Generalist Humanoid Capabilities with NVIDIA Isaac GR00T N1.6 Using a Sim-to-Real Workflow." 2025. https://developer.nvidia.com/blog/building-generalist-humanoid-capabilities-with-nvidia-isaac-gr00t-n1-6-using-a-sim-to-real-workflow/
17. NVIDIA Developer. "Isaac GR00T - Generalist Robot 00 Technology." Developer portal. https://developer.nvidia.com/isaac/gr00t
18. Hackster.io. "NVIDIA Isaac GROOT N1 Is an Open Source Foundation Model for Accelerated Humanoid Robot Development." March 2025. https://www.hackster.io/news/nvidia-isaac-groot-n1-is-an-open-source-foundation-model-for-accelerated-humanoid-robot-development-effa04c90231
19. Hackster.io. "NVIDIA's Robots Dream of Trajectories, Not Electric Sheep, with GR00T-Dreams." 2025. https://www.hackster.io/news/nvidia-s-robots-dream-of-trajectories-not-electric-sheep-with-gr00t-dreams-1f12db16c80f
20. NVIDIA Developer Blog. "Accelerate Generalist Humanoid Robot Development with NVIDIA Isaac GR00T N1." March 2025. https://developer.nvidia.com/blog/accelerate-generalist-humanoid-robot-development-with-nvidia-isaac-gr00t-n1/

