Helix (Figure AI)
Last reviewed
Jun 4, 2026
Sources
16 citations
Review status
Source-backed
Revision
v1 · 1,815 words
Helix is a Vision-Language-Action (VLA) model for generalist humanoid control, developed in-house by Figure AI, the Sunnyvale, California humanoid-robot company founded by Brett Adcock. Announced on February 20, 2025, Helix is a single neural network that maps onboard camera images and natural-language instructions directly to continuous motor commands, letting Figure's robots perform tasks they were never explicitly trained for. Figure unveiled it 16 days after publicly ending its collaboration with OpenAI, positioning Helix as proof that the company could build competitive embodied AI entirely on its own. Figure describes Helix as the first VLA to output high-rate continuous control of the entire humanoid upper body, the first to run two robots from one set of weights, and the first to run entirely onboard the robot on low-power embedded GPUs. (This Helix is Figure's robot-control model and is unrelated to other products that share the name, such as the consumer-genomics company Helix or Salesforce's older Helix data product.)
In February 2024 Figure raised a $675 million Series B at a $2.6 billion valuation and simultaneously signed a collaboration agreement with OpenAI to build next-generation AI models for its humanoids, with the OpenAI Startup Fund among the investors. The partnership lasted less than a year. On February 4, 2025, Adcock announced he had decided to leave the agreement, writing that Figure had "made a major breakthrough on fully end-to-end robot AI, built entirely in-house" and promising to show "something no one has ever seen on a humanoid" within 30 days. Helix arrived 16 days later.
Adcock later expanded on the split. In remarks reported on March 31, 2026, he said the OpenAI collaboration delivered "very little" technical value, that he struggled to get OpenAI's team engaged in the hands-on work robotics requires, and that he ended the relationship after OpenAI signaled it would pursue humanoid robots internally. His broader thesis was that solving embodied AI at scale requires vertical integration of the robot AI stack rather than bolting a general-purpose language model onto robot hardware.
Helix uses a dual-system design that Figure compares to the "thinking fast and slow" split between deliberate and reflexive cognition. The two systems are trained together, end to end, but run at very different speeds so the model can reason about a scene while still reacting in real time.
| System | Role | Model | Frequency |
|---|---|---|---|
| System 2 (S2) | Slow semantic "thinking": scene understanding and language comprehension | ~7B-parameter open-source, open-weight internet-pretrained vision-language model | 7-9 Hz |
| System 1 (S1) | Fast reactive visuomotor control | ~80M-parameter cross-attention encoder-decoder transformer | 200 Hz |
System 2 ingests monocular robot images plus robot state and the language goal, then compresses its understanding into a single continuous latent vector. System 1 takes that latent vector as conditioning and produces full upper-body control commands, including desired wrist poses, finger flexion and abduction, and torso and head orientation targets. Because S2 handles the slow generalization problem and S1 handles fast reaction, the same unified weights drive a wide range of behaviors without task-specific fine-tuning. The control action space spans 35 degrees of freedom.
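The two-rate split described above can be illustrated with a minimal Python sketch. It is not Figure's implementation: `s2_update`, `s1_act`, and `run` are hypothetical stand-ins, the 8 Hz rate is an assumed midpoint of the stated 7-9 Hz range, and the loop is synchronous for simplicity, whereas a real system would more plausibly run the two models concurrently.

```python
from dataclasses import dataclass
from typing import List, Optional

S1_HZ = 200      # fast visuomotor policy rate (from the table)
S2_HZ = 8        # ASSUMPTION: midpoint of the stated 7-9 Hz range
ACTION_DIM = 35  # degrees of freedom in the action space


@dataclass
class Latent:
    step: int            # S1 tick at which this latent was produced
    values: List[float]  # placeholder for the continuous latent vector


def s2_update(step: int, instruction: str) -> Latent:
    """Hypothetical stand-in for the slow vision-language model."""
    return Latent(step=step, values=[float(len(instruction))])


def s1_act(latent: Latent, observation: float) -> List[float]:
    """Hypothetical stand-in for the fast policy: one command per tick."""
    return [latent.values[0] + observation] * ACTION_DIM


def run(seconds: float, instruction: str):
    """Simulate the loop: S1 acts every tick, S2 refreshes the latent rarely."""
    s1_ticks_per_s2 = S1_HZ // S2_HZ  # 25 fast ticks per slow update
    latent: Optional[Latent] = None
    s2_updates = 0
    actions = []
    for tick in range(int(seconds * S1_HZ)):
        if tick % s1_ticks_per_s2 == 0:
            latent = s2_update(tick, instruction)  # slow path
            s2_updates += 1
        # Fast path: always uses the most recent latent, even if it is stale.
        actions.append(s1_act(latent, observation=0.0))
    return actions, s2_updates
```

Under these assumptions, one simulated second yields 200 action vectors of 35 values each while the latent is refreshed only 8 times, which is the point of the design: the fast policy keeps reacting between slow semantic updates.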
A distinguishing engineering claim is that Helix runs entirely onboard the robot on dual low-power embedded GPUs, rather than streaming inference from a datacenter. Figure presents this as making the model immediately suitable for commercial deployment.
Figure reported that the original Helix was trained on roughly 500 hours of high-quality teleoperated robot behavior, a multi-robot, multi-operator dataset that the company says is a small fraction (under 5%) of the data volume used by some earlier VLA efforts. To attach language to that motion data, Figure used an auto-labeling pipeline in which a vision-language model generated "hindsight" natural-language instructions describing what the robot did, which were then paired with the recorded actions for training.
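The hindsight-labeling step can be sketched in Python as follows. This is an illustrative outline only: the `Clip` and `TrainingExample` types, the `hindsight_label` function, and the `fake_vlm` describer are all hypothetical names, and the describer merely stands in for a real vision-language model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Clip:
    frames: List[str]           # placeholder frame identifiers
    actions: List[List[float]]  # recorded teleoperation actions


@dataclass
class TrainingExample:
    instruction: str            # language label generated after the fact
    frames: List[str]
    actions: List[List[float]]


def hindsight_label(
    clips: List[Clip],
    describe: Callable[[List[str]], str],
) -> List[TrainingExample]:
    """Pair each recorded clip with an instruction generated in hindsight."""
    examples = []
    for clip in clips:
        # The describer sees what the robot did and names the instruction
        # that would have produced that behavior.
        instruction = describe(clip.frames)
        examples.append(TrainingExample(instruction, clip.frames, clip.actions))
    return examples


def fake_vlm(frames: List[str]) -> str:
    """ASSUMPTION: hypothetical stand-in for a vision-language model."""
    return f"instruction matching the behavior seen in {len(frames)} frames"
```

The recorded actions pass through unchanged; only the language label is synthesized, so the motion data needs no manual annotation.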
Figure says Helix is the first VLA to output high-rate continuous control of the full humanoid upper body, coordinating wrists, torso, head, and individual fingers at high degrees of freedom. Robots running Helix can pick up "virtually any small household object," including thousands of items they have never encountered before, when prompted in plain language. In demonstrations, asking a Figure 02 to "put these away" with unfamiliar groceries, or to "pick up the desert item" from a pile of clutter (selecting a toy cactus), elicited correct behavior with no object-specific training.
Helix was, by Figure's account, the first VLA to run simultaneously on two robots, with both sharing a single set of weights to solve a shared, long-horizon manipulation task using items they had not seen before. In the launch demo two Figure 02 units cooperated on a grocery storage task, handing objects to each other and closing drawers and a refrigerator door, coordinated through prompts such as "hand the bag of cookies to the robot on your right." Figure reported that the robots would visually confirm with each other before transferring items.
On February 26, 2025, Figure published work applying Helix to logistics package handling, including transferring packages between conveyor belts and orienting each package so its shipping label faces the scanner. The company reported that stereo vision raised throughput about 60% over a non-stereo baseline, that well-curated demonstration data outperformed larger uncurated datasets, and that a faster "sport mode" could exceed the speed of the human expert demonstrations. Figure also framed the results as evidence that a single policy could transfer across robot platforms with comparable manipulation performance.
On September 18, 2025, Figure announced Project Go-Big, described as a large-scale humanoid pretraining data-collection initiative intended to build one of the largest and most diverse datasets ever used to train humanoid robots. Its centerpiece is a partnership with Brookfield Asset Management, whose real-estate portfolio (Figure cites over 100,000 residential units plus hundreds of millions of square feet of commercial and logistics space) is used to capture human goal-directed behavior across many real-world environments.
A notable result tied to this effort is that Figure trained Helix using 100% egocentric (first-person) human video, with no robot demonstrations for the task, and reported that robots could then navigate cluttered real homes from conversational commands such as "go to the fridge" or "walk to the kitchen table." Figure presented this human-to-robot transfer, where a single Helix network outputs both navigation and manipulation end to end from language and pixels, as a first in humanoid robotics.
On January 27, 2026, Figure announced Helix 02 (also styled Helix-02), which extends control from the upper body to the entire robot so that walking, manipulation, and balance run as one continuous system, a capability Figure calls full-body autonomy. Helix 02 adds a third, faster tier to the hierarchy: a roughly 10M-parameter System 0 (S0) running at about 1 kHz for balance, contact, and whole-body coordination. S0 sits beneath the 200 Hz System 1, which turns perception into full-body joint targets, and the slower System 2, which reasons about goals.
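To make the three rates concrete, the following Python sketch counts how often each tier would fire on a shared 1 kHz clock. It is a simplified illustration, not Figure's scheduler: the function names are hypothetical, and the 8 Hz System 2 rate is an assumption carried over from the 7-9 Hz figure given for the original Helix, since no S2 rate is stated for Helix 02.

```python
from typing import Dict, List

S0_HZ = 1000  # balance and whole-body coordination (about 1 kHz)
S1_HZ = 200   # perception to full-body joint targets
S2_HZ = 8     # ASSUMPTION: not stated for Helix 02; taken from the 7-9 Hz range


def tiers_due(tick: int) -> List[str]:
    """Return which tiers update on a given 1 kHz tick."""
    due = ["S0"]                        # S0 runs on every tick
    if tick % (S0_HZ // S1_HZ) == 0:    # every 5th tick
        due.append("S1")
    if tick % (S0_HZ // S2_HZ) == 0:    # every 125th tick
        due.append("S2")
    return due


def count_updates(seconds: int) -> Dict[str, int]:
    """Count updates per tier over a whole number of simulated seconds."""
    counts = {"S0": 0, "S1": 0, "S2": 0}
    for tick in range(seconds * S0_HZ):
        for tier in tiers_due(tick):
            counts[tier] += 1
    return counts
```

Under these assumptions S0 runs 5 times for every S1 update and 125 times for every S2 update, so each faster tier acts several times on the most recent output of the tier above it.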
Figure said Helix 02 was trained on over 1,000 hours of joint-level retargeted human motion data and demonstrated it with a roughly four-minute end-to-end autonomous task that loaded and unloaded a dishwasher, chaining dozens of loco-manipulation actions in sequence. The system uses palm cameras and fingertip tactile sensors able to detect forces as small as three grams; both were introduced on the newer hardware. A May 2026 demonstration showed two Helix 02 robots tidying a bedroom together, including making a bed, in under two minutes.
Helix is the AI that runs across Figure's humanoid line. Figure 02, introduced in August 2024, has 35 degrees of freedom with five-fingered hands and was the platform used in the original Helix demos. Figure 03, unveiled in October 2025, is designed for the home and adds upgraded cameras, palm cameras, and fingertip tactile sensing that Helix exploits; Figure has said the third-generation robot is meant to learn household tasks by observing humans rather than through explicit programming. Figure has also described scaling Helix alongside its BotQ manufacturing effort, and by September 2025 reported the BotQ line and the AI platform as the two pillars it intended to scale with new capital.
Helix landed during an intense competitive period for humanoid robot foundation models and was widely covered as a credible in-house answer to OpenAI's exit, with outlets noting both the technical ambition and the marketing timing. The dual-system "slow VLM plus fast policy" structure places Helix alongside contemporaneous approaches such as Physical Intelligence's pi-zero, Google DeepMind's Gemini Robotics, and the large behavior model line of work, all part of the broader push toward generalist robot learning policies. As with most humanoid demonstrations, independent analysts have cautioned that polished demo videos do not by themselves establish reliability or autonomy under uncontrolled conditions, and detailed third-party benchmarks of Helix remain limited because the model is proprietary.
Figure's funding trajectory tracked the attention: the company raised over $1 billion in a Series C at a $39 billion post-money valuation announced on September 16, 2025, led by Parkway Venture Capital with participation from Brookfield, NVIDIA, Intel Capital, Qualcomm Ventures, Salesforce, LG Technology Ventures, and others.