| Spirit AI Moz1 | |
|---|---|
| General information | |
| Manufacturer | Spirit AI |
| Country of origin | China |
| Year unveiled | June 2025 |
| Status | In development / early commercial deployment |
| Locomotion | Wheeled (omnidirectional base) |
| Website | spirit-ai.com |
The Moz1 is a wheeled humanoid robot developed by Spirit AI (Chinese: 千寻智能, Qianxun Intelligence), a robotics startup headquartered in Hangzhou, Zhejiang Province, China. Unveiled in June 2025, the Moz1 is described by the company as China's first embodied intelligent robot with high-precision full-body force control. It features 26 degrees of freedom (excluding its dexterous hands), self-developed integrated force-controlled joints with what the company claims is the world's highest power density, and an omnidirectional wheeled base for indoor and industrial navigation. The robot is powered by Spirit AI's proprietary Spirit V1 Vision-Language-Action (VLA) model, which integrates visual perception, natural language understanding, and action generation into a single end-to-end system.[1][2][3]
The Moz1 has been deployed in early commercial settings, including battery pack production lines at CATL's manufacturing facilities and as a barista robot in JD.com retail stores.[4][5] Spirit AI was named the number two startup in Asia on The Information's "50 Most Promising Startups of 2025" list.[6]
Spirit AI was founded on January 16, 2024, by Han Fengtao and Gao Yang, with Zheng Lingyin serving as co-founder and COO.[7][8] The company focuses on developing general-purpose embodied intelligence models and humanoid robots capable of performing physical tasks in real-world environments.
Han Fengtao serves as founder and CEO. He has over a decade of experience in the robotics industry, having previously served as CTO of Rokae Robotics, where he led the development and delivery of nearly 20,000 robot units across more than 20 industries and over 1,000 clients.[7][9]
Gao Yang serves as co-founder and Chief Scientist. He earned his PhD from UC Berkeley, where he studied under Pieter Abbeel, a leading researcher in robot learning. Gao is known for developing the EfficientZero reinforcement learning algorithm (published at NeurIPS 2021) and creating the ViLa and CoPa models for robotic perception and manipulation. He also holds a position as assistant professor at Tsinghua University's Institute for Interdisciplinary Information Sciences.[7][8][10]
Zheng Lingyin serves as co-founder and COO, bringing extensive experience in robotics commercialization, including overseas industrial robot expansion and marketing director roles at Fortune 500 companies.[7]
The company's broader team includes members from UC Berkeley, Carnegie Mellon University, Tsinghua University, and Peking University, as well as professionals from ByteDance, Xiaomi, and Tencent.[7][11]
Spirit AI's stated mission is "Empowering 10% of the world to own their robot within 10 years."[12] The company pursues a full-stack approach to robotics, developing both hardware and software in-house. Co-founder Gao Yang has described this philosophy by saying: "You need to build an Apple, not an Android," reflecting the belief that tight hardware-software integration is essential for achieving reliable embodied AI.[8]
Spirit AI has raised significant venture capital since its founding, achieving unicorn status (valuation exceeding 10 billion yuan) within approximately two years of its establishment.[13][14]
| Round | Date | Amount | Lead investors |
|---|---|---|---|
| Seed | February 2024 | Undisclosed | Shunwei Capital, Oasis Capital |
| Angel | August 2024 | Undisclosed | Honghui Fund |
| Angel+ | November 2024 | Undisclosed | Bairui Capital |
| Pre-A | March 2025 | Undisclosed | Prosperity7 Ventures, China Merchants Venture |
| Pre-A+ | July 2025 | Undisclosed | JD.com, China Internet Investment Fund |
| Series A | February 2026 | Undisclosed | Yunfeng Capital, Chaos Investment, HongShan, Synstellation Capital, TCL Capital |
| Series A+ | March/April 2026 | Undisclosed | Shunwei Capital (Lei Jun), Yunfeng Fund (Jack Ma) |
Notable strategic investors include Huawei's Hubble Investment arm (which acquired a 1.43% equity stake), CATL, JD.com, and funds affiliated with Lei Jun and Jack Ma.[13][14][15][16] By April 2026, Spirit AI's total funding exceeded RMB 3 billion (approximately $420 million), and the company's valuation surpassed $1.4 billion.[16]
Spirit AI developed its first-generation robot, the Moz0, within six months of the company's founding. Completed in July 2024, the Moz0 served as an initial hardware prototype for testing the company's embodied intelligence approach. It validated the feasibility of combining full-body force control with VLA-based learning, laying the groundwork for the more advanced Moz1.[7]
The Moz1 was officially launched in June 2025 as a significant upgrade over the Moz0 prototype. Spirit AI described it as the first embodied intelligent robot in China with whole-body high-precision force control and the first to demonstrate truly multi-task continuous generalization.[2][3] The robot was publicly demonstrated at the 2025 World Robot Conference (WRC) in Beijing, held from August 8 to 12, 2025, where it showcased tasks including folding clothes. According to Xie Junyuan, head of Spirit AI's embodied intelligence division, "While folding clothes may look simple, it actually requires precise, long-range manipulation of soft materials, a capability with broad application prospects."[17]
In December 2025, Spirit AI deployed a variant of the Moz platform called "Xiao Mo" (小莫) at CATL's Zhongzhou battery manufacturing base in Luoyang, Henan Province. This deployment marked the world's first large-scale use of humanoid robots on a power battery pack production line.[4][18]
The Moz1 uses a wheeled humanoid form factor, combining a human-like upper body with an omnidirectional wheeled base rather than bipedal legs. This design prioritizes stability, energy efficiency, and practical indoor navigation over walking capability.[2][3]
| Category | Specification | Details |
|---|---|---|
| Degrees of freedom | Total body | 26 DOF (excluding dexterous hands) |
| Degrees of freedom | Per arm | 7 DOF (bionic arm configuration) |
| Arm control | Control mode | Force/torque-position hybrid control |
| Arm reach | Per arm | ~1 meter |
| Arm repeatability | Positioning | ±0.5 mm |
| Actuators | Joint type | Integrated force-controlled joints |
| Actuators | Power density | Claimed highest in industry (15% higher than Tesla Optimus) |
| Mobility | Base type | Four-wheel omnidirectional drive |
| Mobility | Terrain adaptability | Multiple complex indoor terrains |
| Perception (head) | Cameras | Dual RGB + RGB-D depth cameras |
| Perception (head) | LiDAR | Optional 3D LiDAR |
| Perception (head) | Audio | Six-microphone circular array |
| Perception (base) | Navigation sensors | Ultrasonic sensors + 360-degree LiDAR |
| Safety | Features | Whole-body STO (Safe Torque Off) safety, high-precision collision detection |
| Software | AI model | Spirit V1 VLA (Vision-Language-Action) |
| Software | Type | Closed source (proprietary) |
A defining feature of the Moz1 is its self-developed integrated force-controlled joints. Unlike many humanoid robots that rely on position-controlled servos, the Moz1's joints provide continuous force and torque sensing and control at every actuated joint across the body. This enables compliant interaction with the environment, where the robot can dynamically adjust the force it applies based on what it feels. The joint design uses compact harmonic precision gears and achieves a power density that Spirit AI claims is 15% higher than that of Tesla's Optimus humanoid robot.[2][3]
The whole-body force control (WBC) system combines kinematics, dynamics, and force/torque control algorithms that run at high frequency, allowing the robot to perform delicate manipulation tasks such as handling flexible materials and inserting connectors into sockets.[2][19]
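The details of Spirit AI's WBC stack are proprietary, but the basic idea of force-tracking compliance can be illustrated with a much simpler analogue: a one-degree-of-freedom admittance controller pressing an end effector against a stiff surface. All gains, rates, and the environment model below are invented for the example.

```python
def simulate_contact(f_des=5.0, k_env=1000.0, m=1.0, d=50.0, dt=0.001, steps=5000):
    """1-DOF admittance control sketch: the force-tracking error drives a
    virtual mass-damper, so the end effector yields compliantly until the
    measured contact force matches the desired force."""
    x, v = -0.01, 0.0                    # start 1 cm away from the surface at x = 0
    for _ in range(steps):
        f_meas = k_env * max(x, 0.0)     # surface pushes back once penetrated
        f_err = f_des - f_meas           # contact-force tracking error
        a = (f_err - d * v) / m          # virtual admittance dynamics
        v += a * dt                      # integrate velocity
        x += v * dt                      # integrate position
    return k_env * max(x, 0.0)           # final measured contact force
```

At steady state the force error vanishes, so the measured force converges to the 5 N setpoint regardless of the exact surface stiffness, which is the property that makes force control suitable for handling flexible materials and inserting connectors.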
Each of the Moz1's two arms has 7 degrees of freedom arranged in a bionic configuration with wrist offsets, classified as a non-SRS 7-DOF manipulator (one whose offset wrist breaks the standard spherical-revolute-spherical structure). The arms use force/torque-position hybrid control, meaning they can switch between precise positional control and compliant force control depending on the task. The arm design allows anthropomorphic fine-grained operations, and each arm can reach approximately 1 meter with a positioning repeatability of ±0.5 millimeters.[2][20]
Researchers at Spirit AI published an academic paper describing an analytical inverse kinematic solution for the Moz1 arm that provides all 16 possible solutions per pose and avoids algorithmic singularities within the workspace.[20]
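The 16-solution analytical IK for the 7-DOF arm is beyond a short sketch, but the same phenomenon of multiple closed-form solutions appears in the simplest possible case: a planar 2-link arm, which has exactly two analytical solutions (elbow-up and elbow-down) for any reachable target. The link lengths below are arbitrary.

```python
import math

def ik_2link(x, y, l1=0.5, l2=0.5):
    """All analytical IK solutions for a planar 2-link arm reaching (x, y):
    the elbow-up and elbow-down configurations."""
    r2 = x * x + y * y
    c2 = (r2 - l1 * l1 - l2 * l2) / (2 * l1 * l2)   # law of cosines for the elbow
    if abs(c2) > 1:
        return []                                    # target out of reach
    solutions = []
    for s2 in (math.sqrt(1 - c2 * c2), -math.sqrt(1 - c2 * c2)):
        q2 = math.atan2(s2, c2)                      # elbow angle (two branches)
        q1 = math.atan2(y, x) - math.atan2(l2 * s2, l1 + l2 * c2)  # shoulder angle
        solutions.append((q1, q2))
    return solutions

def fk_2link(q1, q2, l1=0.5, l2=0.5):
    """Forward kinematics, used to verify the IK solutions."""
    x = l1 * math.cos(q1) + l2 * math.cos(q1 + q2)
    y = l1 * math.sin(q1) + l2 * math.sin(q1 + q2)
    return x, y
```

Enumerating every closed-form branch, as this toy solver does for 2 solutions and the Moz1 paper reports for 16, is what lets a planner pick the configuration that best avoids joint limits and obstacles without iterative numerical solving.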
The Moz1's head houses dual RGB cameras and RGB-D depth cameras for visual perception, with an optional 3D LiDAR for enhanced spatial mapping. A six-microphone circular array enables voice interaction and natural language command processing. The wheeled base includes its own sensor suite with ultrasonic sensors and a 360-degree LiDAR for autonomous navigation and obstacle avoidance.[2]
Rather than bipedal legs, the Moz1 uses a high-dynamic omnidirectional wheeled base with a four-wheel drive system. This design choice provides several advantages for indoor and industrial environments: stable mobility across flat surfaces, efficient energy use compared to bipedal walking, and the ability to perform flexible steering and omnidirectional movement. The base adapts to various complex terrains commonly found in factories, warehouses, and commercial spaces.[2]
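The source does not specify the Moz1's wheel type; mecanum wheels are one common way to build a four-wheel omnidirectional drive. A minimal sketch of the standard mecanum inverse kinematics, with made-up geometry parameters, shows how a desired body motion maps to individual wheel speeds:

```python
def mecanum_wheel_speeds(vx, vy, wz, lx=0.25, ly=0.20, r=0.05):
    """Inverse kinematics for a standard mecanum layout. Maps a body twist
    (vx forward, vy left, wz counter-clockwise, SI units) to wheel angular
    speeds in rad/s, ordered (front-left, front-right, rear-left, rear-right).
    lx/ly are half the wheelbase/track, r is the wheel radius (all assumed)."""
    L = lx + ly
    return (
        (vx - vy - L * wz) / r,   # front-left
        (vx + vy + L * wz) / r,   # front-right
        (vx + vy - L * wz) / r,   # rear-left
        (vx - vy + L * wz) / r,   # rear-right
    )
```

Because the mapping is linear and full-rank in (vx, vy, wz), the base can translate sideways or rotate in place without steering any wheel, which is the "omnidirectional" property the Moz1's base exploits for indoor navigation.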
The Moz1 is powered by Spirit AI's proprietary Spirit V1 Vision-Language-Action model, which serves as the robot's "embodied brain." The model integrates visual perception, language understanding, and action generation into a unified end-to-end architecture, reducing errors that occur in traditional modular robotics systems where perception, planning, and control are handled by separate modules.[1][21]
Spirit V1 uses a unified VLA architecture where a single neural network processes camera images, interprets natural language instructions, and generates motor commands. This end-to-end approach allows the robot to learn complex behaviors directly from demonstrations and instructions without requiring explicit programming for each task. The model includes a proprietary temporal modulation system that enables fluid motion control with variable speeds during task execution, preventing the jerky movements common in many robotic systems.[8][21]
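Spirit V1 itself is closed source, so the following is only a structural sketch of the end-to-end loop described above, with every name hypothetical: one policy call maps raw images plus a language instruction directly to a joint-command vector, with no separate perception, planning, or control modules.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Observation:
    images: List[bytes]        # raw camera frames (stub payloads here)
    instruction: str           # natural language command

def vla_policy(obs: Observation) -> List[float]:
    """Stand-in for the VLA network. A real model would run a single
    forward pass over all inputs; this stub just returns a zero command
    with 26 entries, matching the Moz1's body DOF from the spec table."""
    return [0.0] * 26

def control_loop(obs: Observation, steps: int = 3) -> List[List[float]]:
    """End-to-end control loop: perception, language, and action handled
    by one model call per control tick."""
    trajectory = []
    for _ in range(steps):
        action = vla_policy(obs)       # single forward pass per tick
        trajectory.append(action)      # would be sent to the joint drivers
    return trajectory
```

The design point this illustrates is architectural: because the instruction and the images enter the same network, errors cannot accumulate across hand-designed module boundaries, at the cost of needing large-scale demonstration data to train the single model.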
One of Spirit AI's key innovations is its approach to training data. Rather than relying on highly curated, scripted demonstrations (sometimes called "clean data"), Spirit V1 is largely trained on open-ended, goal-driven diverse data. In this paradigm, human operators pursue high-level objectives without predefined action scripts, allowing the training data to naturally capture a continuous flow of skills including task transitions, recovery behaviors, and interactions across varied objects and environments.[21][22]
Spirit AI trains primarily on internet data (over 95% of training data), supplemented with teleoperation data for fine motor skill refinement. By April 2026, the company had accumulated over 200,000 hours of multimodal interaction data spanning internet video, teleoperation, and wearable capture systems, with a roadmap to exceed 1 million hours by the end of 2026.[8][16][22]
Co-founder Gao Yang has stated that "dirty data is the key to scaling VLA models," arguing that real-world variability is essential for developing robots with general reasoning ability.[22]
In January 2026, Spirit AI released Spirit v1.5, an upgraded version of the foundation model. Spirit v1.5 achieved the number one ranking on the RoboChallenge benchmark, a standardized real-robot evaluation platform jointly initiated by organizations including Dexmal and Hugging Face. The model scored 66.09 overall with a task success rate of 50.33%, making it the only model to exceed 50% success on the benchmark. It outperformed pi0.5, the model developed by U.S.-based Physical Intelligence, which was previously the top-ranked system.[21][23][24]
RoboChallenge evaluates embodied AI systems through 30 real-world tasks including object placement, target recognition, and tool use, assessing capabilities across 3D localization, occlusion handling, temporal reasoning, long-horizon execution, and cross-robot generalization.[21][24]
Spirit AI open-sourced Spirit v1.5, releasing the model weights, core evaluation code, and documentation on GitHub and Hugging Face, enabling the global research community to independently verify the benchmark results.[21][24]
The most significant commercial deployment of the Moz platform occurred in December 2025, when Spirit AI's Xiao Mo robot began operations at CATL's Zhongzhou battery manufacturing base in Luoyang, Henan Province. This marked the world's first deployment of humanoid robots at scale on a power battery pack production line.[4][18]
The robots handle two critical processes on the line: inserting battery connectors and handling flexible wire harnesses. Both operations are traditionally difficult for conventional robots because of the unpredictability of flexible materials and the variability in connector positions. The Xiao Mo robot uses its VLA model to autonomously handle uncertainties such as material position deviations and connection point variations, dynamically adjusting its posture in real time.[4][18]
Performance results at the CATL facility include a connector insertion success rate exceeding 99%, cycle times matching those of skilled human workers, a tripling of daily workload compared to manual labor, and the ability to detect and report anomalies during operational breaks.[4][18]
In March 2026, Spirit AI formalized a strategic partnership with JD.com, integrating its Moz robot into JD MALL's smart retail ecosystem. The robots serve as automated baristas, preparing coffee with stable real-world operation. Beyond the immediate commercial function, the deployment serves as a data collection mechanism: the barista robots generate multimodal interaction data that feeds back into Spirit AI's model training pipeline, creating a closed-loop system linking live data collection with continuous model iteration.[5][16]
Spirit AI targets both industrial and service applications for the Moz1 platform. In industrial settings, the robot addresses tasks in battery manufacturing, automotive assembly, and logistics. In service contexts, the company envisions deployment in hospitality, retail, healthcare, and household environments. The company projects revenue exceeding 100 million yuan and shipments of several hundred units during 2026.[7][16]
The Moz1 enters a rapidly expanding market for humanoid and service robots, particularly in China, where government policies and corporate investment have accelerated development in embodied AI.
Spirit AI competes with several well-funded Chinese humanoid robot companies, including UBTECH (maker of the Walker S series), Unitree Robotics (maker of the H1 and G1), Agibot (backed by CATL), Fourier Intelligence (maker of the GR-2), and XPeng Robotics (maker of Iron). Spirit AI differentiates itself through its focus on full-body force control, its VLA-first approach to robot intelligence, and its "dirty data" training methodology.[22]
Globally, the Moz1 competes with platforms such as Tesla Optimus, Figure 02, Apptronik Apollo, and Agility Robotics Digit. Spirit AI's Spirit v1.5 model has directly outperformed the pi0.5 model from Physical Intelligence on the RoboChallenge benchmark, a result that attracted significant attention in the embodied AI community.[21][24]
In October 2025, Spirit AI announced a collaboration with Shanghai HiSilicon (a subsidiary of Huawei) on computing architecture integration, specifically involving the Ascend AI chip platform and the Hi3519DV500 processor. Industry analysts have noted that this collaboration could create a complete technology stack from foundational computing to application layers, comparable to Tesla's "Optimus + Dojo" approach.[7][15]