| Parameter | Value |
|---|---|
| Developer | X Square Robot |
| Type | Wheeled humanoid robot |
| Country of origin | China |
| Unveiled | 2025 World Robot Conference, Beijing (August 2025) |
| Height | 172 cm (5 ft 8 in) |
| Weight | 95 kg (209 lb) |
| Width | 540 mm |
| Depth | 756 mm |
| Degrees of freedom | 62 (total) |
| Hand DOF | 20 per hand |
| Arm DOF | 7 per arm |
| Arm reach | 756 mm (75.6 cm) |
| Payload | 6 kg per arm |
| Max speed | 1 m/s (3.6 km/h) |
| Battery life | Up to 4 hours (240 min) |
| Locomotion | 6-DOF wheeled mobile base |
| AI system | WALL-A VLA model |
| Sensors | LiDAR, stereo cameras, IMU, ultrasonics |
| Actuators | Tendon-driven (ArtiXon Hand system) |
| Price | ~$80,000 USD |
| Website | x2robot.com |
QUANTA X2 is a wheeled humanoid robot developed by X Square Robot (formally X Square Robot Technology (Shenzhen) Co., Ltd.), a Chinese embodied AI startup founded in December 2023. The robot was unveiled at the 2025 World Robot Conference in Beijing and represents the company's flagship humanoid platform, designed for general-purpose service tasks across homes, public venues, logistics hubs, and research environments.[1][2]
Standing 172 cm tall and weighing 95 kg, the QUANTA X2 features 62 degrees of freedom, 7-DOF arms, and 20-DOF dexterous hands with integrated tactile sensing. The robot is powered by X Square Robot's proprietary WALL-A embodied intelligence model, a vision-language-action (VLA) system that unifies perception, reasoning, and motor control into a single end-to-end framework. With approximately $423 million in total funding from investors including Alibaba, ByteDance, and Meituan, X Square Robot has become one of the most heavily funded embodied AI startups in China.[3][4]
X Square Robot Technology (Shenzhen) Co., Ltd., known internationally as X Square Robot, was founded in December 2023 in Shenzhen, China. The Chinese name of the company, "Zibianliang" (literally "independent variable"), reflects its focus on developing a universal embodied intelligence foundation model that serves as the core "variable" enabling general-purpose robotic capabilities. The company established a Beijing branch in March 2024 to support its growing research operations.[5]
X Square Robot positions itself as one of the first companies in China to pursue a fully end-to-end unified large-model pathway to general embodied intelligence. Rather than relying on pre-programmed scripts or narrow task-specific systems, the company develops foundation models that allow robots to autonomously perceive, reason, plan, and execute complex tasks in unstructured real-world environments.[6]
Wang Qian, the founder and CEO, earned his bachelor's and master's degrees at Tsinghua University, one of China's most prestigious engineering institutions, and completed his doctorate at the University of Southern California. He is recognized as one of the earliest researchers worldwide to propose attention mechanisms in neural networks. During his doctoral work, Wang participated in several robotics learning research projects at a leading U.S. robotics laboratory, gaining experience that spanned nearly all fields related to robot manipulation and home service robotics.[7]
After academia, Wang briefly ran a quantitative hedge fund in the United States. However, he found the work unfulfilling compared to his passion for robotics. In 2023, he shut down the fund, returned to China, and founded X Square Robot in Shenzhen. In an interview with KrASIA, Wang stated: "Starting a company takes real resolve. If you already have a backup plan on day one, your mindset is flawed."[8]
Wang Hao, co-founder and CTO, has a doctoral background in computational physics from Peking University. He previously served as algorithm lead for the Fengshenbang large-model team at the Guangdong-Hong Kong-Macao Greater Bay Area Institute, where he released China's first open-source multimodal model, "Taiyi," and contributed to the large language models "Randeng" and "Jiang Ziya."[5]
X Square Robot has raised approximately $423 million across nine financing rounds since its founding in December 2023, making it one of the best-capitalized embodied AI startups globally. It is the only Chinese embodied AI company backed simultaneously by all three internet giants: Alibaba, ByteDance, and Meituan.[3][4]
| Round | Date | Amount | Lead Investors | Notable Participants |
|---|---|---|---|---|
| Angel / Angel+ | April 2024 | Tens of millions RMB | Jiuhe Chuangtou | Early-stage investors |
| Pre-A / Pre-A+ | Mid-2024 | Hundreds of millions RMB | Delian Capital, Cornerstone Capital | Qifu Capital, Nanshan Zhancheng Xintou |
| Series A+ | September 2025 | ~RMB 1 billion ($140M) | Alibaba Cloud, CAS Investment | CDB Capital, HongShan (formerly Sequoia Capital China), INCE Capital, Meituan, Legend Star, Legend Capital |
| Series A++ | January 2026 | ~RMB 1 billion ($143M) | ByteDance, HongShan Capital Group | Beijing Information Industry Development Fund, Shenzhen Capital Group |
| Additional round | February 2026 | Several hundred million RMB | SAIC Capital, CICC/SAIC Motor fund | Meituan Longzhu, HongShan, state-backed funds |
The September 2025 Series A+ round was particularly significant because it marked Alibaba Cloud's first investment in an embodied intelligence company, signaling the tech giant's strategic entry into the humanoid robotics sector.[9] The January 2026 Series A++ round, co-led by ByteDance and HongShan (formerly Sequoia Capital China), was reported as one of the largest embodied AI financing deals at the start of 2026.[10] The February 2026 round, led by SAIC Capital, introduced automotive-industry investors, with the company noting that this participation would "help speed the adoption of embodied AI in vehicle manufacturing."[11]
The QUANTA X2 is a wheeled humanoid robot standing 172 cm (5 ft 8 in) tall and weighing 95 kg (209 lb) with battery. The body measures 540 mm wide and 756 mm deep. The robot uses a composite shell over an alloy frame, providing structural rigidity while keeping weight manageable for indoor mobility. Unlike bipedal humanoids such as Boston Dynamics' Atlas or Unitree's H1, the QUANTA X2 uses a 6-DOF wheeled mobile base optimized for smooth indoor navigation rather than walking.[1][12]
The wheeled design reflects CEO Wang Qian's philosophy that humanoid robots should prioritize generalization and practical task completion over locomotion showmanship. In a 2025 interview, Wang dismissed factory robotics as "just a PR stunt," arguing that putting humanoids in factories to do repetitive tasks misses the real opportunity. He contends that meaningful embodied AI development requires "complexity, randomness, and open-ended interaction" found in service environments rather than structured industrial settings.[8]
The QUANTA X2 has 62 total degrees of freedom distributed across its body, providing flexible and lifelike motion capabilities. The breakdown includes 7 DOF per arm (14 total), 20 DOF per hand (40 total for both hands), and additional DOF for the torso, head, and mobile base. This high DOF count positions the QUANTA X2 among the more articulated wheeled humanoids available, particularly in hand dexterity.[1]
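The arithmetic behind this breakdown can be checked directly. The short sketch below uses only the figures quoted above; the variable names are illustrative, and because the cited sources do not say how the remaining joints are split among torso, head, and base, they are kept as a single remainder.

```python
# DOF budget for the QUANTA X2, using only figures quoted in the article.
TOTAL_DOF = 62
ARM_DOF = 7      # per arm
HAND_DOF = 20    # per hand
SIDES = 2        # left and right

arms_total = ARM_DOF * SIDES
hands_total = HAND_DOF * SIDES
upper_limbs = arms_total + hands_total

# Whatever is left is shared by torso, head, and mobile base (split not stated).
remaining = TOTAL_DOF - upper_limbs

print(f"arms: {arms_total}, hands: {hands_total}, remaining: {remaining}")
print(f"hands account for {hands_total / TOTAL_DOF:.0%} of all DOF")
```

The arms and hands together account for 54 of the 62 DOF, leaving 8 for the torso, head, and base, with the hands alone contributing nearly two-thirds of the total. If the 6-DOF base counts toward the total, only 2 would remain for torso and head, though the cited sources do not confirm how the total is tallied.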
The QUANTA X2's hands are its most technically distinctive feature. The robot uses the ArtiXon Hand, an in-house bionic hand system with five fingers per hand and 20 degrees of freedom per hand. The ArtiXon Hand uses tendon-driven motors paired with miniature gear reducers for precision control, enabling both delicate and strong grasping.[12][13]
Key hand specifications include:
| Parameter | Value |
|---|---|
| DOF per hand | 20 |
| Fingers per hand | 5 |
| Drive mechanism | Tendon-driven motors |
| Gear technology | Miniature gear reducers |
| Precision | Sub-millimeter repeatability (~0.001 inches / 0.025 mm) |
| Tactile sensing | Integrated pressure sensors |
| Grip modes | Delicate grasp, power grasp, tool operation |
The tactile sensors integrated into each fingertip allow the hands to perceive subtle pressure changes during manipulation. This enables tasks requiring finesse, such as grasping delicate items, threading needles, twisting small objects, and handling items of varying shapes and textures. The hands also support a modular clamp system that allows attachment of brushes or mop heads for 360-degree surface cleaning.[13][14]
Each arm features 7 degrees of freedom with a reach of approximately 756 mm (75.6 cm) and supports a 6 kg payload at the end effector. The arms use a tendon-drive actuation system consistent with the hand design, providing smooth motion with sub-millimeter repeatability for delicate manipulation tasks. The 6 kg payload capacity per arm is sufficient for typical household and service objects such as plates, tools, cleaning supplies, and packaged goods.[1][12]
The 6-DOF wheeled mobile base provides omnidirectional movement with a maximum speed of 1 m/s (3.6 km/h). The base can navigate slopes up to 15 degrees and step heights up to 40 mm, allowing operation across typical indoor environments with minor obstacles and thresholds. The wheeled design offers advantages over bipedal locomotion in terms of stability, energy efficiency, and predictable motion in service settings.[1]
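To make these limits concrete, the following sketch checks a route segment against the published slope, step, and speed figures. It is an illustration only: the function names and the example ramp and kerb dimensions are hypothetical, and it ignores factors such as surface friction, payload, and acceleration that a real navigation stack would consider.

```python
import math

MAX_SPEED_M_S = 1.0      # stated maximum speed
MAX_SLOPE_DEG = 15.0     # stated maximum slope
MAX_STEP_MM = 40.0       # stated maximum step height


def slope_deg(rise_m: float, run_m: float) -> float:
    """Incline angle in degrees for a ramp with the given rise and run."""
    return math.degrees(math.atan2(rise_m, run_m))


def is_traversable(rise_m: float, run_m: float, step_mm: float = 0.0) -> bool:
    """Hypothetical check of a route segment against the published limits."""
    return slope_deg(rise_m, run_m) <= MAX_SLOPE_DEG and step_mm <= MAX_STEP_MM


def min_travel_time_s(distance_m: float) -> float:
    """Lower bound on travel time, ignoring acceleration and turning."""
    return distance_m / MAX_SPEED_M_S


# A 1:12 ramp (under 5 degrees) with a 20 mm door threshold is within limits.
print(is_traversable(rise_m=0.5, run_m=6.0, step_mm=20))
# A 60 mm kerb exceeds the 40 mm step limit.
print(is_traversable(rise_m=0.0, run_m=1.0, step_mm=60))
# Crossing a 30 m corridor takes at least 30 seconds at top speed.
print(min_travel_time_s(30.0))
```

For reference, a 15-degree slope corresponds to a grade of roughly 27% (tan 15°), which is far steeper than typical indoor accessibility ramps.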
The QUANTA X2 employs a multi-sensor fusion stack for environmental perception and navigation:
| Sensor Type | Function |
|---|---|
| LiDAR | 3D mapping and obstacle detection |
| Stereo cameras | Visual perception and object recognition |
| IMU (Inertial Measurement Unit) | Orientation and motion tracking |
| Ultrasonic sensors | Close-range obstacle detection |
| Tactile sensors (hands) | Pressure sensing for manipulation |
This sensor suite feeds data into the WALL-A model for real-time environmental understanding, path planning, and task execution.[1][12]
The robot operates for up to 4 hours (240 minutes) on a single battery charge. Some sources cite a shorter runtime of approximately 2 hours under high-intensity manipulation workloads, suggesting that battery life varies significantly depending on the mix of locomotion, arm movement, and hand manipulation during operation.[1][12]
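One rough way to reason about the gap between the two figures is to blend the drain rates by duty cycle. The sketch below assumes, purely for illustration, that battery drain scales linearly with the share of time spent on high-intensity manipulation. Neither this model nor the function name comes from X Square Robot.

```python
LIGHT_RUNTIME_H = 4.0   # stated upper bound on runtime
HEAVY_RUNTIME_H = 2.0   # approximate runtime cited for high-intensity manipulation


def estimated_runtime_h(heavy_fraction: float) -> float:
    """Runtime estimate when `heavy_fraction` of operating time is high-intensity.

    Assumes (hypothetically) that drain rates blend linearly with duty cycle.
    """
    if not 0.0 <= heavy_fraction <= 1.0:
        raise ValueError("heavy_fraction must be between 0 and 1")
    drain_per_hour = (heavy_fraction / HEAVY_RUNTIME_H
                      + (1.0 - heavy_fraction) / LIGHT_RUNTIME_H)
    return 1.0 / drain_per_hour


for fraction in (0.0, 0.25, 0.5, 1.0):
    print(f"{fraction:.0%} heavy -> {estimated_runtime_h(fraction):.2f} h")
```

Under this assumption, a shift spent one-quarter on demanding manipulation would last about 3.2 hours, and an even split would last about 2.7 hours.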
The QUANTA X2 is powered by X Square Robot's proprietary WALL-A, a vision-language-action (VLA) embodied intelligence model that serves as the robot's "brain." WALL-A represents one of the first end-to-end embodied intelligence large models developed in China, integrating perception, planning, and control into a unified system that functions as what the company describes as a "cerebrum-cerebellum system" for robots.[6][15]
WALL-A integrates VLA models with world models and uses causal inference to understand environmental feedback. The system processes multimodal inputs (vision, language, and sensor data) and directly outputs motor commands (speed, position, and torque) through a single end-to-end model. This approach eliminates the need for separate perception, planning, and control modules that traditionally require hand-coded interfaces between stages.[6][15]
The model incorporates an "embodied chain-of-thought reasoning framework" that enables robots to plan actions, execute them, and self-correct in a closed loop. This framework allows the robot to process multimodal inputs while generating outputs that create what X Square describes as "a complete closed loop of autonomous decision-making, execution, exploration, and reflection."[15]
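The general shape of such a closed loop can be illustrated with a toy example. The sketch below is not WALL-A or any X Square Robot API: `Observation`, `Action`, `toy_policy`, and `run_closed_loop` are hypothetical names, and a simple proportional rule stands in for the learned model. It only shows the pattern of mapping an observation directly to a command, executing it, re-observing, and checking whether the goal has been met.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Observation:
    instruction: str      # language input
    gripper_pos: float    # stand-in for proprioceptive and sensor data
    target_pos: float     # stand-in for what vision would report


@dataclass
class Action:
    velocity: float       # stand-in for speed/position/torque commands


Policy = Callable[[Observation], Action]


def toy_policy(obs: Observation) -> Action:
    """Placeholder for an end-to-end model: observation in, command out."""
    error = obs.target_pos - obs.gripper_pos
    return Action(velocity=max(-0.1, min(0.1, 0.5 * error)))


def run_closed_loop(policy: Policy, obs: Observation,
                    tolerance: float = 0.01, max_steps: int = 100) -> List[float]:
    """Decide, execute, re-observe, and stop once the goal is reached."""
    trajectory = [obs.gripper_pos]
    for _ in range(max_steps):
        if abs(obs.target_pos - obs.gripper_pos) <= tolerance:  # reflect: goal met?
            break
        action = policy(obs)                                     # decide
        obs = Observation(obs.instruction,                       # execute, re-observe
                          obs.gripper_pos + action.velocity,
                          obs.target_pos)
        trajectory.append(obs.gripper_pos)
    return trajectory


path = run_closed_loop(toy_policy, Observation("move gripper to the cup", 0.0, 0.5))
print(f"steps taken: {len(path) - 1}, final position: {path[-1]:.3f}")
```

Because each command is computed from the latest observation rather than from a fixed plan, errors introduced during execution are corrected on the next iteration, which is the property the closed-loop description emphasizes.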
WALL-A is trained through large-scale, real-robot reinforcement learning, allowing the foundation model to learn through direct physical interaction with environments. This data-driven approach enables the robot to autonomously refine its skills over time. The company also uses exoskeleton-based teleoperation systems for data acquisition, collecting demonstrations that inform the model's understanding of manipulation tasks.[6][9]
CEO Wang Qian has compared the current state of embodied AI models to GPT-2 in the evolution of large language models, projecting that GPT-3-level capabilities in embodied AI could emerge within approximately one year, with real commercial applications following one to two years after that.[8]
WALL-A's technical strategy is built around two principles of unification:
By using world models to predict outcomes and causal inference to interpret feedback, WALL-A enhances zero-shot generalization, allowing the robot to handle novel tasks and environments it was never specifically trained on. During demonstrations, the QUANTA X1 (the company's bimanual wheeled robot, also powered by WALL-A) completed autonomous food delivery in open outdoor environments, handling challenges such as strong winds, deformed packaging, and visual occlusions through zero-shot generalization.[4][15]
In September 2025, alongside its Series A+ funding announcement, X Square Robot released WALL-OSS, an open-source version of its embodied foundation model family designed to democratize embodied intelligence and accelerate community-driven innovation.[14][16]
WALL-OSS is an end-to-end embodied foundation model that leverages large-scale multimodal pretraining to achieve embodiment-aware vision-language understanding, strong language-action association, and robust manipulation capability. The model introduces several key innovations:[16][17]
The model's architecture builds on a Qwen2.5 VLMoE vision-language model foundation and integrates proprioceptive information for embodied understanding. Two model variants are available: WALL-OSS-FLOW (using flow-matching-based action prediction) and WALL-OSS-FAST (an alternative action branch approach).[17]
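For readers unfamiliar with flow matching, the idea at inference time is to start from random noise and integrate a learned velocity field until the sample becomes an action. The toy sketch below is not the WALL-OSS implementation: the function names are hypothetical, and the velocity field is computed from a known target rather than predicted by a network, so that the example stays self-contained. It uses the standard straight-line conditional path, whose velocity at time t is (target - x) / (1 - t).

```python
import random
from typing import List, Sequence


def toy_velocity(x: Sequence[float], t: float, target: Sequence[float]) -> List[float]:
    """Velocity of the straight-line path from the current sample to `target`.

    A trained network would predict this from images, language, and
    proprioception; here the target is supplied directly (an assumption).
    """
    return [(g - xi) / (1.0 - t) for xi, g in zip(x, target)]


def sample_action(target: Sequence[float], steps: int = 10, seed: int = 0) -> List[float]:
    """Integrate the velocity field from noise (t=0) toward t=1 with Euler steps."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]   # start from Gaussian noise
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt                               # t stays below 1, so no division by zero
        v = toy_velocity(x, t, target)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


action = sample_action(target=[0.2, -0.1, 0.05])
print([round(a, 6) for a in action])
```

In a real model the target is unknown at inference time, and the network's predicted velocity plays the role of `toy_velocity`. The number of integration steps then trades off inference speed against accuracy.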
WALL-OSS and its training code are available on GitHub (under the X-Square-Robot organization) and Hugging Face. The model is integrated into Hugging Face's LeRobot ecosystem, providing end-to-end pipelines for data preparation, model configuration, training, and evaluation. Unlike task-specific systems that fail outside narrow scenarios, WALL-OSS is designed to generalize across multiple robot types, making it accessible to third-party developers and research institutions.[16][17]
X Square Robot has developed two primary robot platforms, both powered by the WALL model family:
| Robot | Type | Description |
|---|---|---|
| QUANTA X1 | Wheeled bimanual robot | Compact platform with dual 7-DOF arms; demonstrated autonomous food delivery and logistics sorting |
| QUANTA X2 | Wheeled humanoid robot | Full-size humanoid with 62 DOF, 20-DOF dexterous hands, and tactile sensing for general-purpose service |
The company also develops in-house components including robotic arms, joint modules, and controllers, supporting mass production and commercial scaling. The six-month development cycle for the QUANTA X2 encompassed the robot body, high-degree-of-freedom dexterous hands, and exoskeleton teleoperation and data acquisition equipment.[4][9]
The QUANTA X2 is designed for household tasks including surface cleaning (using its modular mop/brush attachment system), item organization, debris collection, and general tidying. The robot's tactile hands enable it to handle fragile household items, fold garments, and load/unload dishwashers.[13][14]
In March 2026, X Square Robot partnered with 58.com, one of China's largest household service platforms, to launch China's first home-cleaning robot service in Shenzhen. Under this model, customers book cleaning through the 58.com app and receive a dual team: a professional cleaner and an X Square robot. The human handles complex judgment-based tasks while the robot independently performs structured work. The 58.com partnership provides access to a network spanning over 200 cities and tens of millions of households, offering large-scale real-world testing opportunities.[18]
X Square Robot has deployed robots in retirement homes and elder care facilities, where robots assist with routine monitoring, item retrieval, and basic care support tasks. This sector represents one of the company's earliest revenue-generating application areas.[4][9]
The company has sold robots to schools and research institutions, providing platforms for robotics education and embodied AI research. The open-source WALL-OSS model further supports the academic community by offering accessible tools for robot learning experimentation.[4][16]
Hotels represent another early deployment sector, where robots can perform room tidying, item delivery, and guest assistance functions. X Square Robot was already generating revenue from hotel customers as of September 2025.[14]
The QUANTA X2's manipulation capabilities and WALL-A's zero-shot generalization make it suitable for logistics sorting, assembly assistance, and warehousing operations. The February 2026 investment round by SAIC Capital specifically targeted accelerating adoption in vehicle manufacturing contexts. However, CEO Wang Qian has publicly expressed skepticism about factory robotics as a primary application, arguing that structured factory environments provide insufficient training signal for developing truly general embodied intelligence.[8][11]
| Category | Parameter | Value |
|---|---|---|
| Physical | Height | 172 cm (5 ft 8 in) |
| Physical | Weight | 95 kg (209 lb) with battery |
| Physical | Width | 540 mm |
| Physical | Depth | 756 mm |
| Physical | Frame | Alloy frame with composite shell |
| Mobility | Total DOF | 62 |
| Mobility | Locomotion type | 6-DOF wheeled mobile base |
| Mobility | Max speed | 1 m/s (3.6 km/h) |
| Mobility | Max slope | 15 degrees |
| Mobility | Max step height | 40 mm |
| Arms | DOF per arm | 7 |
| Arms | Arm reach | 756 mm (75.6 cm) |
| Arms | Payload per arm | 6 kg |
| Arms | Actuation | Tendon-drive |
| Hands | DOF per hand | 20 |
| Hands | Fingers per hand | 5 |
| Hands | Hand system | ArtiXon (in-house bionic hand) |
| Hands | Drive mechanism | Tendon-driven motors with miniature gear reducers |
| Hands | Precision | Sub-millimeter repeatability |
| Hands | Tactile sensing | Integrated pressure sensors |
| Sensors | Vision | Stereo cameras |
| Sensors | Range sensing | LiDAR |
| Sensors | Proximity | Ultrasonic sensors |
| Sensors | Orientation | IMU |
| Power | Battery life | Up to 240 min (4 hours) |
| Software | AI model | WALL-A VLA |
| Software | Open-source model | WALL-OSS (available on GitHub, Hugging Face) |
| Pricing | Estimated price | ~$80,000 USD |
The QUANTA X2 operates within a rapidly growing Chinese humanoid robotics market that dominated global shipments in 2025, accounting for roughly 90% of total humanoid robot sales worldwide. China introduced 21 new humanoid models in 2025 alone, up from just three in 2022.[19]
| Company | Robot | Type | Key Differentiator | Approx. Price |
|---|---|---|---|---|
| X Square Robot | QUANTA X2 | Wheeled humanoid | WALL-A VLA model, 20-DOF tactile hands | ~$80,000 |
| Unitree Robotics | G1 / H1 | Bipedal humanoid | High volume (5,500 units shipped in 2025), low cost | $16,000+ |
| Agibot | A2 | Bipedal humanoid | 5,168 units shipped in 2025, SAIC backing | ~$14,500+ |
| Figure AI | Figure 02 | Bipedal humanoid | Helix VLA, BMW factory deployment | ~$100,000 |
| Tesla | Optimus | Bipedal humanoid | Massive manufacturing scale potential | $20,000-$30,000 (target) |
| Boston Dynamics | Atlas (electric) | Bipedal humanoid | 56 DOF, 50 kg lift capacity | Not disclosed |
X Square Robot differentiates itself from competitors primarily through its software-first approach and its focus on general-purpose service rather than factory automation. While Unitree and Agibot have achieved volume leadership and are preparing for IPOs with multi-billion-dollar valuations, X Square Robot allocates roughly two-thirds of its budget to model development rather than hardware manufacturing scale.[8][19]
CEO Wang Qian has claimed that X Square's embodied AI models are on par technically with those of Physical Intelligence (Pi) and Google, stating that the company has "outperformed them in some metrics." He has also expressed skepticism about the value of open-source model replication, noting that Physical Intelligence open-sourced its Pi-0 model partly because deployment challenges made independent commercialization difficult.[8]
In April 2026, X Square Robot hosted the inaugural Embodied AI Developers Conference (EAIDC 2026), described as "the world's first" global gathering specifically dedicated to developers building embodied AI systems. The event featured live robotic demonstrations, a national-level hackathon competition, and academia-industry collaboration panels focused on deployment and commercialization.[20]
The hackathon tested four core capability areas: grasping and placement, language understanding, fine manipulation, and long-horizon decision-making. Specific challenges included ring placement, instruction-based fruit sorting, cable plugging, and word spelling. The competition introduced innovations such as randomized real-world environments to test robot adaptability, continuous system evaluation, and full end-to-end deployment workflows.[20]
CEO Wang Qian has outlined an ambitious timeline for the company's development. He projects that embodied AI models comparable to GPT-3 could emerge within one year, with real commercial applications following one to two years after that milestone. For household robot deployment at scale, he estimates a timeline of three to five years. Wang has stated that the company is not focused on near-term monetization, with investor confidence allowing the team to pursue meaningful technological breakthroughs over superficial commercialization.[8]
The company's direction also aligns with national policy: China's 2026 Government Work Report explicitly identified "Embodied Intelligence" as a key future industry to cultivate, and the Ministry of Industry and Information Technology released new 2026 standards for humanoid robotics. This policy environment provides favorable conditions for X Square Robot's continued growth.[18]