Robbyant (Ant Group)
Last reviewed
May 9, 2026
Sources
22 citations
Review status
Source-backed
Revision
v6 · 3,989 words
Robbyant (officially Shanghai Ant Lingbo Technology Co., Ltd., 上海蚂蚁灵波科技有限公司) is a Chinese humanoid robotics and embodied AI company and subsidiary of Ant Group, the financial technology affiliate of Alibaba Group. Founded in late 2024 in Shanghai, Robbyant develops wheeled humanoid robots for service and commercial applications, and an open-source family of foundation models, branded LingBot, that are intended to act as a shared AI "brain" for robots from many different manufacturers. The company's flagship hardware product is the Robbyant R1, a dual-armed wheeled humanoid designed for hospitality, museum guidance, cooking, and remote-controlled operations.[1][2]
Led by chief executive Zhu Xing, Robbyant positions itself as an AI-first robotics company. Rather than competing primarily on actuators or chassis design, it focuses on building large foundation models for perception, planning, and control, then partnering with hardware specialists to deliver complete robots and "scenario solutions" to enterprise customers.[1][3]
Ant Group was established on October 16, 2014 in Hangzhou, having grown out of the Alipay payments service that Alibaba launched in 2004. Since founder Jack Ma reduced his voting rights in January 2023, the company has been controlled by a diversified shareholder base, with no single party holding majority control. Ant Group's headquarters remain in Hangzhou's Xihu District, and the firm operates across digital payments, lending, insurance, wealth management, healthcare, and financial cloud services.[4]
In 2024, Ant Group reported approximately $3.26 billion in research and development spending, much of it directed toward artificial intelligence and large language model development. The company's Bailing (百灵) family of large language models and the later Ling series of trillion-parameter models became core pieces of its AI portfolio. Robotics was identified as a strategic extension of these AI investments, with Ant Group leadership describing humanoids as a way to push Ant's digital services into the physical world.[1][4]
Shanghai Ant Lingbo Technology was incorporated on December 17, 2024 with registered capital of 100 million yuan (about $13.73 million). The Chinese name "灵波" (Lingbo, literally "spirit wave") gave the company its English brand identity, while the consumer-facing brand "Robbyant" combines "Robot" and "Ant" to signal both its core technology and its corporate parent. The new entity was framed as Ant Group's core platform for robotics research, development, and sales, with an emphasis on "high-interaction" environments such as homes, eldercare, and medical settings.[2][5]
In the months after incorporation, Ant Lingbo began an aggressive recruiting drive. Job listings circulated in early 2025 sought senior motion-control engineers, hardware structure designers, and embodied AI algorithm researchers, with annual salaries reaching as much as 1.12 million yuan (about $153,800) for hardware structure leads and 1.04 million yuan for motion-control specialists. One listing explicitly described responsibility for "designing the bodies of humanoids, including joints, limbs and torsos," indicating that Ant intended to design its own robot hardware rather than rely entirely on third parties.[5]
Ant Lingbo Technology was formally launched in Pudong District, Shanghai, in March 2025, with operations beginning under chief executive Zhu Xing. A second entity was registered in Hangzhou in August 2025, anchoring the company's research and engineering presence next to Ant Group headquarters and giving it access to the broader Yangtze River Delta robotics supply chain.[1]
The company unveiled the Robbyant R1, its first humanoid robot, in September 2025 at two events on different continents. On September 6, 2025, the R1 made its international debut at IFA 2025 in Berlin, where it was filmed cooking garlic shrimp inside a mock kitchen. Five days later, on September 11, 2025, the R1 was the centerpiece of Ant Group's exhibit at the INCLUSION Conference on the Bund in Shanghai, an annual fintech and AI gathering hosted by Ant. At the Shanghai event, one R1 stood on a display plinth while another prepared four dishes for visitors in a working kitchen booth.[1][6][7]
Staff at the Shanghai launch told reporters that the first-generation R1 was already in mass production and had been shipped to clients including the Shanghai History Museum, where it was being used for tour guidance. Rather than selling individual units to consumers, Robbyant bundled the R1 into industry-specific "scenario solutions" that combined the robot with software, integration services, and service contracts.[1][8]
In early 2026, Robbyant pivoted publicly from a hardware-first narrative to an AI-first one, signaling that its long-term ambition is to build the foundation models that other companies' robots run on. Between January and April 2026, the company released an interlocking family of open-source models, all under the "LingBot" brand, covering depth perception, 3D mapping, world simulation, and end-to-end control.[3][9][10][11]
The initial wave included LingBot-Depth, an open-source spatial perception model co-developed with depth-camera maker Orbbec, which Ant unveiled on January 27, 2026; LingBot-VLA, a vision-language-action model released on January 28, 2026 and pitched as a "universal brain" for robots; and LingBot-World, a real-time world model for embodied AI announced shortly afterwards. LingBot-Map, a streaming 3D reconstruction model, followed on April 16, 2026.[3][9][10][11]
Alongside these technical releases, Robbyant began signing platform partnerships. On March 16, 2026, Robbyant announced a collaboration with humanoid robot maker Leju Robotics to integrate the LingBot stack with Leju hardware for industrial and commercial deployments. Leju had previously contributed nearly 10,000 hours of multimodal interaction data used to train LingBot-VLA, and the new agreement extended the relationship into joint go-to-market work.[12]
| Detail | Information |
|---|---|
| Legal name | Shanghai Ant Lingbo Technology Co., Ltd. |
| Chinese name | 上海蚂蚁灵波科技有限公司 |
| Brand | Robbyant |
| Parent | Ant Group |
| Ultimate affiliation | Alibaba Group (affiliate of parent Ant Group) |
| Founded | December 17, 2024 |
| Formal launch | March 2025 |
| Headquarters | Pudong District, Shanghai |
| Secondary office | Hangzhou, Zhejiang (subsidiary registered August 2025) |
| Registered capital | 100 million yuan (about $13.73 million) |
| CEO | Zhu Xing |
| CTO | Shen Yujun |
| Industry | Humanoid robotics, embodied AI |
| Flagship product | Robbyant R1 |
| Foundation model family | LingBot |
| Markets served | China, Europe (DACH region) |
The company is led by Zhu Xing, who serves as chief executive officer. Zhu has positioned Robbyant as a software-first competitor in a hardware-heavy market. Speaking at the INCLUSION Conference, he described the company as newcomers to robotics whose advantage lay in "developing intelligence," and said Ant's goal was to "extend the services Ant provides in the digital world more effectively into the physical world."[1] In subsequent statements accompanying the LingBot releases, Zhu argued that "for embodied intelligence to achieve large-scale adoption, we need highly capable and cost-effective foundation models that work reliably on real hardware," and that "the embodied AI industry is evolving from technical verification to real-world deployment."[3][12]
Shen Yujun holds the role of chief technology officer and has been the public face of Robbyant's perception research, including LingBot-Depth. He has emphasized that "reliable 3D vision is critical to the advancement of embodied AI" and oversees the company's perception and world-model teams.[13]
Zheng Yuewen serves as country manager for the DACH region (Germany, Austria, Switzerland), responsible for partnerships and pilot deployments across continental Europe. Her appointment underscores the company's intent to enter European service markets in addition to China.[14]
The iF Design Award filing for the R1 lists a thirteen-person product design team, including Mingwei Zhang, Yupeng Hu, Yuxuan Liu, Yuping Deng, Xing Zhu, Zhiyong Wang, Yujun Shen, Yi Xie, Haitao Cao, Jiajia Li, Jiayu Pan, Zhewen Wu, and Jiguang Xiong.[15]
The Robbyant R1 is a wheeled, dual-armed humanoid robot designed for service industry applications. It is the first humanoid robot to ship from Ant Group and was described in Bloomberg's launch coverage as the company's "first entry in China's robot race."[6]
| Specification | Details |
|---|---|
| Height | 1.6 to 1.75 m (adjustable) |
| Weight | 110 kg (243 lb) |
| Degrees of Freedom | 34 |
| Maximum Speed | Under 1.5 m/s (about 5.4 km/h) |
| Mobility | Wheeled base |
| Arms | Dual-arm configuration; the arms account for most of the degrees of freedom |
| Actuation | 48 V DC geared motors with encoders |
| Sensing | Multimodal perception cameras, depth sensing, microphones |
| Connectivity | Wi-Fi, Bluetooth, cloud integration |
| Onboard AI | Ling series large language models, embodied AI control stack |
| Operating modes | Autonomous and teleoperated |
| Color options | White, silver |
| Reported reference price | About $55,000 per unit (not sold as standalone) |
| Status | Mass production from September 2025 |
The R1 is designed for versatile service roles, including hospitality, museum guidance, cooking, and remote-controlled operations.
In parallel with the R1 hardware, Robbyant maintains the Robbyant Open Platform, a software toolkit that lets integrators deploy R1 and compatible robots into venues such as supermarkets, exhibition halls, and offices. The platform exposes capabilities for marketing guidance, guided tours, and cooking demonstrations, and it forms part of the iF Design Award 2026 submission for the R1.[15]
The LingBot family is Robbyant's open-source AI platform for embodied agents. Each model targets a different layer of the perception-and-control stack and is released on GitHub under the Apache License 2.0.[3][9]
| Model | Release | Function | Key result |
|---|---|---|---|
| LingBot-Depth | January 27, 2026 | Spatial perception, depth completion from RGB plus partial depth | Over 70 percent reduction in relative depth error on NYUv2; about 47 percent RMSE reduction on sparse Structure-from-Motion |
| LingBot-VLA | January 28, 2026 | Vision-language-action "universal brain" for cross-platform robot control | 1.5x to 2.8x faster training than comparable frameworks; trained on 20,000+ hours of dual-arm interaction data |
| LingBot-World | January 2026 | Real-time world model for interactive simulation and embodied AI | Millisecond-level real-time interaction, video prediction-based reasoning |
| LingBot-Map | April 16, 2026 | Streaming 3D reconstruction from a single RGB camera | F1 score of 98.98 on ETH3D, more than 21 points above the second-place method; about 20 FPS continuous inference on sequences exceeding 10,000 frames |
LingBot-Depth is a depth-completion and spatial perception model co-developed with Orbbec and announced in Shanghai on January 27, 2026. It introduces a Masked Depth Modeling (MDM) training scheme: when sensor depth is missing or corrupted, the model uses RGB texture, object contours, and scene context to reconstruct the missing regions. To train the model, Robbyant collected about 10 million raw RGB-D samples and curated a high-quality dataset of 2 million RGB-depth pairs targeted at extreme conditions such as glass, mirrors, and polished metal.[9][13]
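The masked-reconstruction idea behind MDM can be illustrated with a small sketch. It is not Robbyant's training code: the helper names `mask_depth` and `masked_l1_loss`, the use of 0.0 as the "missing depth" marker, and the L1 loss are all assumptions made for illustration. The sketch hides a random subset of depth pixels and scores a prediction only on the pixels that were hidden.

```python
import random

def mask_depth(depth, mask_ratio, rng):
    """Hide a random subset of pixels; return (masked_depth, mask).

    mask[i][j] is True where the pixel was hidden. Hidden pixels are set to
    0.0, an assumed convention for "no depth reading".
    """
    masked, mask = [], []
    for row in depth:
        hidden_row = [rng.random() < mask_ratio for _ in row]
        masked.append([0.0 if hidden else d for d, hidden in zip(row, hidden_row)])
        mask.append(hidden_row)
    return masked, mask

def masked_l1_loss(pred, target, mask):
    """Mean absolute error computed over the hidden pixels only."""
    total, count = 0.0, 0
    for p_row, t_row, m_row in zip(pred, target, mask):
        for p, t, hidden in zip(p_row, t_row, m_row):
            if hidden:
                total += abs(p - t)
                count += 1
    return total / count if count else 0.0

rng = random.Random(0)
depth = [[1.0, 1.2, 1.4], [2.0, 2.2, 2.4]]  # toy depth map in metres
masked, mask = mask_depth(depth, 0.5, rng)

# A model that reconstructs the hidden pixels perfectly has zero loss.
perfect_loss = masked_l1_loss(depth, depth, mask)
# Echoing the masked input back is penalised wherever pixels were hidden.
naive_loss = masked_l1_loss(masked, depth, mask)
```

In a real pipeline the prediction would come from a network that also sees the RGB image, which is what lets it fill regions such as glass or mirrors where the sensor returns nothing.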
Orbbec contributed its Gemini 330 stereo camera and the proprietary MX6800 depth engine chip during co-development, and validated final performance through its Depth Vision Laboratory. On NYUv2 and ETH3D benchmarks, LingBot-Depth outperformed reference models including PromptDA and PriorDA, achieving more than a 70 percent reduction in relative depth error in indoor scenes and roughly a 47 percent RMSE reduction on the challenging sparse Structure-from-Motion task. The complete model and dataset tooling are hosted at github.com/Robbyant/lingbot-depth.[9][13]
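To make the reported percentages concrete, the following sketch shows how a "percent reduction" in a depth metric is computed. The source does not define "relative depth error"; the sketch assumes the commonly used mean absolute relative error (AbsRel), and the toy numbers are invented for illustration rather than taken from any benchmark.

```python
import math

def abs_rel(pred, gt):
    """Mean absolute relative error: mean of |pred - gt| / gt."""
    return sum(abs(p - g) / g for p, g in zip(pred, gt)) / len(gt)

def rmse(pred, gt):
    """Root-mean-square error between predicted and true depths."""
    return math.sqrt(sum((p - g) ** 2 for p, g in zip(pred, gt)) / len(gt))

def percent_reduction(baseline, improved):
    """How much smaller `improved` is than `baseline`, in percent."""
    return 100.0 * (baseline - improved) / baseline

gt = [1.0, 2.0, 4.0]              # true depths in metres (toy values)
baseline_pred = [1.2, 2.4, 4.8]   # 20 percent too deep everywhere
improved_pred = [1.05, 2.1, 4.2]  # 5 percent too deep everywhere

rel_drop = percent_reduction(abs_rel(baseline_pred, gt), abs_rel(improved_pred, gt))
rmse_drop = percent_reduction(rmse(baseline_pred, gt), rmse(improved_pred, gt))
```

In this toy case both metrics fall by about 75 percent, because every error shrinks by the same factor. On real data the two metrics usually move by different amounts, since RMSE weights large errors more heavily.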
LingBot-VLA, released on January 28, 2026, is a vision-language-action model (VLA) intended to act as a portable control layer that can be retargeted to many different robot bodies with limited additional training. The model was pre-trained on more than 20,000 hours of real-world interaction data spanning nine dual-arm robot configurations, with depth perception integrated through the companion LingBot-Depth model.[3][12]
In benchmarks, LingBot-VLA outperformed peer models on the GM-100 real-robot benchmark of 100 manipulation tasks and was also tested on the RoboTwin 2.0 simulation benchmark with environmental variations. Robbyant reports that the system trains 1.5 to 2.8 times faster than comparable frameworks. Early adapter integrations cover single-arm, dual-arm, and humanoid robots from partners including Galaxea Dynamics and AgileX Robotics. The release ships as a production-ready package with code, documentation, model weights, data processing tools, and an automated evaluation pipeline.[3][12]
LingBot-World is a real-time world model that supports interactive simulation and embodied AI training. It targets video quality, dynamic fidelity, long-term consistency, and millisecond-level interactive response, allowing simulated agents and human users to alter scenes on the fly. The model is positioned as a training and evaluation environment for the rest of the LingBot stack and is also published as open source.[18]
LingBot-Map is a feed-forward 3D foundation model that performs streaming 3D reconstruction from a single RGB camera, allowing robots, autonomous vehicles, and augmented reality devices to perceive and rebuild their three-dimensional surroundings in real time. Built on a Geometric Context Transformer (GCT) architecture, the model operates on a "see-as-you-go" principle: it estimates the camera pose and reconstructs the scene frame-by-frame as new video arrives, rather than waiting for a complete batch of images.[10][11]
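The difference between streaming and batch processing can be shown with a toy example. This sketch does not reproduce the GCT architecture or any LingBot-Map API. It assumes a hypothetical per-frame estimator has already produced a 2D motion (forward distance, heading change) for each frame, and shows only the streaming pattern: the pose is updated and emitted after each frame, without waiting for the full sequence.

```python
import math
from typing import Iterable, Iterator, Tuple

Pose = Tuple[float, float, float]  # x, y, heading in radians

def stream_poses(motions: Iterable[Tuple[float, float]]) -> Iterator[Pose]:
    """Yield an updated camera pose after every incoming frame.

    Each motion is (forward_distance, heading_change), a stand-in for the
    output of a per-frame estimator (hypothetical here).
    """
    x, y, theta = 0.0, 0.0, 0.0
    for forward, turn in motions:
        theta += turn
        x += forward * math.cos(theta)
        y += forward * math.sin(theta)
        yield (x, y, theta)  # available immediately, before later frames arrive

# Three frames: move forward, turn left 90 degrees and move, move again.
frames = [(1.0, 0.0), (1.0, math.pi / 2), (1.0, 0.0)]
trajectory = list(stream_poses(frames))
```

Because `stream_poses` is a generator, a consumer can act on each pose as soon as it is produced, and memory use does not grow with sequence length beyond the running state.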
LingBot-Map runs at roughly 20 frames per second and supports continuous inference on long video sequences exceeding 10,000 frames with little loss of accuracy. On the ETH3D reconstruction benchmark, it achieved an F1 score of 98.98, more than 21 percentage points above the next best method. An accelerated update was released later in April 2026, and the code is hosted at github.com/robbyant/lingbot-map with a companion model card on Hugging Face.[10][11]
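For readers unfamiliar with reconstruction F1 scores, the sketch below shows one common way such a score is formed: precision is the share of reconstructed points lying within a distance threshold of the ground truth, recall is the share of ground-truth points lying within that threshold of the reconstruction, and F1 is their harmonic mean. The exact evaluation protocol behind the reported 98.98 is not described in the source, so the threshold `tau` and the brute-force nearest-neighbour search here are assumptions.

```python
import math

def fraction_within(src, dst, tau):
    """Fraction of points in src whose nearest point in dst is within tau."""
    hits = sum(1 for p in src if min(math.dist(p, q) for q in dst) <= tau)
    return hits / len(src)

def f1_score(recon, gt, tau):
    """Harmonic mean of precision (accuracy) and recall (completeness)."""
    precision = fraction_within(recon, gt, tau)
    recall = fraction_within(gt, recon, tau)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Toy point clouds: two reconstructed points are accurate, one is an outlier,
# and two ground-truth points are not covered at all.
gt = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
recon = [(0.0, 0.0, 0.01), (1.0, 0.0, 0.01), (5.0, 0.0, 0.0)]
score = f1_score(recon, gt, tau=0.05)  # precision 2/3, recall 1/2
```

Multiplying the result by 100 gives the percentage scale on which benchmark tables typically report it. A high F1 requires the reconstruction to be both accurate and complete, since the harmonic mean is pulled toward the weaker of the two.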
Robbyant's technology stack treats robotics as a layered AI problem. The Ling series of large language models from Ant Group, including trillion-parameter variants, supplies higher-level reasoning, planning, and dialogue, while the LingBot family handles real-time perception, world modeling, and motor control. The R1 itself uses a dual-driven embodied control algorithm that combines simulation and real-world data, achieving what Robbyant describes as millisecond-level responsiveness with action efficiency approaching human levels.[15][1]
The Bailing LLM, originally an internal Ant Group model, provided early language capabilities for the R1, while the company has progressively migrated to the more capable Ling family for multimodal perception and complex task planning. The result is a robot that can take a natural-language instruction such as a recipe name, decompose it into sub-tasks, perceive ingredients and tools, and execute the steps autonomously, while still allowing a remote human operator to take over via teleoperation when needed.[1][7]
The perception stack relies on multimodal cameras paired with depth sensors. LingBot-Depth provides robust depth completion in environments with reflective surfaces and transparent objects that historically defeat consumer-grade depth cameras. LingBot-Map adds streaming 3D reconstruction so the robot can build a live map of an unfamiliar room, while LingBot-World supplies the simulated environments used to train and evaluate behaviors before they are deployed onto physical hardware.[9][10][18]
Unlike pure-software robotics startups, Robbyant designs its own humanoid hardware. The R1 chassis uses 48 V DC geared motors with encoders for closed-loop control, a wheeled mobile base for stability and indoor speed, and an upper body with two arms that hold the bulk of its 34 degrees of freedom. The wheeled approach is a deliberate choice: rather than chasing bipedal locomotion, Robbyant prioritizes stable manipulation in flat indoor environments such as kitchens, hotel lobbies, hospitals, and museums.[1][15]
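The role of encoders in closed-loop control can be sketched with a minimal proportional position loop. This is a generic illustration, not Robbyant's controller: the function name, the gain `kp`, the tick units, and the idealised motor that follows each command exactly are all assumptions.

```python
def run_position_loop(target_ticks, kp, steps, start_ticks=0.0):
    """Drive a simulated joint toward target_ticks with proportional feedback.

    Each step reads the (simulated) encoder, computes the remaining error,
    and commands a correction proportional to it.
    """
    position = start_ticks
    history = []
    for _ in range(steps):
        error = target_ticks - position  # feedback from the encoder reading
        command = kp * error             # proportional correction, in ticks
        position += command              # idealised motor: follows the command exactly
        history.append(position)
    return history

history = run_position_loop(target_ticks=1000, kp=0.5, steps=10)
final_error = 1000 - history[-1]
```

With `kp=0.5` the error halves on every step, so the joint converges on the target without needing an exact model of the motor. Without encoder feedback the controller would have no way to detect or correct a missed step.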
Robbyant's commercial approach differs from many humanoid robot companies in that it does not sell individual robot units directly to consumers or hobbyists. Instead, the company provides integrated "scenario solutions" where the R1 is deployed as part of a comprehensive service package covering hardware, software, training, integration, and ongoing maintenance. Industry coverage cited a reference price of roughly $55,000 per unit, but the actual contracts are bundled rather than itemized.[1][8]
This model reflects Ant Group's broader strategy of offering platform-based solutions rather than standalone hardware products. Customers such as the Shanghai History Museum receive an end-to-end deployment that integrates the R1 with the venue's existing software and operational workflows, including tour scripts, ticketing systems, and visitor management. By tying its robots to scenario contracts, Robbyant gains a recurring revenue stream and a steady flow of operational data that feeds back into the LingBot models.[8][16]
The parallel open-source strategy for LingBot has its own commercial logic. By giving away the AI "brain," Robbyant aims to encourage hardware partners and integrators to build on its stack, much as cloud companies use open-source projects to attract usage of their managed services. Cooperation with Orbbec, Leju Robotics, Galaxea Dynamics, and AgileX Robotics effectively creates a platform ecosystem in which Robbyant retains influence over the controlling software while letting other companies handle parts of the hardware stack.[3][9][12]
The earliest confirmed customer for the R1 is the Shanghai History Museum, where the robot guides visitors through exhibitions and answers questions. Robbyant has also reported pilot deployments in scenic spots and shopping malls for navigation and guidance, and in restaurant and community kitchens for cooking demonstrations. Healthcare pilots cover medicine sorting and basic consultation roles in hospitals and pharmacies.[1][16][17]
In Europe, the company has begun developing partnerships in the DACH region under country manager Zheng Yuewen, exploring applications in service venues and embodied AI research. By December 2025, Ant Group leadership had publicly described plans to scale these European pilots into broader commercial deployments, signaling that Robbyant's ambitions are not limited to the Chinese market.[14][1]
The Robbyant R1 received an iF Design Award in the 2026 cycle in the Product Design category, Household Appliances subcategory. The award filing describes the R1 as "the industry's first wheeled humanoid robot dedicated to home service scenarios," highlighting its multi-modal perception, dual-driven embodied control algorithm, and millisecond-level responsiveness. The submission also references the Robbyant Open Platform that supports deployments in supermarkets, exhibition halls, and offices, with target regions listed as Asia and Europe.[15]
Robbyant launched into a Chinese humanoid robotics market that was rapidly drawing investment from the country's largest technology companies. By 2025, Baidu, Tencent, JD.com, Huawei, and Xiaomi had all announced humanoid or embodied AI initiatives, while specialist firms such as Unitree Robotics, Agibot, UBTech, and Astribot were already shipping bipedal and wheeled platforms. Industry forecasts cited by Chinese state media projected the domestic humanoid market would grow from about 2.76 billion yuan in 2024 toward roughly 12 billion yuan by 2030, with annual unit shipments approaching 1.5 million.[5][19]
In 2025, Chinese regulators publicly cautioned that the humanoid robot boom risked turning into a speculative bubble, prompting some industry executives to argue more loudly for visible commercial deployments rather than pure technology demonstrations. Robbyant's emphasis on shipping the R1 to paying customers and on releasing reusable open-source models is consistent with this push toward measurable real-world progress.[20]
Beyond Ant Lingbo, Ant Group has invested in adjacent embodied AI startups. In 2024, Ant participated in a Pre-A funding round for SoHoBlink, focused on medical, education, and family scenarios, and in 2025 it joined a financing round for Shenzhen-based humanoid robot maker Astribot.[5]
| Company | Country | Approach | Notable platform |
|---|---|---|---|
| Robbyant | China | Wheeled humanoid, AI-first, scenario solutions | Robbyant R1 |
| Unitree Robotics | China | Bipedal humanoids and quadrupeds, hardware-first | Unitree H1, G1 |
| Agibot (Zhiyuan) | China | Bipedal humanoid for industrial service | A2, A1 |
| Leju Robotics | China | Bipedal humanoid hardware, partner with Robbyant on AI | Kuavo |
| Astribot | China | Wheeled bimanual platforms | S1 |
| UBTech | China | Industrial and consumer humanoids | Walker S |
| Figure AI | United States | Bipedal humanoid for warehouse and home | Figure 02 |
| Tesla | United States | Bipedal general-purpose humanoid | Optimus |
In comparative coverage, the Robbyant R1 is often cited alongside Tesla Optimus as a Chinese counterpart, though the two differ in form factor: Optimus is bipedal and general-purpose, while the R1 is wheeled and service-focused.[21][22]
Initial press reception for the R1 was a mix of curiosity and skepticism. Bloomberg framed the launch as Ant's "first entry in China's robot race," and the South China Morning Post highlighted that the unit had already been shipped to the Shanghai History Museum.[6][1] Other commentators noted that wheeled service robots have been deployed in Chinese restaurants and hotels for years, and that the R1's distinctive contribution lies less in mechanical novelty than in the depth of its AI integration and the scale of Ant Group's distribution channels.[8][16]
The LingBot open-source releases received more uniformly positive coverage in technical press, where the published benchmarks against NYUv2, ETH3D, GM-100, and RoboTwin 2.0 gave reviewers concrete numbers to evaluate.[3][10]