ALOHA 2
Last reviewed
May 16, 2026
Sources
9 citations
Review status
Source-backed
Revision
v1 · 2,049 words
ALOHA 2 (sometimes written ALOHA-2) is an open-source bimanual teleoperation hardware platform introduced in February 2024 by researchers at Google DeepMind in collaboration with the original ALOHA team at Stanford University. It is the second generation of the ALOHA system, an acronym for A Low-cost Open-source Hardware System for Bimanual Teleoperation, first presented by Tony Z. Zhao and collaborators at Robotics: Science and Systems in 2023. ALOHA 2 keeps the core puppeteering design of the original, where a human operator backdrives two smaller leader arms that are kinematically matched to two larger follower arms, but introduces a redesigned gripper, lower-friction passive joints, an updated frame, and other ergonomic and reliability changes meant to make data collection sessions longer and more comfortable. The platform is widely used to collect demonstrations for imitation learning and to train generalist robot policies, including several vision-language-action (VLA) models such as RT-X follow-ups, Octo, and the π0 series.
The hardware, simulation models, and reference code for ALOHA 2 are released under permissive licenses on the project site aloha-2.github.io, and commercial kits are produced by Trossen Robotics. Between 2024 and 2026, ALOHA 2 became one of the de facto reference cells for academic and industrial work on bimanual fine manipulation, in part because it costs several times less than research-grade dual-arm systems built around Franka Emika Panda or Universal Robots UR5e arms.
The first ALOHA system was developed at Stanford University by Tony Z. Zhao, Vikash Kumar, Sergey Levine, and Chelsea Finn, and described in the 2023 paper Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware. The hardware paired two ViperX 300 S follower arms with two WidowX 250 S leader arms, all built around Dynamixel servos from Robotis. A human operator grasped the leader arms and moved them through a task; the follower arms mirrored the motion in real time, recording joint angles and RGB video from three Logitech webcams plus a wrist camera. The full bill of materials came in under about $20,000, roughly 5 to 10 times cheaper than comparable bimanual research setups at the time.
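The leader-follower mirroring described above can be illustrated with a short joint-space copy loop. This is only a sketch under stated assumptions, not the ALOHA codebase: `read_leader`, `command_follower`, the joint limits, and the `Episode` container are hypothetical stand-ins for whatever servo driver and logging format a given cell uses.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence


@dataclass
class Episode:
    """Hypothetical demonstration log: one entry per control step."""
    leader_qpos: List[List[float]] = field(default_factory=list)
    follower_cmd: List[List[float]] = field(default_factory=list)


def clamp(q: Sequence[float], lower: Sequence[float], upper: Sequence[float]) -> List[float]:
    """Clip each joint angle to its [lower, upper] limit."""
    return [min(max(x, lo), hi) for x, lo, hi in zip(q, lower, upper)]


def mirror_step(leader_q: Sequence[float],
                lower: Sequence[float],
                upper: Sequence[float]) -> List[float]:
    """Map leader joint angles to a follower command.

    Because the leader and follower are kinematically matched, the mapping
    assumed here is the identity, followed by clamping to joint limits.
    """
    return clamp(leader_q, lower, upper)


def teleop(read_leader: Callable[[], Sequence[float]],
           command_follower: Callable[[List[float]], None],
           lower: Sequence[float],
           upper: Sequence[float],
           steps: int) -> Episode:
    """Run `steps` mirroring iterations and record what was read and sent."""
    episode = Episode()
    for _ in range(steps):
        q = list(read_leader())              # joint angles of the backdriven leader
        cmd = mirror_step(q, lower, upper)   # same angles, clipped to limits
        command_follower(cmd)                # follower tracks the leader
        episode.leader_qpos.append(q)
        episode.follower_cmd.append(cmd)
    return episode
```

In a real cell the recorded episode would also carry the camera frames captured at each step; they are omitted here to keep the sketch self-contained.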
Alongside the hardware, the Stanford team released Action Chunking with Transformers (ACT), an imitation learning algorithm that predicts short sequences of future actions rather than a single next action. ACT helped the policies learn fine-grained skills such as opening a Ziploc bag, slotting a battery, and threading a zip tie from only 50 demonstrations.
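The idea of predicting a chunk of future actions, rather than one action per step, can be sketched as follows. The sketch assumes a hypothetical `policy` callable that returns `chunk_size` actions for the current observation, and it combines overlapping predictions for the same timestep with an exponentially weighted average that favours older predictions. The weighting constant `m` and all function names are illustrative assumptions, not the released ACT implementation.

```python
import math
from typing import Callable, Dict, List, Sequence

Action = List[float]


def temporal_ensemble(preds: Sequence[Action], m: float = 0.1) -> Action:
    """Average predictions for one timestep, ordered oldest first.

    Prediction i gets weight exp(-m * i), so i = 0 (the oldest) weighs most.
    """
    weights = [math.exp(-m * i) for i in range(len(preds))]
    total = sum(weights)
    dim = len(preds[0])
    return [sum(w * p[d] for w, p in zip(weights, preds)) / total
            for d in range(dim)]


def run_chunked(policy: Callable[[object], Sequence[Action]],
                observe: Callable[[int], object],
                horizon: int,
                chunk_size: int,
                m: float = 0.1) -> List[Action]:
    """Query the policy every step and execute the ensembled action."""
    pending: Dict[int, List[Action]] = {}   # timestep -> predictions made for it
    executed: List[Action] = []
    for t in range(horizon):
        chunk = policy(observe(t))          # actions for steps t .. t + chunk_size - 1
        for offset, action in enumerate(chunk[:chunk_size]):
            pending.setdefault(t + offset, []).append(action)
        # Predictions were appended in the order they were made, so oldest first.
        executed.append(temporal_ensemble(pending.pop(t), m))
    return executed
```

Querying the policy at every step and averaging is one design choice; a simpler variant executes each chunk open-loop and only queries the policy again once the chunk is exhausted.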
In January 2024, Zhao and collaborators released Mobile ALOHA, which mounted the original ALOHA arms on a wheeled base with a powered torso lift. The base let the system collect demonstrations of whole-body tasks such as cooking shrimp, calling an elevator, and rinsing a pan in a sink. Mobile ALOHA used the same teleoperation principle but added two additional follower wheels and a leader yoke that the operator pushed to drive the base. The Mobile ALOHA paper and viral demo videos drew significant attention on social media and helped trigger the wider 2024 wave of household robot learning work.
ALOHA 2 was announced in a short technical report titled ALOHA 2: An Enhanced Low-Cost Hardware for Bimanual Teleoperation, released on arXiv in February 2024 by the ALOHA 2 team at Google DeepMind together with the Stanford ALOHA authors. The stated goal was not to redesign the platform from scratch but to fix the small frictions that had emerged from a year of heavy use across many labs.
The main changes compared with the original ALOHA are summarised below.
| Subsystem | Original ALOHA (2023) | ALOHA 2 (2024) | Reason given by the authors |
|---|---|---|---|
| Gripper | Stock ViperX scissor gripper, hobby servo driven | Redesigned low-profile parallel-jaw gripper with custom 3D-printed fingers and silicone pads | More consistent grasps on thin or deformable objects; easier to swap fingertips |
| Leader arm passive joints | Friction-based hold using servo torque | Low-friction passive gravity-compensated linkage | Reduces operator fatigue during long demo sessions |
| Frame | Aluminium extrusion table built per lab | Standardised cell with integrated camera mounts | Repeatable setup across sites, easier to share data |
| Cameras | Three Logitech C922 plus one wrist camera | Higher-resolution Intel RealSense and ZED options officially supported, plus wrist cameras | Better depth and stereo for vision-language-action policies |
| Compute and timing | Off-board PC over USB, ROS bridge | Updated software stack with deterministic 50 Hz control loop and improved time synchronisation | Cleaner data for training generalist policies |
| Simulation | MuJoCo XML released after publication | MuJoCo model shipped at launch with matched dynamics | Sim-to-real and policy debugging |
| Documentation | Build guide and BOM on GitHub | Full assembly manual, troubleshooting guide, and certified vendor kit through Trossen Robotics | Lower barrier for new labs |
The paper notes that the total bill of materials is still in the same low-five-figure range as the original, with the gripper and camera upgrades being the largest cost drivers. The authors emphasise that ALOHA 2 is intentionally not a step toward a productised robot; it remains a research data collection cell.
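The 50 Hz control loop listed in the table can be illustrated with a generic fixed-rate loop. This is a sketch of the general technique, not the ALOHA 2 software stack: the function name, the injectable `clock` and `sleep` parameters, and the choice of absolute deadlines are assumptions made for the example. Scheduling each step against `start + i * period` instead of sleeping a fixed amount after each step keeps timing error from accumulating.

```python
import time
from typing import Callable, List


def run_fixed_rate(step: Callable[[int], None],
                   rate_hz: float = 50.0,
                   n_steps: int = 100,
                   clock: Callable[[], float] = time.monotonic,
                   sleep: Callable[[float], None] = time.sleep) -> List[float]:
    """Call `step(i)` at a fixed rate and return the timestamp of each call.

    Deadlines are absolute (start + i * period), so a slow step does not
    shift every later step; the loop simply sleeps less before the next one.
    """
    period = 1.0 / rate_hz
    start = clock()
    stamps: List[float] = []
    for i in range(n_steps):
        deadline = start + i * period
        remaining = deadline - clock()
        if remaining > 0:
            sleep(remaining)        # wait until this step's deadline
        stamps.append(clock())      # timestamp to store alongside the sample
        step(i)
    return stamps
```

Storing the returned timestamps with each recorded sample is one simple way to align joint data with camera frames afterwards. A general-purpose operating system cannot guarantee the deadlines, so the sketch is best read as "fixed-rate" rather than hard real-time.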
ALOHA 2 became the data collection backbone for a string of high-profile robot learning papers in 2024 and 2025.
ALOHA Unleashed, released by Google DeepMind in mid-2024, used a fleet of ALOHA 2 cells to gather thousands of demonstrations of dexterous tasks such as tying shoelaces, hanging a shirt on a hanger, and inserting a gear into a recess. The policies were trained with a diffusion policy head on top of a transformer trunk and showed that with enough demonstrations the same architecture could handle long-horizon manipulation. The paper credits ALOHA 2's lower-friction leaders and more reliable grippers for making the multi-thousand-demonstration data collection effort tractable.
ALOHA 2 data also fed into the broader Open X-Embodiment effort and was used in follow-up work to RT-2. Several of the RT-X follow-up demonstrations of dexterous bimanual skills, including cloth folding and small parts assembly, were collected on ALOHA 2 hardware before being co-trained with mobile manipulator data from other labs. The Octo generalist policy from Berkeley was also evaluated on ALOHA-style bimanual tasks, although Octo was trained primarily on the Open X-Embodiment corpus rather than on ALOHA 2 data alone.
Outside Google, Physical Intelligence used ALOHA 2 cells as one of several teleoperation platforms during data collection for the π0 generalist policy and its successor π0.5. The π0 technical report lists ALOHA-style data among the sources for the bimanual portion of its training mix, alongside data from other dual-arm systems.
The combination of low cost, open hardware, and a growing body of pre-trained checkpoints means that a researcher can buy or build an ALOHA 2 cell, plug in a published policy, and reproduce headline tasks within days rather than months. That feedback loop is widely credited with accelerating the 2024 to 2025 wave of bimanual manipulation research.
ALOHA 2 is used by academic and industrial groups beyond the original Stanford and DeepMind teams. Stanford's IRIS lab, Berkeley's RAIL lab, MIT CSAIL, CMU, ETH Zürich, Tsinghua, Seoul National University, and the University of Tokyo all have ALOHA 2 cells in use as of 2025, according to the project page and published papers using the platform. Trossen Robotics, which sells the hardware kits, reports that the platform is also used in industry research groups at NVIDIA, Toyota Research Institute, and several Chinese robotics startups, although exact deployment numbers are not public.
The Hugging Face LeRobot project ships official drivers and example policies for ALOHA 2, which has made it easier for hobbyists and smaller labs to onboard. LeRobot's example notebooks reproduce ACT, diffusion policies, and a handful of small VLA fine-tunes on ALOHA 2 data.
ALOHA 2 sits in a distinct part of the design space from the industrial dual-arm cells that preceded it. The table below compares it with several commonly cited alternatives. Figures are taken from vendor pages, peer-reviewed papers, and project websites, and reflect publicly reported numbers rather than internal estimates.
| Platform | Year | Arms | Teleop method | Approximate hardware cost | Openness |
|---|---|---|---|---|---|
| ALOHA 2 | 2024 | 2 × ViperX 300 followers, 2 × WidowX leaders | Kinematically matched leader-follower puppeteering | Around $25,000 to $35,000 for a full cell | Open hardware, open software |
| Original ALOHA | 2023 | Same as ALOHA 2 but earlier revision | Puppeteering | Around $20,000 | Open hardware, open software |
| Mobile ALOHA | 2024 | ALOHA arms on a mobile base | Puppeteering plus pushed base | Around $32,000 | Open hardware, open software |
| Franka Emika Panda dual-arm rig | 2018 onward | 2 × Franka Panda | VR controllers, haptic phantom, or kinesthetic teaching | Roughly $60,000 to $120,000 depending on setup | Proprietary hardware, ROS drivers |
| Universal Robots UR5e dual-arm cell | Varies | 2 × UR5e | VR controllers, scripting, or kinesthetic | Roughly $80,000 to $150,000 | Proprietary hardware, open APIs |
| Tesla Optimus or Figure data collection rigs | 2024 to 2025 | Humanoid full-body | Motion capture suits or VR | Not publicly priced | Closed |
The ALOHA family trades absolute precision and payload for cost and openness. Franka and UR5e cells offer sub-millimetre repeatability and force-torque sensing out of the box, which matters for industrial tasks. ALOHA 2 relies on hobby-grade Dynamixel servos that are noticeably less stiff, but the puppeteering interface lets non-experts collect data with very little training, and data collection is the main bottleneck for imitation learning at scale.
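The cost side of this trade-off can be made concrete with a little arithmetic on the ranges in the comparison table. The snippet below is a worked example only; the dictionary simply restates the table's approximate figures, and the helper name is hypothetical.

```python
from typing import Dict, Tuple

# Approximate hardware cost ranges in USD, copied from the comparison table.
COST_RANGE_USD: Dict[str, Tuple[int, int]] = {
    "ALOHA 2": (25_000, 35_000),
    "Franka Emika Panda dual-arm rig": (60_000, 120_000),
    "Universal Robots UR5e dual-arm cell": (80_000, 150_000),
}


def cost_ratio_range(platform: str, baseline: str = "ALOHA 2") -> Tuple[float, float]:
    """Return (smallest, largest) cost ratio of `platform` to `baseline`.

    The smallest ratio pairs the cheapest platform figure with the most
    expensive baseline figure; the largest ratio does the opposite.
    """
    low, high = COST_RANGE_USD[platform]
    base_low, base_high = COST_RANGE_USD[baseline]
    return low / base_high, high / base_low


if __name__ == "__main__":
    for name in COST_RANGE_USD:
        if name != "ALOHA 2":
            lo, hi = cost_ratio_range(name)
            print(f"{name}: {lo:.1f}x to {hi:.1f}x the cost of an ALOHA 2 cell")
```

On the table's figures, the industrial dual-arm cells come out at roughly two to six times the cost of an ALOHA 2 cell, depending on which ends of the ranges are compared.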
ALOHA 2 was received warmly by the robot learning community. Reviews at the Robot Learning Workshop at NeurIPS 2024 and at CoRL 2024 frequently cited ALOHA 2 as a baseline platform for new bimanual algorithms. Several authors of competing systems, including those working on mobile humanoids and on lower-cost single-arm rigs, have publicly credited ALOHA 2 with raising expectations for what a low-cost teleoperation cell should provide.
Criticism has focused on three areas. First, the Dynamixel-based arms still have a limited payload of roughly one kilogram per arm, which rules out heavier household tasks. Second, the cameras and timing stack, while improved, remain less precise than industrial vision systems, which makes some millimetre-scale insertion tasks difficult. Third, the open hardware design is harder to procure outside the United States because of customs and shipping issues with the servos and aluminium extrusions. The ALOHA 2 authors have acknowledged these limits in talks at CoRL and ICRA, and have framed the platform as deliberately optimised for breadth of skills rather than for absolute task difficulty.
In his 2024 talks at Stanford and at Google DeepMind, Tony Z. Zhao described ALOHA 2 as a deliberate step toward making bimanual manipulation feel like a commodity research substrate, similar to how MuJoCo became a commodity for simulated control. As of 2026, that framing appears to have largely held: papers that report new bimanual policies routinely run a baseline on ALOHA 2 hardware even when their main contribution is on a different platform.