Whole-body control
Last reviewed
May 1, 2026
Sources
21 citations
Review status
Source-backed
Revision
v1 · 3,986 words
Whole-body control (WBC) is a class of robotics control techniques for highly redundant robots, typically humanoid robots and legged platforms, that simultaneously regulates many objectives across the entire body, such as balance, end-effector tracking, contact forces, posture, and joint limits. A WBC formulates control as a constrained optimisation problem solved at high frequency, usually 500 Hz to 1 kHz, and produces consistent joint torques or accelerations that respect the equations of motion, contact constraints, actuator bounds, and a designer-specified hierarchy of tasks.
The approach was crystallised by Oussama Khatib's operational space formulation in 1987 and extended by Luis Sentis and Khatib's 2005 paper on hierarchical control of behavioural primitives, which is widely credited as the work that introduced the term whole-body control in its modern sense. Since then it has become the dominant low-level control paradigm in academic and commercial humanoid robotics, used on platforms such as the Boston Dynamics Atlas, the DLR TORO, the AIST HRP series, the IHMC entry to the DARPA Robotics Challenge, the PAL Robotics TALOS, the Agility Robotics Cassie and Digit, and most recent commercial humanoids including Figure 02 (whose Helix 02 system relies on a kilohertz whole-body layer) and 1X NEO.
A modern humanoid has 30 to 60 actuated joints, makes and breaks several contacts with the environment per second, and must satisfy strict physical limits while pursuing multiple competing goals. A naive controller that commands only one task at a time, for example a Cartesian end-effector trajectory, would either ignore balance or produce joint commands that violate friction cones, torque limits, or self-collision constraints. Whole-body control was developed to solve this systems problem in one consistent computation.
WBC matters because:
WBC sits between higher-level planners (model predictive control, footstep planners, behaviour models, vision-language-action policies) and the low-level joint controllers. The planner picks contact schedules, footholds, and task references; the WBC turns those into instantaneous joint torques that respect physics.
Khatib's 1987 paper introduced the operational space formulation, a way to derive the dynamics of a manipulator projected into the task space (Cartesian coordinates of the end effector) instead of the joint space. Writing the equations of motion at the task level allows a designer to specify behaviour directly in the space the user cares about, while a posture controller acts in the null space of the task without disturbing it. This decoupling is the conceptual ancestor of every later WBC method.
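The decoupling between task and posture can be checked numerically. The sketch below is illustrative only: the inertia matrix `M`, the task Jacobian `J`, and the helper name `operational_space_terms` are assumptions made up for the example, and gravity, Coriolis terms, and the J̇ q̇ term are ignored. It computes the task-space inertia Λ = (J M⁻¹ Jᵀ)⁻¹ and the dynamically consistent null-space projector, then confirms that a posture torque sent through the projector produces no task acceleration.

```python
import numpy as np

def operational_space_terms(M, J):
    """Return (Lambda, J_bar, N) for joint-space inertia M and task Jacobian J."""
    M_inv = np.linalg.inv(M)
    Lambda = np.linalg.inv(J @ M_inv @ J.T)   # task-space inertia
    J_bar = M_inv @ J.T @ Lambda              # dynamically consistent inverse of J
    N = np.eye(M.shape[0]) - J_bar @ J        # null-space projector
    return Lambda, J_bar, N

# Made-up 3-joint example: symmetric positive-definite M, 2-D task.
M = np.array([[2.0, 0.3, 0.1],
              [0.3, 1.5, 0.2],
              [0.1, 0.2, 1.0]])
J = np.array([[1.0, 0.5, 0.2],
              [0.0, 1.0, 0.4]])

Lambda, J_bar, N = operational_space_terms(M, J)

F_task = np.array([1.0, -2.0])           # desired task-space force
tau_posture = np.array([0.5, -0.3, 0.8]) # arbitrary posture torque

# Task torque plus posture torque filtered through the null space.
tau = J.T @ F_task + N.T @ tau_posture

# Task acceleration produced by tau (velocity-dependent terms ignored).
x_ddot = J @ np.linalg.inv(M) @ tau
posture_leak = J @ np.linalg.inv(M) @ (N.T @ tau_posture)

print("Lambda @ x_ddot:", Lambda @ x_ddot)   # matches F_task
print("posture leak   :", posture_leak)      # numerically zero
```

The projector here is weighted by the inertia matrix, which is what makes the posture torque invisible at the task level in the dynamic sense rather than only kinematically.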
Real robots cannot satisfy every objective at once, so tasks must be ranked. A typical humanoid hierarchy puts dynamic feasibility (equations of motion, contact constraints) at the top, then balance (centre of mass and angular momentum tracking), then end-effector tracking, then a low-priority posture cost that regularises the joint configuration. Lower-priority tasks are projected into the null space of higher-priority ones, so they only use the freedom that remains after the higher tasks are satisfied. Sentis and Khatib's 2005 paper formalised this hierarchy for humanoids and showed how to compose behavioural primitives such as centre-of-mass control, hand control, and posture control into coherent whole-body behaviours.
For a humanoid the configuration vector q is the joint vector augmented with a free-floating base, usually represented as a position in R^3 and an orientation in SO(3). The base has no actuator, so the transposed actuation selection matrix S^T that appears in the equations of motion has a six-row block of zeros corresponding to the base. This underactuation is what makes the contact forces λ load-bearing in a literal sense: apart from gravity, only the contact wrenches can accelerate the centre of mass, so the controller must allocate them carefully.
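Feet, hands, and any other contact point are subject to constraints: the contact force must lie inside its friction cone, the normal force must be non-negative, the centre of pressure of a surface contact must stay inside the support polygon, and a contact point that does not move must have zero acceleration.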
The dynamics of a floating-base robot in contact are
M(q) v̇ + h(q, v) = S^T τ + J_c(q)^T λ
where M(q) is the joint-space mass matrix, h(q, v) collects Coriolis, centrifugal, and gravity terms, S is the actuation selection matrix that picks out the actuated joints, τ is the vector of joint torques, J_c(q) is the contact Jacobian stacking the Jacobians of every contact point, and λ is the vector of contact forces (or wrenches for surface contacts). The contact forces must lie in their friction cones, and on contact points that do not move, J_c v̇ + J̇_c v = 0.
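A minimal worked instance of this equation, under loudly stated assumptions: the model below is a hypothetical one-dimensional "hopper" invented for illustration, with an unactuated base of mass `m_base` at height z, a foot of mass `m_foot` at height z − d, and one actuated prismatic joint d. It is not taken from any of the robots named in this article. Because the toy has exactly one torque and one contact force, [Sᵀ J_cᵀ] is square and τ and λ follow directly from a desired acceleration; on a real humanoid the system is not square and is resolved inside an optimisation.

```python
import numpy as np

G = 9.81  # gravitational acceleration, m/s^2

def hopper_model(m_base, m_foot):
    """Toy floating-base model with v = [z_dot, d_dot] (hypothetical example)."""
    M = np.array([[m_base + m_foot, -m_foot],
                  [-m_foot,          m_foot]])
    h = np.array([(m_base + m_foot) * G, -m_foot * G])  # gravity only
    S = np.array([[0.0, 1.0]])     # selects the actuated joint d; base column is zero
    J_c = np.array([[1.0, -1.0]])  # foot velocity = z_dot - d_dot
    return M, h, S, J_c

def torque_and_contact_force(M, h, S, J_c, v_dot):
    """Solve M v_dot + h = S^T tau + J_c^T lam for (tau, lam)."""
    B = np.hstack([S.T, J_c.T])    # square only because this toy is tiny
    sol = np.linalg.solve(B, M @ v_dot + h)
    n_tau = S.shape[0]
    return sol[:n_tau], sol[n_tau:]

M, h, S, J_c = hopper_model(m_base=10.0, m_foot=1.0)

# Standing still: the ground carries the total weight, the actuator the base weight.
tau, lam = torque_and_contact_force(M, h, S, J_c, v_dot=np.zeros(2))
print("tau:", tau, "lambda:", lam)
```

The first row of Sᵀ is zero, so the base equation contains no torque at all: the only way to satisfy it is through λ, which is the underactuation argument in numerical form.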
Most modern WBCs solve, at every control tick, a quadratic program of the form
minimise ||A x − b||²
subject to M v̇ + h = S^T τ + J_c^T λ
J_c v̇ + J̇_c v = 0
λ ∈ friction cone
τ_min ≤ τ ≤ τ_max
v̇_min ≤ v̇ ≤ v̇_max
where the decision variable x = [v̇; τ; λ] stacks joint accelerations, joint torques, and contact forces. The cost ||A x − b||² aggregates the desired tasks, for example tracking a centre-of-mass trajectory, tracking an end-effector trajectory, regularising the posture, and minimising joint torques. Different rows of A correspond to different tasks, often weighted by hand-tuned factors.
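The structure of this problem can be sketched on a toy model. The example below makes several assumptions that are not from the source: it uses a hypothetical one-dimensional hopper (unactuated base, one prismatic joint, one foot contact), it has a single task (track a desired base acceleration), and it enforces only the equality constraints by solving the KKT system directly. The inequality constraints (friction cone, torque and acceleration bounds) are merely checked afterwards; a real controller would pass them to a QP solver such as the ones listed in the library table. The names `solve_equality_qp` and `wbc_step` are illustrative.

```python
import numpy as np

G = 9.81
m_base, m_foot = 10.0, 1.0

# Toy hopper model, v = [z_dot, d_dot] (hypothetical example).
M = np.array([[m_base + m_foot, -m_foot], [-m_foot, m_foot]])
h = np.array([(m_base + m_foot) * G, -m_foot * G])
S = np.array([[0.0, 1.0]])
J_c = np.array([[1.0, -1.0]])   # constant, so its time derivative is zero

def solve_equality_qp(A, b, C, d):
    """Minimise ||A x - b||^2 subject to C x = d via the KKT linear system."""
    n, m = A.shape[1], C.shape[0]
    kkt = np.block([[A.T @ A, C.T], [C, np.zeros((m, m))]])
    rhs = np.concatenate([A.T @ b, d])
    return np.linalg.solve(kkt, rhs)[:n]

def wbc_step(z_ddot_des):
    """One control tick with x = [v_dot (2), tau (1), lam (1)]."""
    C = np.vstack([
        np.hstack([M, -S.T, -J_c.T]),         # M v_dot - S^T tau - J_c^T lam = -h
        np.hstack([J_c, np.zeros((1, 2))]),   # J_c v_dot = 0 (stationary foot)
    ])
    d = np.concatenate([-h, [0.0]])
    A = np.array([[1.0, 0.0, 0.0, 0.0]])      # task row: base acceleration
    b = np.array([z_ddot_des])
    x = solve_equality_qp(A, b, C, d)
    return x[:2], x[2], x[3]

v_dot, tau, lam = wbc_step(z_ddot_des=1.0)
tau_max = 200.0                               # assumed actuator limit
print("v_dot:", v_dot, "tau:", tau, "lambda:", lam)
print("inequalities satisfied:", lam >= 0.0 and abs(tau) <= tau_max)
```

With more tasks, each one would add rows to `A` and `b`, scaled by the square root of its weight, which is how the weighted cost in the text is assembled in practice.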
The QP is sparse, convex, and well posed for typical humanoids. Specialised active-set solvers exploit the structure to run at 1 kHz on a 34 to 60 DoF robot, as demonstrated by IHMC's controller for Atlas in the DARPA Robotics Challenge.
Weighting tasks against each other is convenient but fragile because a tiny weight on a soft constraint can still influence a critical hard task. Hierarchical quadratic programming (HQP) replaces the single weighted sum by a cascade of QPs: solve the highest-priority QP, fix its slack, solve the next one in the null space of the first, and so on. The HQP solvers of Saab et al. (2013) and Escande, Mansard and Wieber (2014) introduced fast active-set methods that handle equality and inequality constraints at any priority level and run in real time on humanoids such as HRP-2.
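The difference between a weighted sum and a strict cascade shows up even in two dimensions. The sketch below is a simplification, not the solvers of Saab et al. or Escande et al.: it handles equality tasks only (no inequalities, no slack variables), uses an SVD to get a null-space basis, and the function names are made up for the example. Task 1 (x₀ + x₁ = 1) has priority over a conflicting task 2 (x = [2, 0]).

```python
import numpy as np

def null_space_basis(A, tol=1e-10):
    """Orthonormal basis of the null space of A, from the SVD."""
    _, s, Vt = np.linalg.svd(A)
    rank = int(np.sum(s > tol))
    return Vt[rank:].T

def cascade_two_levels(A1, b1, A2, b2):
    """Solve task 1 first, then task 2 only inside the null space of task 1."""
    x1 = np.linalg.lstsq(A1, b1, rcond=None)[0]
    Z1 = null_space_basis(A1)
    z = np.linalg.lstsq(A2 @ Z1, b2 - A2 @ x1, rcond=None)[0]
    return x1 + Z1 @ z

def weighted_sum(A1, b1, A2, b2, w1, w2):
    """Single least-squares problem with both tasks stacked and weighted."""
    A = np.vstack([np.sqrt(w1) * A1, np.sqrt(w2) * A2])
    b = np.concatenate([np.sqrt(w1) * b1, np.sqrt(w2) * b2])
    return np.linalg.lstsq(A, b, rcond=None)[0]

A1, b1 = np.array([[1.0, 1.0]]), np.array([1.0])   # high priority
A2, b2 = np.eye(2), np.array([2.0, 0.0])           # low priority, conflicting

x_hier = cascade_two_levels(A1, b1, A2, b2)
x_weighted = weighted_sum(A1, b1, A2, b2, w1=1.0, w2=0.1)

print("cascade :", x_hier, "task-1 residual:", A1 @ x_hier - b1)
print("weighted:", x_weighted, "task-1 residual:", A1 @ x_weighted - b1)
```

The cascade satisfies the priority task exactly and spends only the remaining freedom on the second task, while the weighted solution lets the low-priority task pull the high-priority one off target by an amount that depends on the weights.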
David Orin and Ambarish Goswami's 2008 paper on the centroidal momentum matrix gave WBC researchers a clean way to write linear and angular momentum at the centre of mass as a linear function of joint velocities. Most modern WBCs include centroidal momentum tracking as a top-priority task, because regulating angular momentum around the centre of mass is what keeps a humanoid upright when it is pushed or when it swings its arms.
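The linearity of momentum in the velocities can be illustrated with a deliberately crude stand-in. The sketch below assumes a set of point masses in place of rigid links, takes the stacked point velocities as the "generalised velocities", and orders the result as linear momentum followed by angular momentum; these are all simplifications chosen for the example, not the construction in Orin and Goswami's paper. A real pipeline would obtain the matrix from a rigid-body dynamics library.

```python
import numpy as np

def skew(a):
    """Matrix such that skew(a) @ b == np.cross(a, b)."""
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

def centroidal_momentum_matrix(masses, positions):
    """6 x 3N matrix mapping stacked point velocities to [linear; angular] momentum."""
    com = (masses[:, None] * positions).sum(axis=0) / masses.sum()
    linear = np.hstack([m * np.eye(3) for m in masses])
    angular = np.hstack([m * skew(r - com) for m, r in zip(masses, positions)])
    return np.vstack([linear, angular]), com

masses = np.array([1.0, 2.0, 3.0])
positions = np.array([[0.0, 0.0, 1.0],
                      [0.2, 0.0, 0.5],
                      [-0.1, 0.3, 0.0]])
velocities = np.array([[0.1, 0.0, 0.0],
                       [0.0, -0.2, 0.1],
                       [0.3, 0.0, -0.1]])

A_G, com = centroidal_momentum_matrix(masses, positions)
h_G = A_G @ velocities.reshape(-1)     # momentum as a linear map of the velocities

# Direct computation for comparison.
lin_direct = (masses[:, None] * velocities).sum(axis=0)
ang_direct = sum(m * np.cross(r - com, v)
                 for m, r, v in zip(masses, positions, velocities))
print("h_G:", h_G)
```

Because the map is linear at a given configuration, a momentum-tracking objective becomes a set of linear rows in a least-squares cost, which is what lets it sit alongside the other tasks in the same optimisation.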
| Formulation | Year | Key idea | Representative author |
|---|---|---|---|
| Operational Space Formulation (OSF) | 1987 | Project dynamics into Cartesian task space; control end effector with decoupled task and posture loops | Khatib |
| Hierarchical Operational Space Control | 2005 | Compose behavioural primitives with null-space projections to form whole-body behaviours | Sentis and Khatib |
| Cartesian Impedance Control | 2007 | Passivity-based shaping of stiffness and damping at the end effector for torque-controlled flexible-joint robots | Albu-Schäffer, Ott, Hirzinger |
| Centroidal Momentum Control | 2008 onward | Use the centroidal momentum matrix to regulate linear and angular momentum at the CoM | Orin, Goswami |
| QP-based whole-body control | 2010 | Cast control as a single quadratic program over accelerations, torques, and contact forces | Stephens and Atkeson |
| Hierarchical QP (HQP) | 2013 to 2014 | Cascade of QPs respecting task priority with both equality and inequality constraints | Saab, Escande, Mansard, Wieber |
| Inverse Dynamics with Constraints (IDC) | 2010s | Solve constrained inverse dynamics directly for torques given desired accelerations | Righetti, Schaal, others |
| Passivity-based whole-body balancing | 2016 | Passive, compliant CoM and end-effector regulation in multi-contact, applied on DLR TORO | Henze, Roa, Ott |
| Convex whole-body control | 2018 onward | Reformulations and relaxations that yield convex problems amenable to fast off-the-shelf solvers | Carpentier, Mastalli and others |
| Differential Dynamic Programming (DDP) variants | 2010s onward | Trajectory-optimisation style WBC that exploits problem structure (FDDP, multiple shooting) | Tassa, Mastalli (Crocoddyl) |
| Library | Origin | Role |
|---|---|---|
| Stack of Tasks | LAAS-CNRS | C++ framework for hierarchical task-space control with a scripting front end |
| Pinocchio | INRIA Willow, LAAS Gepetto | Rigid-body dynamics and analytical derivatives, the de facto computational core of modern WBC pipelines |
| TSID | Andrea Del Prete and collaborators | Task-space inverse dynamics built on Pinocchio, hierarchical least squares solver, examples for manipulators, humanoids and quadrupeds |
| Crocoddyl | LAAS-CNRS, INRIA, Heriot-Watt | Multi-contact optimal control library based on differential dynamic programming, used for highly dynamic legged behaviours |
| Drake | MIT and Toyota Research Institute | Model-based design and verification toolbox with QP-WBC, MPC, and high-fidelity contact simulation |
| OCS2 | ETH Zurich Robotic Systems Lab | Real-time MPC for switched systems, used on ANYmal C with and without an arm |
| HOQP | LAAS-CNRS and others | Hierarchical QP solver implementations used inside SoT and TSID-style stacks |
| WB-MPC | Patrick Wensing and collaborators | Whole-body MPC formulations for legged robots |
| Eiquadprog and qpOASES | various | General-purpose QP solvers used as the inner kernel of many WBCs |
| PyBullet, MuJoCo, Isaac Sim | various | Simulators used to test WBCs and to train RL policies that interact with WBC layers |
| NVIDIA Isaac Lab | NVIDIA | GPU-accelerated RL training environment, often used for learning policies that sit above or alongside a WBC |
| Task | What it specifies |
|---|---|
| Centre-of-mass tracking | Position, velocity, and acceleration of the CoM, the primary balance objective |
| Linear momentum | Total mass times CoM velocity, regulated for push recovery and walking |
| Angular momentum about the CoM | Whole-body angular momentum, key to keeping torso upright when the arms swing |
| End-effector position and orientation | 6 DoF pose for hands or any other body that interacts with the environment |
| Foot trajectory tracking | 6 DoF pose for swing feet during stepping |
| Joint position limits | Hard inequality constraints to avoid mechanical end stops |
| Joint velocity limits | Hard inequality constraints to respect actuator and gearbox bounds |
| Joint torque limits | Hard inequality constraints to respect actuator saturation |
| Friction cone constraints | Tangential force at each contact bounded by μ times normal force |
| Contact wrench feasibility | Centre of pressure inside the support polygon, normal force non-negative |
| Self-collision avoidance | Distance constraints between body parts, often as inequalities or barrier costs |
| Posture cost | Low-priority regularisation toward a nominal joint configuration |
| Torque minimisation | Quadratic cost on τ to choose among the infinitely many feasible solutions |
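The two contact rows of the table (friction cone constraints and contact wrench feasibility) can be written as a simple check. The sketch below assumes a rectangular foot with the wrench expressed at the centre of the sole, z pointing along the contact normal; the function name, the foot dimensions, and the decision to ignore the yaw torque are assumptions for illustration. In a controller these conditions appear as inequality constraints on λ rather than as an after-the-fact test.

```python
import math

def contact_wrench_feasible(force, torque, mu, half_length, half_width):
    """Check unilateral contact, friction cone, and centre of pressure for one foot.

    force and torque are (x, y, z) tuples at the sole centre; the yaw torque
    (torque[2]) is ignored in this simplified check.
    """
    fx, fy, fz = force
    tx, ty, _ = torque
    if fz <= 0.0:
        # No load on the contact: centre of pressure is undefined, treat as infeasible.
        return False
    if math.hypot(fx, fy) > mu * fz:
        # Tangential force exceeds mu times the normal force: the foot would slip.
        return False
    # Centre of pressure from torque = p x force with p on the sole plane.
    cop_x = -ty / fz
    cop_y = tx / fz
    return abs(cop_x) <= half_length and abs(cop_y) <= half_width

foot = dict(mu=0.5, half_length=0.10, half_width=0.05)   # assumed values

print(contact_wrench_feasible((0.0, 0.0, 100.0), (0.0, 0.0, 0.0), **foot))    # standing
print(contact_wrench_feasible((60.0, 0.0, 100.0), (0.0, 0.0, 0.0), **foot))   # slipping
print(contact_wrench_feasible((0.0, 0.0, 100.0), (0.0, -15.0, 0.0), **foot))  # tipping
```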
The rise of large-scale imitation learning and reinforcement learning has pushed WBC from being the only credible approach to humanoid control to being one component in a hybrid stack.
| Property | Model-based WBC | Reinforcement learning |
|---|---|---|
| Model required | Accurate rigid-body dynamics, contact models, friction estimates | Simulator (sometimes inaccurate) plus reward function |
| Data required | Almost none beyond identification | Often hundreds of millions of simulated steps |
| Real-time guarantees | Strong (sparse QP at known frequency) | Weak; depends on network size and inference budget |
| Handling of unmodelled effects | Poor unless explicitly modelled | Strong if seen during training |
| Constraint satisfaction | Hard constraints enforced by the QP | Soft, expressed as reward shaping |
| Composability | Add or remove tasks on the fly | Retraining usually required |
| Maturity | 30 plus years of theory, on hardware since the 2000s | Rapid progress since around 2018 |
| Typical failure mode | Falls when the model is wrong | Falls when the deployment differs from training |
In practice modern humanoids combine the two. Common patterns:
WBC has powered some of the most visible humanoid demonstrations of the last decade. The Atlas parkour, dance, and gymnastics videos from Boston Dynamics rely on a model-based whole-body controller underneath an offline trajectory optimiser. The DARPA Robotics Challenge Finals in 2015, where humanoid robots had to drive vehicles, open doors, and turn valves under semi-autonomous control, was a coming-out party for QP-based WBC; both MIT and IHMC entered Atlas robots running QP whole-body controllers, with IHMC placing second using a momentum-based WBC. ANYmal has performed loco-manipulation tasks such as opening doors and carrying objects with an OCS2 MPC stack on top of a tracking controller. LAAS-CNRS, IIT, and other groups have demonstrated multi-contact climbing where the robot uses arms and legs simultaneously, a regime where contact scheduling and friction-cone reasoning are essential and where WBC is the natural framework. Cassie's 100 m sprint record for a bipedal robot, set on May 11, 2022, demonstrated that fully learned policies can reach the dynamic regimes that WBC theory was designed for, while also showing that the two paradigms are converging rather than competing.
A modern legged or humanoid stack typically has at least three layers running at different frequencies:
This division of labour is now standard. The MPC chooses what the robot will do over the next half second; the WBC makes sure the robot does not violate physics in the next millisecond. Where a learned policy is added it usually replaces or augments the planning or MPC layer, while the WBC layer remains.
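The timing relationship between the layers can be sketched as a single loop in which the slow layer runs once every N ticks of the fast one. Everything in the example is a placeholder: the 50 Hz planning rate is an assumed figure (the article only gives the 500 Hz to 1 kHz range for the WBC), and `plan_reference` and `track_reference` are hypothetical stand-ins for an MPC solve and a whole-body QP solve.

```python
def plan_reference(time_s):
    """Placeholder for the slow layer (e.g. an MPC solve). Hypothetical."""
    return {"planned_at": time_s, "com_height": 0.9}

def track_reference(reference):
    """Placeholder for the fast layer (e.g. a whole-body QP solve). Hypothetical."""
    return 0.0  # would be a vector of joint torques

def run_control_stack(duration_s, wbc_rate_hz=1000, mpc_rate_hz=50):
    """Run the fast loop and refresh the plan every wbc_rate_hz / mpc_rate_hz ticks.

    Assumes wbc_rate_hz is an integer multiple of mpc_rate_hz.
    """
    ticks_per_plan = wbc_rate_hz // mpc_rate_hz
    n_ticks = int(round(duration_s * wbc_rate_hz))
    counts = {"mpc": 0, "wbc": 0}
    reference = None
    for tick in range(n_ticks):
        if tick % ticks_per_plan == 0:
            reference = plan_reference(tick / wbc_rate_hz)
            counts["mpc"] += 1
        track_reference(reference)   # always uses the most recent plan
        counts["wbc"] += 1
    return counts

print(run_control_stack(1.0))
```

On real hardware the layers usually run in separate threads or processes so that a slow plan cannot stall the fast loop; the single loop here only illustrates the rate ratio and the fact that the WBC reuses the latest plan between updates.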
WBC is powerful but has well-known weaknesses. Solving a QP with hundreds of variables at a kilohertz is computationally demanding, especially on embedded compute, and the solver must be both fast and reliable; warm starts, sparsity exploitation, and active-set methods are essential. The controller depends on an accurate dynamic model: mass, inertia, joint friction, motor torque constants, and contact friction all enter the equations of motion, and small modelling errors compound. The linearised friction cones used to keep the problem a QP involve a trade-off of their own: an inscribed polyhedral approximation is either overly cautious (too few edges) or expensive to solve (too many edges). Hand-designed task hierarchies are brittle, since adding a new behaviour often requires retuning weights and priorities. Finally, WBC handles small disturbances well but struggles with large unmodelled events such as a missed footfall on rough terrain or a contact that breaks at the wrong moment, which is one reason RL-based recovery policies are increasingly common.
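The cost of a coarse friction-cone approximation can be quantified. The sketch below assumes the common choice of a regular pyramid inscribed in the circular cone, with its corners on the cone; the function names and the numbers are illustrative. For k edges, the worst-case usable friction shrinks from μ to μ·cos(π/k), and each extra edge adds one more inequality row per contact.

```python
import math

def inscribed_pyramid_margin(num_edges):
    """Worst-case fraction of mu that an inscribed k-edge pyramid still allows."""
    return math.cos(math.pi / num_edges)

def in_pyramid(fx, fy, fz, mu, num_edges):
    """Test a contact force against the k linear faces of the inscribed pyramid."""
    if fz < 0.0:
        return False
    inradius = mu * fz * math.cos(math.pi / num_edges)
    for i in range(num_edges):
        # Face normals point midway between adjacent corners of the pyramid.
        angle = 2.0 * math.pi * (i + 0.5) / num_edges
        if fx * math.cos(angle) + fy * math.sin(angle) > inradius + 1e-12:
            return False
    return True

mu, fz = 0.6, 100.0
# Tangential force at 45 degrees using 90 % of the true friction limit.
ft = 0.9 * mu * fz
fx = fy = ft / math.sqrt(2.0)

for k in (4, 8, 16):
    print(k, "edges: margin", round(inscribed_pyramid_margin(k), 3),
          "accepts force:", in_pyramid(fx, fy, fz, mu, k))
```

The same force is physically admissible in every case; only the four-edge pyramid rejects it, which is the over-caution the text refers to.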
Research around 2024 to 2026 has focused on integrating WBC more tightly with learning. Differentiable whole-body controllers expose gradients of the optimal solution with respect to costs and constraints, which enables end-to-end learning of task weights and reference trajectories. Convex relaxations of the contact problem, including second-order-cone formulations of friction and convex MPC variants, have made larger problems tractable in real time. Learned contact models replace the rigid friction cone with a network that captures slip, deformation, and surface variation. On the deployment side, commercial humanoid programs at Boston Dynamics, Figure, 1X, Apptronik and Tesla are converging on architectures that combine a vision-language reasoner, a learned whole-body policy, and a kilohertz model-based WBC layer, which has restored interest in formal stability analysis, safety filters, and constraint-respecting RL on top of WBC.