Model Predictive Control
Last reviewed
Apr 28, 2026
Sources
22 citations
Review status
Source-backed
Revision
v1 · 4,496 words
Model Predictive Control (MPC), also called receding horizon control (RHC), is a family of advanced control methods that use an explicit dynamic model of a system to predict future behavior over a finite horizon, optimize a cost function subject to constraints on inputs, states, and outputs, and apply only the first control input before re-planning at the next time step. Re-solving the problem with updated measurements at every step provides feedback, while the constrained optimization lets MPC handle multi-input multi-output (MIMO) plants, hard constraints, time delays, and disturbance feedforward in a unified framework. Originally developed in the late 1970s for slow chemical processes, MPC now runs at tens of hertz in passenger cars and hundreds of hertz on quadrotors and quadrupeds, and it sits at the core of the locomotion stacks of humanoid platforms such as Atlas and Tesla Optimus.
MPC sits at the intersection of optimal control, mathematical programming, system identification, and, increasingly, machine learning. It is one of the few advanced control techniques to have crossed from academic theory into mainstream industrial practice, with thousands of installations in oil refineries, petrochemical plants, building HVAC systems, power converters, and automotive driver assistance products. Recent interest in differentiable, learning based, and reinforcement learning augmented variants has positioned MPC as a bridge between classical model based control and end to end neural policies in modern robotics and autonomous driving.
At every sampling instant, an MPC controller carries out four steps. First, it measures or estimates the current state of the plant. Second, it uses an internal dynamic model to predict, as a function of a candidate sequence of future inputs, how the state and outputs will evolve over a finite prediction horizon. Third, it solves a numerical optimization problem that selects the input sequence minimizing a cost function while respecting all constraints. Fourth, it applies only the first input of that sequence, discards the rest, advances one step, and repeats. The horizon slides forward in time, which is why the technique is called receding horizon control.
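The four steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production controller: the scalar plant, the weights, the horizon, and the finite input set are made-up values, the optimizer is plain enumeration over that input set, and the plant is assumed to match the model exactly. The names `predict`, `sequence_cost`, `mpc_step`, and `simulate` are hypothetical.

```python
from itertools import product

# Illustrative scalar plant x+ = A*x + B*u (all numbers are assumptions).
A, B = 1.2, 1.0                        # open-loop unstable
U_SET = (-1.0, -0.5, 0.0, 0.5, 1.0)    # admissible inputs (actuator limits)
N = 4                                  # prediction horizon
Q, R = 1.0, 0.1                        # state and input weights


def predict(x, u):
    """Internal model used for prediction."""
    return A * x + B * u


def sequence_cost(x0, inputs):
    """Stage costs along the predicted trajectory plus a terminal cost."""
    x, cost = x0, 0.0
    for u in inputs:
        cost += Q * x * x + R * u * u
        x = predict(x, u)
    return cost + Q * x * x


def mpc_step(x_measured):
    """Predict and optimize by enumeration, then return only the first input."""
    best = min(product(U_SET, repeat=N),
               key=lambda seq: sequence_cost(x_measured, seq))
    return best[0]


def simulate(x0, steps=15):
    """Closed loop: measure, optimize, apply the first input, repeat."""
    x, history = x0, [x0]
    for _ in range(steps):
        u = mpc_step(x)        # the 'measurement' is the true state here
        x = predict(x, u)      # plant assumed identical to the model
        history.append(x)
    return history
```

Calling `simulate(2.0)` drives the state from 2.0 toward the origin while every applied input stays inside `U_SET`. Enumeration only scales to tiny problems; practical controllers replace it with a QP or NLP solver.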
Two horizons should be kept distinct. The prediction horizon, usually denoted N, sets how far the controller looks into the future. The control horizon Nc (no larger than N) sets how many distinct input moves the optimizer is allowed to choose; beyond Nc, inputs are typically held constant or set by a known terminal feedback, which keeps the number of decision variables small.
The cost function usually combines a stage cost over the prediction window with a terminal cost on the final predicted state. A common quadratic form penalizes deviations of predicted outputs from a reference trajectory and the magnitude of input moves, with tunable weighting matrices Q on the tracking error, R on the inputs, and P on the terminal state. Constraints encode physical actuator limits (saturation, slew rate), safety envelopes on states (lane boundaries, joint limits, temperature ranges), and operational requirements (collision avoidance, balance margins). Because constraints are part of the problem statement, MPC handles them systematically, whereas classical PID and LQR designs must rely on add-ons such as saturation and anti windup logic.
For a discrete time system with state x and input u, a generic MPC problem solved at time k can be written as the minimization, over the input sequence u_0, ..., u_{N-1}, of a sum of stage costs ell(x_i, u_i) plus a terminal cost V_f(x_N). The minimization is subject to the predicted dynamics x_{i+1} = f(x_i, u_i) starting from x_0 = x(k), input constraints u_i in U, state constraints x_i in X, and a terminal constraint x_N in X_f. After the solver returns the optimal sequence, only u_0 is applied to the plant and the procedure repeats at time k+1.
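Written out in LaTeX, the problem solved at time k reads as follows. The index ranges on the constraints are one common convention and are an assumption here, as is the quadratic choice of stage and terminal cost in the last line, which is shown only as a typical example.

```latex
\begin{aligned}
\min_{u_0,\dots,u_{N-1}} \quad & \sum_{i=0}^{N-1} \ell(x_i, u_i) + V_f(x_N) \\
\text{subject to} \quad & x_{i+1} = f(x_i, u_i), && i = 0,\dots,N-1, \\
& x_0 = x(k), \\
& u_i \in U, && i = 0,\dots,N-1, \\
& x_i \in X, && i = 0,\dots,N-1, \\
& x_N \in X_f, \\[4pt]
\text{e.g.} \quad & \ell(x, u) = x^\top Q x + u^\top R u, \qquad V_f(x) = x^\top P x .
\end{aligned}
```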
If the dynamics are linear, the cost quadratic, and the constraints polyhedral, the problem reduces to a quadratic program (QP) solved efficiently by active set, interior point, or first order methods. Nonlinear dynamics or non quadratic cost yield a nonlinear program (NLP) tackled by SQP or interior point algorithms. Hybrid systems with discrete decisions lead to mixed integer programs, more demanding and often handled with relaxations or branch and bound.
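As a rough illustration of the linear case, the sketch below eliminates the states of a scalar system to obtain a QP in the input sequence alone, then solves it with projected gradient descent, one of the simplest first order methods. The function names and the restriction to a scalar state with box constraints on the input are assumptions made to keep the example dependency free; real deployments would hand `H` and `g` to a dedicated QP solver.

```python
def condensed_qp(a, b, q, r, p, N, x0):
    """Return (H, g) with cost(U) = 0.5*U'HU + g'U + const for x+ = a*x + b*u."""
    # gamma[i][j] maps u_j to x_{i+1}; phi[i] maps x0 to x_{i+1}.
    gamma = [[(a ** (i - j)) * b if j <= i else 0.0 for j in range(N)]
             for i in range(N)]
    phi = [a ** (i + 1) for i in range(N)]
    w = [q] * (N - 1) + [p]          # weights on x_1 .. x_N (terminal weight last)
    H = [[2.0 * (sum(w[k] * gamma[k][i] * gamma[k][j] for k in range(N))
                 + (r if i == j else 0.0))
          for j in range(N)] for i in range(N)]
    g = [2.0 * sum(w[k] * gamma[k][i] * phi[k] * x0 for k in range(N))
         for i in range(N)]
    return H, g


def solve_box_qp(H, g, u_min, u_max, iters=2000):
    """Projected gradient descent for min 0.5*U'HU + g'U with u_min <= U <= u_max."""
    n = len(g)
    # The largest absolute row sum bounds the largest eigenvalue of symmetric H.
    step = 1.0 / max(sum(abs(v) for v in row) for row in H)
    U = [0.0] * n
    for _ in range(iters):
        grad = [sum(H[i][j] * U[j] for j in range(n)) + g[i] for i in range(n)]
        U = [min(u_max, max(u_min, U[i] - step * grad[i])) for i in range(n)]
    return U
```

For example, `solve_box_qp(*condensed_qp(1.0, 1.0, 1.0, 1.0, 1.0, 3, 5.0), -1.0, 1.0)` saturates every input at the lower bound, because the initial state is too far from the origin to be regulated within the input limits over three steps.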
Closed loop stability is not automatic. The standard recipe, often called the Mayne ingredients after the foundational 2000 survey, uses three pieces: a terminal cost that acts as a control Lyapunov function on a terminal region, a terminal set that is positively invariant under a local stabilizing feedback, and a positive definite stage cost. With these ingredients, the optimal value function serves as a Lyapunov function for the closed loop, and recursive feasibility plus asymptotic stability can be proved rigorously for every initial state from which the problem is feasible; a longer prediction horizon enlarges that set of states.
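For a scalar linear system the terminal ingredients can be computed by hand, which makes the recipe concrete. The sketch below is an illustration under assumptions (scalar dynamics x+ = a*x + b*u, quadratic stage cost q*x^2 + r*u^2, symmetric input bound), and the function names are hypothetical. It takes the terminal cost V_f(x) = P*x^2 from the discrete Riccati equation, uses the LQR gain as the local feedback, and picks the terminal set as the interval on which that feedback respects the input bound.

```python
def terminal_ingredients(a, b, q, r, iters=1000):
    """Iterate the scalar discrete Riccati equation; return (P, K) with u = -K*x."""
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    k = a * b * p / (r + b * b * p)
    return p, k


def terminal_set_radius(k, u_max):
    """Largest |x| for which the local feedback u = -K*x satisfies |u| <= u_max."""
    return u_max / abs(k)


def lyapunov_decrease(p, k, a, b, x):
    """V_f(x+) - V_f(x) under the local feedback, with V_f(x) = P*x^2."""
    x_next = (a - b * k) * x
    return p * x_next * x_next - p * x * x
```

With `a, b, q, r = 1.2, 1.0, 1.0, 0.1`, the decrease returned by `lyapunov_decrease` equals the negative stage cost `-(q + r*k*k)*x*x`, which is exactly the control Lyapunov property the stability proof relies on. Because `|a - b*k| < 1`, the interval of radius `terminal_set_radius(k, u_max)` is also positively invariant.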
The seeds of MPC were planted in the 1960s with the development of optimal control theory and dynamic programming, but the recognizable industrial form appeared independently in two places at the end of the 1970s. In France, Jacques Richalet, A. Rault, J. L. Testud, and J. Papon published "Model predictive heuristic control: Applications to industrial processes" in Automatica in 1978. They packaged their method in software called IDCOM (IDentification and COMmand), which used an impulse response model and a quadratic objective and was successfully deployed in distillation and superheater control.
At roughly the same time, Charles R. Cutler and B. L. Ramaker at Shell Oil in Houston presented their dynamic matrix control (DMC) algorithm at the 1980 Joint Automatic Control Conference in San Francisco. DMC used a step response model and a quadratic least squares objective over a long prediction horizon. Shell rolled DMC out internally before Cutler founded Dynamic Matrix Control Corporation, later acquired by Aspen Technology, where DMC became a flagship product still used in refineries today.
These first generation algorithms shared the receding horizon idea but lacked formal stability guarantees and could not handle inequality constraints exactly. The second generation, exemplified by Quadratic Dynamic Matrix Control (QDMC) introduced by Cutler, Morshedi, and Haydel in 1983, posed the constrained problem as an explicit quadratic program. The third generation, embodied in products such as IDCOM-M, Honeywell's RMPCT, and Aspen's DMC-plus, added systematic constraint ranking, infeasibility handling, and steady state target optimization.
In parallel, the academic community pushed toward rigorous theory. The decisive breakthrough came with Mayne, Rawlings, Rao, and Scokaert's 2000 Automatica paper "Constrained model predictive control: stability and optimality," which laid out the terminal cost, terminal constraint, and local Lyapunov function machinery that underpins essentially all modern stability proofs. Mayne's 2014 Automatica survey "Model predictive control: Recent developments and future promise" updated the picture with robust, stochastic, distributed, and economic MPC.
MPC is best thought of as a design philosophy rather than a single algorithm. Research has produced a taxonomy of variants tailored to different system classes, uncertainty models, and computational budgets.
| Variant | Model class | Main idea | Typical use |
|---|---|---|---|
| Linear MPC (LMPC) | Linear time invariant | Quadratic cost, polyhedral constraints, solved as a QP | Process industries, vehicle dynamics, power converters |
| Nonlinear MPC (NMPC) | Nonlinear ODE / DAE | Nonlinear program solved by SQP or interior point at each step | Robotics, aerospace, batch chemical reactors |
| Hybrid MPC | Mixed logical dynamical | Mixed integer programming for systems with switches and modes | Power electronics, gear shifting, traffic control |
| Robust MPC | Linear with bounded disturbance | Worst case optimization over uncertainty set | Safety critical systems with known disturbance bounds |
| Tube MPC | Nominal + bounded error tube | Plan a nominal trajectory and enforce constraints on a tightened tube around it | Vehicles and robots with model error, ancillary feedback rejects disturbance |
| Stochastic MPC (SMPC) | Stochastic disturbance | Chance constraints or expected cost | Energy systems, finance, building climate control |
| Economic MPC (EMPC) | Any | Stage cost is an arbitrary economic objective rather than tracking error | Process operation at peak profitability, smart grids |
| Explicit MPC | Linear, small state | Multi parametric QP solved offline yields piecewise affine lookup table | Embedded systems with microsecond sample times |
| Distributed MPC | Networked subsystems | Coordinated local MPCs exchange information | Smart grids, traffic networks, multi robot teams |
| Learning based MPC | Data driven model | Gaussian process or neural network model identified from data | Adaptive control of poorly modeled systems |
| Differentiable MPC | Any | Treat the MPC solver as a differentiable layer for end to end learning | Cost shaping via reinforcement learning, imitation learning |
Linear MPC remains the workhorse of industrial deployment because the underlying QP is convex, scales to hundreds of inputs, and admits warm starting between consecutive samples. Nonlinear MPC is widely used in modern robotics, where rigid body dynamics, contact forces, and aerodynamic effects often make a single fixed linearization too inaccurate.
Robust and tube MPC address the practical challenge that the model is never perfect. Robust MPC takes a worst case view, demanding constraints hold for an entire set of possible trajectories. Pure min max formulations are usually intractable, so practitioners adopt tube MPC: a nominal MPC plans a center line trajectory while an ancillary local feedback keeps the actual state inside a precomputed tube around it. Stochastic MPC trades hard worst case guarantees for probabilistic ones via chance constraints, valuable when disturbances follow well characterized distributions, as in wind power dispatch or building energy management. Economic MPC uses the actual economic objective (profit, energy cost, yield) directly as the stage cost, unifying real time optimization with regulatory control.
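The tube idea can be illustrated on a scalar system x+ = a*x + b*u + w with a bounded disturbance |w| <= w_max. The sketch below is a simplified example under assumptions: the ancillary feedback gain `k` is given, the tube is a symmetric interval, and the nominal input sequence is supplied by the caller rather than by a nominal MPC. All function names are hypothetical.

```python
import random


def tube_radius(a, b, k, w_max):
    """Bound on |x - z| when e+ = (a - b*k)*e + w, |w| <= w_max, and e0 = 0."""
    rho = abs(a - b * k)
    if rho >= 1.0:
        raise ValueError("ancillary feedback must be stabilizing")
    return w_max / (1.0 - rho)


def tightened_bounds(x_max, u_max, k, radius):
    """State and input bounds left for the nominal planner after removing the tube."""
    return x_max - radius, u_max - abs(k) * radius


def simulate_tube(a, b, k, w_max, z0, nominal_inputs, seed=0):
    """Run the true plant next to the nominal model and record |x - z|."""
    rng = random.Random(seed)
    x, z, errors = z0, z0, []
    for v in nominal_inputs:
        u = v - k * (x - z)                               # ancillary feedback
        x = a * x + b * u + rng.uniform(-w_max, w_max)    # disturbed plant
        z = a * z + b * v                                 # nominal model
        errors.append(abs(x - z))
    return errors
```

If the nominal planner keeps `z` and `v` inside the tightened bounds, the true state and input satisfy the original bounds for every admissible disturbance sequence, since the error never leaves the interval of radius `tube_radius(...)`.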
Explicit MPC, introduced by Alberto Bemporad, Manfred Morari, and co-workers around 2000, exploits the fact that the solution of a multi parametric QP is a piecewise affine function of the state, defined over a polyhedral partition of the state space. The partition and the affine law on each region are precomputed offline and stored as a lookup table, so online operation reduces to identifying the region that contains the current state and evaluating its affine law. This allows megahertz rates on tiny embedded controllers, but the number of regions can grow exponentially with the number of constraints, so explicit MPC is restricted to small problems.
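The smallest possible instance shows the structure. For a scalar system with horizon one, cost r*u^2 + p*(a*x + b*u)^2, and input bound |u| <= u_max, the explicit solution has three regions: a linear law in the middle and saturation on either side. The sketch below builds that table offline and evaluates it online. It is a hand-derived toy under those assumptions (with a and b nonzero), not the output of a multi parametric solver, and the function names are hypothetical.

```python
import math


def build_explicit_law(a, b, r, p, u_max):
    """Offline: regions (lo, hi, gain, offset) of the law u = gain*x + offset."""
    k = p * a * b / (r + p * b * b)      # unconstrained one-step optimum u = -k*x
    edge = u_max / abs(k)                # |x| beyond which the input saturates
    sign = 1.0 if k > 0 else -1.0
    return [
        (-math.inf, -edge, 0.0, sign * u_max),   # saturated for large negative x
        (-edge, edge, -k, 0.0),                  # unconstrained region
        (edge, math.inf, 0.0, -sign * u_max),    # saturated for large positive x
    ]


def evaluate(regions, x):
    """Online: find the region containing x and apply its affine law."""
    for lo, hi, gain, offset in regions:
        if lo <= x <= hi:
            return gain * x + offset
    raise ValueError("state outside the stored partition")
```

With `regions = build_explicit_law(1.2, 1.0, 0.1, 1.0, 1.0)`, the call `evaluate(regions, 2.0)` returns the saturated input `-1.0`, while states near the origin receive the linear feedback. In higher dimensions the intervals become polyhedra and the region search becomes the dominant online cost.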
A real MPC implementation must build the prediction model, formulate the QP or NLP, run a solver under hard real time deadlines, and recover from infeasibility, sensor failures, and model drift. The underlying numerical machinery has matured into a mini industry of its own.
For linear quadratic problems, the dominant solver families are interior point methods, active set methods, and first order methods such as ADMM (used in OSQP). Active set solvers are typically fastest on small problems because consecutive QPs differ only in the measured initial state, so the active set from the previous sample is usually an excellent warm start. For nonlinear problems, the real time iteration scheme of Diehl, Bock, and Schlöder performs only one SQP step per sample, exploiting the fact that consecutive MPC problems differ only slightly so that a warm started Newton step delivers a good enough solution within the sample period. Multiple shooting partitions the horizon into intervals and treats each interval boundary state as a decision variable, producing well structured sparse NLPs.
State estimation is an inseparable companion to MPC because the controller starts every prediction from an estimate. The Kalman filter and its nonlinear extensions are the default tool, while moving horizon estimation (MHE), the dual of MPC for estimation, is increasingly used because it handles constraints on the estimated state.
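A scalar Kalman filter shows how the estimate that seeds each prediction is produced. This is a minimal sketch assuming a linear model x+ = a*x + b*u with process noise variance `q_var` and a measurement y = c*x with noise variance `r_var`; the function and parameter names are illustrative.

```python
def kalman_step(x_hat, p_var, u, y, a, b, c, q_var, r_var):
    """One predict/update cycle of a scalar Kalman filter."""
    # Predict with the same model the controller uses for its horizon.
    x_pred = a * x_hat + b * u
    p_pred = a * a * p_var + q_var
    # Correct the prediction with the new measurement y = c*x + noise.
    gain = p_pred * c / (c * c * p_pred + r_var)
    x_new = x_pred + gain * (y - c * x_pred)
    p_new = (1.0 - gain * c) * p_pred
    return x_new, p_new
```

In a control loop, `x_new` would be passed to the MPC as the initial state of the next prediction, and the input that the MPC returns would be fed back into the following `kalman_step` call as `u`.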
A generation of open source toolchains has made MPC implementation dramatically more accessible than in the 1990s. The dominant pattern pairs a high level symbolic frontend for model description with a fast embedded solver and code generation backend.
| Tool | Type | Strengths | Notes |
|---|---|---|---|
| CasADi | Symbolic AD framework | Algorithmic differentiation, model description, ties in NLP solvers | Foundation for many higher level MPC packages |
| acados | Embedded NLP solver | Fast SQP and interior point for OCP structured problems, C code generation | Successor to ACADO Toolkit, used in racing drones, vehicles |
| ACADO Toolkit | NMPC code generator | Real time iteration, exported plain C, no runtime dependencies | Pioneering embedded NMPC tool, still in use |
| do-mpc | Python NMPC framework | Multi stage robust MPC, MHE, easy prototyping | Built on top of CasADi, popular in academia |
| OpEn | Embedded NLP solver | Proximal averaged Newton type method (PANOC), pure Rust | Targets embedded, no external dependencies |
| OSQP | Operator splitting QP solver | ADMM, very fast warm started linear MPC | Widely used in vehicles, robotics, finance |
| qpOASES | Active set QP solver | Online active set strategy designed for MPC | Strong on small to medium QPs with warm starting |
| HPIPM | Interior point QP solver | Tailored to MPC structure, dense and sparse | Often used as a backend for acados |
| MPT3 Toolbox | Explicit MPC | Multi parametric QP, polyhedral computations | MATLAB based, classic explicit MPC platform |
| Drake | Robotics planning and control | Mixed integer convex MPC, contact aware planning | Used for Atlas and other humanoids |
Real time MPC on embedded hardware has matured to the point where 100 Hz to 1 kHz update rates are routine for nonlinear problems with tens of states, and explicit MPC can reach megahertz rates for small linear problems. This computational headroom is what made MPC viable as the inner loop of dynamic robots and aggressive flight controllers.
MPC has spread from its chemical engineering origins into nearly every field involving constrained dynamic optimization. The table below summarizes representative deployment domains, dominant variant, and timescales.
| Domain | Typical variant | Sample rate | Representative use |
|---|---|---|---|
| Refining and petrochemicals | Linear or QDMC | seconds to minutes | Distillation columns, fluid catalytic crackers, ethylene plants |
| Pulp and paper | Linear MPC | seconds | Basis weight, moisture, machine direction control |
| Semiconductor manufacturing | Linear MPC, run to run | per wafer | CMP, etch, lithography overlay |
| Power electronics | Finite control set MPC | tens of kHz | Inverters, motor drives, grid connected converters |
| Building climate | Stochastic / economic MPC | minutes | HVAC scheduling, demand response, comfort vs. energy |
| Power grid | Distributed economic MPC | seconds to minutes | Voltage control, unit commitment, microgrid energy management |
| Automotive driver assistance | Linear or NMPC | 20 to 100 Hz | Adaptive cruise control, lane keeping, automated lane change |
| Autonomous driving | NMPC | 10 to 50 Hz | Lateral and longitudinal trajectory tracking, collision avoidance |
| Aerospace | NMPC | 50 to 200 Hz | Quadrotor agile flight, satellite attitude, reentry guidance |
| Quadruped locomotion | Convex / NMPC | 30 to 1000 Hz | Ground reaction force planning, footstep selection, gait switching |
| Humanoid locomotion | Whole body NMPC | 50 to 500 Hz | Walking, running, manipulation while standing |
| Surgical robotics | NMPC | 100 to 1000 Hz | Beating heart compensation, soft tissue interaction |
| Finance | Linear / stochastic MPC | minutes to days | Portfolio optimization with transaction costs |
The original and still numerically dominant home of MPC is the process industries. Crude oil distillation columns, fluid catalytic cracking units, ethylene crackers, polymer reactors, and ammonia synthesis loops typically have hundreds of measurements and manipulated variables, strong cross coupling, hard quality constraints, and economically critical operating points near constraint boundaries. MPC handles all of this in one supervisory layer atop regulatory PID loops. A 2003 survey by Qin and Badgwell estimated more than 4,500 large MPC deployments worldwide, a figure that has multiplied since.
MPC is now standard in the path tracking and motion control modules of advanced driver assistance systems and self driving prototypes. Lateral controllers use a bicycle model and NMPC to compute steering commands respecting tire grip, lane boundaries, and steering rate limits. Longitudinal controllers use NMPC over a longer horizon to choose throttle and brake commands that track speed targets, maintain safe headways, and respect torque envelopes. Coupled lateral and longitudinal MPC enables maneuvers such as automated lane changes and emergency obstacle avoidance. Production adaptive cruise control modules using MPC have shipped in millions of vehicles since the early 2010s. Waymo, Cruise, and Mobileye combine MPC with learned perception, and Tesla has described an architecture in which a high level planner uses an MPC with a one to two second horizon while end to end neural components handle higher level decisions.
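The prediction model behind such lateral controllers can be as simple as the kinematic bicycle model sketched below. This is an illustrative version only: it ignores tire forces (so it cannot represent the grip limits mentioned above), uses the rear axle as the reference point, and integrates with forward Euler. The wheelbase and step size defaults are made-up values and the function names are hypothetical.

```python
import math


def bicycle_step(state, steer, accel, wheelbase=2.7, dt=0.05):
    """One forward Euler step of a kinematic bicycle model.

    state = (x, y, yaw, v) in meters, meters, radians, meters per second.
    """
    x, y, yaw, v = state
    x_next = x + v * math.cos(yaw) * dt
    y_next = y + v * math.sin(yaw) * dt
    yaw_next = yaw + v / wheelbase * math.tan(steer) * dt
    v_next = v + accel * dt
    return (x_next, y_next, yaw_next, v_next)


def rollout(state, controls, wheelbase=2.7, dt=0.05):
    """Predict the trajectory for a sequence of (steer, accel) commands."""
    trajectory = [state]
    for steer, accel in controls:
        state = bicycle_step(state, steer, accel, wheelbase, dt)
        trajectory.append(state)
    return trajectory
```

An NMPC would call something like `rollout` inside its cost and constraint evaluation, penalizing deviation of the predicted positions from the lane center while bounding `steer` and its rate of change.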
Quadrotors are unstable, fast, and underactuated, an exacting benchmark for MPC. Modern academic and commercial autopilots run NMPC at 100 to 200 Hz, computing thrust and body rate commands from the rigid body dynamics. Work at the University of Zurich and ETH Zurich has demonstrated NMPC controlled quadrotors flying acrobatic loops and racing through gates at over 20 m/s. Adaptive variants couple NMPC with L1 adaptive controllers or Gaussian process model corrections to reject wind and unmodeled aerodynamic effects.
Legged locomotion is one of the highest profile recent showcases for MPC. Quadrupeds and humanoids face a hybrid control problem in which ground contact changes discretely while continuous body dynamics evolve between contacts. MPC suits this because it can plan over horizons spanning multiple contact phases and adjust on the fly.
The MIT Cheetah series, particularly the 2018 Cheetah 3 work by Di Carlo, Wensing, Katz, Bledt, and Kim, demonstrated convex MPC for quadrupedal locomotion. By approximating the robot as a single rigid body with massless legs, ground reaction force planning became a convex QP solved in under one millisecond on commodity hardware. This made convex MPC the default architecture for dynamic quadruped locomotion. The ETH Zurich team led by Marco Hutter extended these ideas with nonlinear MPC for the ANYmal series, including perceptive locomotion that integrates terrain elevation maps directly into the NMPC for traversing rough terrain. Reinforcement learning policies trained in simulation now share the locomotion stack with MPC, with MPC supplying structured dynamic priors and the learned policy providing robustness to sim to real gaps.
For humanoids, Boston Dynamics' Atlas has long combined model predictive trajectory optimization, mixed integer footstep planning, and whole body quadratic programming, with the MPC layer planning long horizon center of mass trajectories while a fast inverse dynamics layer translates them into joint torques. Recent demonstrations of large behavior models on Atlas couple this MPC backbone with learned policies for perception heavy manipulation. Tesla Optimus employs a hierarchical stack: a high level MPC plans footfalls and overall trajectories, a mid level balance layer enforces dynamic stability with a QP solver, and low level joint controllers track torque commands at one to two kilohertz over an EtherCAT fieldbus.
The relationship between MPC and machine learning has evolved through three overlapping waves. The first treats learning as a tool to identify or correct the prediction model used by an otherwise classical MPC. The second uses MPC as a building block inside a learning architecture, often as a structured policy class in reinforcement learning. The third trains a neural network to approximate the MPC mapping from state to optimal input, amortizing the online optimization cost.
Learning based MPC in the first sense replaces or augments first principles models with Gaussian process regression, neural network models, sparse identification of nonlinear dynamics (SINDy), or Koopman operator approximations. Hard to model effects (friction and stiction, aerodynamic interaction, soft tissue contact) can be captured directly from operating data without abandoning the stability machinery of MPC. Practical implementations propagate uncertainty estimates from the learned model into the MPC formulation.
Differentiable MPC, introduced in a 2018 NeurIPS paper by Brandon Amos, Ivan Jimenez, Jacob Sacks, Byron Boots, and J. Zico Kolter, treats the entire MPC controller as a differentiable layer whose inputs are cost weights and dynamics parameters and whose output is the optimal first input. Differentiation is performed through the KKT conditions, so MPC parameters can be trained end to end by backpropagating from a downstream task loss, mixing the inductive bias of model based control with the flexibility of machine learning.
MPC also serves as the planner inside model based reinforcement learning. The Model Predictive Path Integral (MPPI) approach samples many random input trajectories, rolls them out through a dynamics model (often a learned one), scores each rollout with the cost function, and weights the samples softmax style to obtain a control update, providing a derivative free MPC variant that parallelizes well on GPUs. The MuZero family from DeepMind, often presented as a reinforcement learning algorithm, can be viewed as MPC in which a learned latent model is unrolled by Monte Carlo tree search. Other hybrid architectures use MPC as a short horizon safety filter that overrides an RL policy whenever it would violate constraints.
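The sampling and weighting step of MPPI fits in a short function. The sketch below is a simplified version written for illustration: it uses a scalar input, Gaussian perturbations of a nominal input sequence, and exponential weights on the rollout cost, and it omits refinements found in published variants such as control cost corrections and input clamping. The function signature and default parameters are assumptions.

```python
import math
import random


def mppi_update(x0, nominal, dynamics, cost, samples=500, sigma=0.5,
                lam=10.0, rng=None):
    """One MPPI iteration: perturb, roll out, weight by exp(-cost/lam), average."""
    rng = rng or random.Random(0)
    noises, costs = [], []
    for _ in range(samples):
        eps = [rng.gauss(0.0, sigma) for _ in nominal]
        x, total = x0, 0.0
        for u_nom, e in zip(nominal, eps):
            u = u_nom + e
            total += cost(x, u)
            x = dynamics(x, u)
        total += cost(x, 0.0)              # cost of the final predicted state
        noises.append(eps)
        costs.append(total)
    best = min(costs)                      # shift costs for numerical stability
    weights = [math.exp(-(c - best) / lam) for c in costs]
    norm = sum(weights)
    # Softmax-weighted average of the perturbations, added to the nominal plan.
    return [u_nom + sum(w * eps[t] for w, eps in zip(weights, noises)) / norm
            for t, u_nom in enumerate(nominal)]
```

In a receding horizon loop, the first element of the returned sequence would be applied to the plant and the remainder shifted by one step to serve as the next nominal plan. The temperature `lam` controls how sharply the update concentrates on the lowest cost samples.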
MPC's appeal stems from structural advantages over classical control. It handles MIMO systems and constraints in one unified design without ad hoc gain scheduling or anti windup logic. It incorporates feedforward information automatically because future references and known disturbances enter the optimization directly. It can optimize an essentially arbitrary cost function, letting engineers express trade offs explicitly. And it has a clear conceptual model that maps onto how humans plan.
The price is computational, modeling, and analytical complexity. MPC requires a model accurate enough over the prediction horizon. It demands an online optimization at every sample, straining real time computing resources for nonlinear and large scale problems. Stability and recursive feasibility require careful design of terminal cost, terminal set, and horizon length, with formal guarantees fragile under model mismatch. Tuning cost weights and constraint slack remains more art than science. In practice, deployment risk is managed by extensive simulation, hardware in the loop testing, conservative constraint margins, fallback PID controllers for solver failure, and monitoring of solve times on the running system.
MPC sits in a family of overlapping techniques. The linear quadratic regulator (LQR) is the special case where the system is linear, the cost quadratic, the horizon infinite, and there are no constraints; MPC then reduces to a state feedback gain. Adding constraints is what makes MPC strictly more expressive than LQR. Dynamic programming and the Hamilton Jacobi Bellman (HJB) equation provide the broader optimal control framework that MPC tractably approximates; solving HJB directly suffers the curse of dimensionality, while MPC sidesteps this by recomputing locally at each sample. Reinforcement learning, particularly model based RL, can be viewed as a stochastic approximation of dynamic programming that learns a value function or policy from interaction. Practical systems increasingly use both, with MPC handling fast inner loop control under tight constraints and learned policies handling high level decisions.
Research on MPC remains exceptionally active in the mid 2020s, driven by the appetite for embodied AI in robotics, the compute that lets neural surrogates contribute meaningfully to control, and safety pressure pushing formal methods into applications previously owned by handcrafted controllers. Noteworthy current trends include real time NMPC at kilohertz rates on embedded GPUs, contact implicit MPC for legged locomotion that does not require a precomputed contact schedule, large vision language models as planners supplying targets to MPC controllers, neural Lyapunov design that learns terminal cost ingredients automatically, and safety filtering frameworks wrapping learned policies in MPC for certification. The standard textbooks by Camacho and Bordons; by Rawlings, Mayne, and Diehl; and by Borrelli, Bemporad, and Morari remain the canonical references.