Model Predictive Control

Updated Jun 23, 2026

Model Predictive Control (MPC), also called receding horizon control (RHC), is a feedback control method that, at every sampling instant, solves a finite horizon optimal control problem using the current measured state as the initial condition, applies only the first control input, and then re-solves the problem one step later as the horizon slides forward.^[3]^[4] This receding horizon loop lets MPC enforce hard constraints on inputs, states, and outputs while optimizing an arbitrary cost function, which is why it is the dominant advanced control method in industry and a fast growing tool in modern robotics, autonomous driving, and learning based control.^[5]^[22] The 2000 survey by Mayne and colleagues, the most cited reference in the field, defines it as control "in which the current control action is obtained by solving, at each sampling instant, a finite horizon open-loop optimal control problem, using the current state of the plant as the initial state."^[4]

MPC originated in the late 1970s for slow chemical processes and now runs at tens of hertz in passenger cars, hundreds of hertz on quadrotors and quadrupeds, and at the core of the locomotion stacks of humanoid platforms such as Atlas and Tesla Optimus. By 2003, a survey by Qin and Badgwell counted more than 4,600 large MPC applications operating worldwide, spanning refining, petrochemicals, pulp and paper, and food processing, a figure that has multiplied since.^[5] MPC sits at the intersection of optimal control, mathematical programming, system identification, and, increasingly, machine learning. Recent interest in differentiable, learning based, and reinforcement learning augmented variants has positioned MPC as a bridge between classical model based control and end to end neural policies, and as a planning primitive closely related to world model based agents.^[22]

What is the core idea behind MPC?

At every sampling instant, an MPC controller carries out four steps. First it measures or estimates the current state of the plant. Second it uses an internal dynamic model to predict, as a function of a candidate sequence of future inputs, how the state and outputs will evolve over a finite prediction horizon. Third it solves a numerical optimization that selects the input sequence which minimizes a cost function while respecting all constraints. Fourth it applies only the first input of that sequence, discards the rest, advances one step, and repeats. The horizon slides forward in time, which is why the technique is called receding horizon control.^[3]
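
The four steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the plant is a hypothetical double integrator sampled at 0.1 s, the weights and the input bound `u_max` are arbitrary, the plant is simulated with the same model the controller uses (no mismatch, full state measurement), and a plain projected gradient iteration stands in for a production QP solver. The helper names `build_condensed_qp` and `solve_box_qp` are invented for this sketch.

```python
import numpy as np

def build_condensed_qp(A, B, Q, R, P, N):
    """Eliminate the states: [x_1..x_N] = Phi @ x0 + Gamma @ U.
    Returns H, F so that the cost is 0.5*U'HU + (F @ x0)'U + const."""
    n, m = B.shape
    Phi = np.vstack([np.linalg.matrix_power(A, i + 1) for i in range(N)])
    Gamma = np.zeros((N * n, N * m))
    for i in range(N):
        for j in range(i + 1):
            Gamma[i*n:(i+1)*n, j*m:(j+1)*m] = np.linalg.matrix_power(A, i - j) @ B
    Qbar = np.kron(np.eye(N), Q)
    Qbar[-n:, -n:] = P                      # terminal weight on x_N
    Rbar = np.kron(np.eye(N), R)
    return Gamma.T @ Qbar @ Gamma + Rbar, Gamma.T @ Qbar @ Phi

def solve_box_qp(H, g, lo, hi, U0, iters=300):
    """Projected gradient for min 0.5*U'HU + g'U subject to lo <= U <= hi."""
    step = 1.0 / np.linalg.eigvalsh(H).max()
    U = np.clip(U0, lo, hi)
    for _ in range(iters):
        U = np.clip(U - step * (H @ U + g), lo, hi)
    return U

# Assumed example plant: double integrator (position, velocity), dt = 0.1 s.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q, R = np.diag([1.0, 0.1]), np.array([[0.1]])
N, u_max = 20, 1.0
H, F = build_condensed_qp(A, B, Q, R, Q, N)

x = np.array([1.0, 0.0])                    # step 1: measured state
U = np.zeros(N)
applied = []
for k in range(100):
    U = solve_box_qp(H, F @ x, -u_max, u_max, U)   # steps 2 and 3: predict, optimize
    applied.append(U[0])                            # step 4: apply only the first input
    x = A @ x + B[:, 0] * U[0]                      # plant advances one sample
    U = np.append(U[1:], U[-1])                     # shifted plan as the next initial guess
```

Only `U[0]` ever reaches the plant; the remaining entries are reused solely as a starting point for the next solve.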

This structure separates two notions often blurred in classical feedback design. The prediction horizon, usually denoted N, sets how far the controller looks into the future. The control horizon Nc (with Nc no larger than N) sets how many distinct input moves the optimizer is allowed to choose; after Nc, inputs are typically held constant or set to a known terminal feedback, which keeps the number of decision variables, and hence the optimization problem, small.^[8]

The cost function usually combines a stage cost over the prediction window with a terminal cost on the final predicted state. A common quadratic form penalizes deviations of predicted outputs from a reference trajectory and the magnitude of input moves, weighted by tunable matrices Q, R, and P. Constraints encode physical actuator limits (saturation, slew rate), safety envelopes on states (lane boundaries, joint limits, temperature ranges), and operational requirements (collision avoidance, balance margins). Because constraints are part of the problem statement, MPC handles them in a principled way that classical PID and LQR designs cannot match.^[8]

Mathematical Formulation

For a discrete time system with state x and input u, a generic MPC problem solved at time k can be written as the minimization, over the input sequence u_0, ..., u_{N-1}, of a sum of stage costs ell(x_i, u_i) plus a terminal cost V_f(x_N). The minimization is subject to the predicted dynamics x_{i+1} = f(x_i, u_i) starting from x_0 = x(k), input constraints u_i in U, state constraints x_i in X, and a terminal constraint x_N in X_f. After the solver returns the optimal sequence, only u_0 is applied to the plant and the procedure repeats at time k+1.^[9]
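
Written out, the problem described above takes the following form. The symbols are those of the paragraph; the index ranges on the constraints and the name \kappa_N for the resulting feedback law are a common convention and are assumed here rather than taken from the text.

```latex
\begin{aligned}
\min_{u_0,\dots,u_{N-1}} \quad & \sum_{i=0}^{N-1} \ell(x_i, u_i) + V_f(x_N) \\
\text{subject to} \quad & x_{i+1} = f(x_i, u_i), && i = 0,\dots,N-1, \\
& x_0 = x(k), \\
& u_i \in U, \quad x_i \in X, && i = 0,\dots,N-1, \\
& x_N \in X_f .
\end{aligned}
\qquad\qquad
\kappa_N\bigl(x(k)\bigr) := u_0^{*}
```

The implicit feedback law \kappa_N is never computed in closed form; it is evaluated pointwise by solving the problem at each measured state.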

If the dynamics are linear, the cost quadratic, and the constraints polyhedral, the problem reduces to a quadratic program (QP) solved efficiently by active set, interior point, or first order methods. Nonlinear dynamics or non quadratic cost yield a nonlinear program (NLP) tackled by SQP or interior point algorithms. Hybrid systems with discrete decisions lead to mixed integer programs, more demanding and often handled with relaxations or branch and bound.^[10]
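
For the linear quadratic case, the reduction to a QP can be made explicit by eliminating the predicted states (the "condensed" form). The derivation below is a sketch that assumes dynamics x_{i+1} = A x_i + B u_i, stage cost x_i^T Q x_i + u_i^T R u_i, and terminal cost x_N^T P x_N; the stacked symbols \Phi, \Gamma, H, F, G, w, and E are notation introduced here for illustration.

```latex
\begin{aligned}
X &= \begin{bmatrix} x_1 \\ \vdots \\ x_N \end{bmatrix}
   = \Phi x_0 + \Gamma U,
\qquad
U = \begin{bmatrix} u_0 \\ \vdots \\ u_{N-1} \end{bmatrix},
\qquad
\Phi = \begin{bmatrix} A \\ A^2 \\ \vdots \\ A^N \end{bmatrix},
\qquad
\Gamma = \begin{bmatrix}
B & 0 & \cdots & 0 \\
AB & B & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
A^{N-1}B & A^{N-2}B & \cdots & B
\end{bmatrix}, \\[4pt]
J(x_0, U) &= U^\top H U + 2\, x_0^\top F^\top U + c(x_0),
\qquad
H = \Gamma^\top \bar{Q}\, \Gamma + \bar{R},
\qquad
F = \Gamma^\top \bar{Q}\, \Phi, \\[4pt]
\bar{Q} &= \operatorname{blkdiag}(Q, \dots, Q, P),
\qquad
\bar{R} = \operatorname{blkdiag}(R, \dots, R),
\qquad
\text{polyhedral constraints: } G U \le w + E x_0 .
\end{aligned}
```

Only the linear term 2 F x_0 and the constraint right-hand side depend on the measured state, so H and G can be factorized or preprocessed once offline.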

Closed loop stability is not automatic. The standard recipe, often called the Mayne ingredients after the foundational 2000 survey, uses three pieces: a terminal cost that acts as a control Lyapunov function on a terminal region, a terminal set positively invariant under a local stabilizing feedback, and a positive definite stage cost.^[4] With these ingredients, the optimal value function serves as a Lyapunov function for the closed loop, and recursive feasibility plus asymptotic stability can be proved rigorously for every initial state from which the problem is feasible; a longer prediction horizon enlarges that set of feasible states.^[4]
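
The core of the argument is a one-line cost comparison. The sketch below assumes the nominal case (no disturbance, perfect model) and writes \kappa_f for the local stabilizing feedback and V_N^* for the optimal value function; both names are notation introduced here.

```latex
\begin{aligned}
&\text{Terminal condition, for all } x \in X_f:\quad
\kappa_f(x) \in U,\quad f(x,\kappa_f(x)) \in X_f,\quad
V_f\bigl(f(x,\kappa_f(x))\bigr) - V_f(x) \le -\ell\bigl(x,\kappa_f(x)\bigr). \\[4pt]
&\text{Optimal plan at } x:\quad (u_0^*,\dots,u_{N-1}^*) \text{ with states } (x_0^*,\dots,x_N^*),\quad x_0^* = x,\quad x_N^* \in X_f. \\[4pt]
&\text{Candidate plan at } x^+ = f(x,u_0^*):\quad
\tilde{u} = \bigl(u_1^*,\dots,u_{N-1}^*,\ \kappa_f(x_N^*)\bigr)
\quad\text{(feasible, hence recursive feasibility)}. \\[4pt]
&V_N^*(x^+) \;\le\; J(x^+,\tilde{u})
 = V_N^*(x) - \ell(x,u_0^*)
 + \underbrace{\ell\bigl(x_N^*,\kappa_f(x_N^*)\bigr) + V_f\bigl(f(x_N^*,\kappa_f(x_N^*))\bigr) - V_f(x_N^*)}_{\le\, 0}
 \;\le\; V_N^*(x) - \ell(x,u_0^*).
\end{aligned}
```

Because the stage cost is positive definite, the value function strictly decreases along closed loop trajectories away from the origin, which is the Lyapunov decrease property.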

When did MPC originate?

The seeds of MPC were planted in the 1960s with the development of optimal control theory and dynamic programming, but the recognizable industrial form appeared independently in two places at the end of the 1970s. In France, Jacques Richalet, A. Rault, J. L. Testud, and J. Papon published "Model predictive heuristic control: Applications to industrial processes" in Automatica in 1978.^[1] They packaged their method in software called IDCOM (IDentification and COMmand), which used an impulse response model and a quadratic objective and was successfully deployed in distillation and superheater control.^[1]

At roughly the same time, Charles R. Cutler and B. L. Ramaker at Shell Oil in Houston presented their dynamic matrix control (DMC) algorithm at the 1980 Joint Automatic Control Conference in San Francisco.^[2] DMC used a step response model and a quadratic least squares objective over a long prediction horizon.^[2] Shell rolled DMC out internally before Cutler founded Dynamic Matrix Control Corporation, later acquired by Aspen Technology, where DMC became a flagship product still used in refineries today.

These first generation algorithms shared the receding horizon idea but lacked formal stability guarantees and could not handle inequality constraints exactly. The second generation, exemplified by Quadratic Dynamic Matrix Control (QDMC) introduced by Cutler, Morshedi, and Haydel in 1983, posed the constrained problem as an explicit quadratic program.^[5] The third generation, embodied in products such as IDCOM-M, and the fourth, embodied in Honeywell's RMPCT and Aspen's DMC-plus, added systematic constraint ranking, infeasibility handling, and steady state target optimization.^[5]

In parallel, the academic community pushed toward rigorous theory. The decisive breakthrough came with Mayne, Rawlings, Rao, and Scokaert's 2000 Automatica paper "Constrained model predictive control: stability and optimality," which laid out the terminal cost, terminal constraint, and local Lyapunov function machinery that underpins essentially all modern stability proofs.^[4] Mayne's 2014 Automatica survey "Model predictive control: Recent developments and future promise" updated the picture with robust, stochastic, distributed, and economic MPC.^[11]

What are the main variants of MPC?

MPC is best thought of as a design philosophy rather than a single algorithm. Research has produced a taxonomy of variants tailored to different system classes, uncertainty models, and computational budgets.

Variant	Model class	Main idea	Typical use
Linear MPC (LMPC)	Linear time invariant	Quadratic cost, polyhedral constraints, solved as a QP	Process industries, vehicle dynamics, power converters
Nonlinear MPC (NMPC)	Nonlinear ODE / DAE	Nonlinear program solved by SQP or interior point at each step	Robotics, aerospace, batch chemical reactors
Hybrid MPC	Mixed logical dynamical	Mixed integer programming for systems with switches and modes	Power electronics, gear shifting, traffic control
Robust MPC	Linear with bounded disturbance	Worst case optimization over uncertainty set	Safety critical systems with known disturbance bounds
Tube MPC	Nominal + bounded error tube	Plan a nominal trajectory and enforce constraints on a tightened tube around it	Vehicles and robots with model error, ancillary feedback rejects disturbance
Stochastic MPC (SMPC)	Stochastic disturbance	Chance constraints or expected cost	Energy systems, finance, building climate control
Economic MPC (EMPC)	Any	Stage cost is an arbitrary economic objective rather than tracking error	Process operation at peak profitability, smart grids
Explicit MPC	Linear, small state	Multi parametric QP solved offline yields piecewise affine lookup table	Embedded systems with microsecond sample times
Distributed MPC	Networked subsystems	Coordinated local MPCs exchange information	Smart grids, traffic networks, multi robot teams
Learning based MPC	Data driven model	Gaussian process or neural network model identified from data	Adaptive control of poorly modeled systems
Differentiable MPC	Any	Treat the MPC solver as a differentiable layer for end to end learning	Cost shaping via reinforcement learning, imitation learning

Linear MPC remains the workhorse of industrial deployment because the underlying QP is convex, scales to hundreds of inputs, and admits warm starting between consecutive samples. Nonlinear MPC dominates in modern robotics where rigid body dynamics, contact forces, and aerodynamic effects often render a single fixed linearization too inaccurate.

Robust and tube MPC address the practical challenge that the model is never perfect. Robust MPC takes a worst case view, demanding constraints hold for an entire set of possible trajectories. Pure min max formulations are usually intractable, so practitioners adopt tube MPC: a nominal MPC plans a center line trajectory while an ancillary local feedback keeps the actual state inside a precomputed tube around it.^[11] Stochastic MPC trades hard worst case guarantees for probabilistic ones via chance constraints, valuable when disturbances follow well characterized distributions, as in wind power dispatch or building energy management.^[11] Economic MPC uses the actual economic objective (profit, energy cost, yield) directly as the stage cost, unifying real time optimization with regulatory control.^[11]
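
The tube idea can be illustrated on a scalar system. The sketch below is a toy example under assumptions chosen for this illustration: the plant x+ = a x + b u + w, the ancillary gain `K`, the bounds, and the simple saturated nominal law (a placeholder for the nominal MPC) are all hypothetical. With error e = x - z, the ancillary feedback gives e+ = (a + b K) e + w, so the error stays within w_max / (1 - |a + b K|) when it starts at zero, and the nominal constraints are tightened by that radius.

```python
import numpy as np

a, b = 1.0, 0.5                    # assumed scalar plant: x+ = a*x + b*u + w
K = -1.2                           # assumed ancillary gain, |a + b*K| = 0.4 < 1
w_max, x_max, u_max = 0.1, 1.0, 1.0

rho = abs(a + b * K)
tube_radius = w_max / (1.0 - rho)  # bound on |x - z| when the error starts at zero
z_max = x_max - tube_radius        # tightened state bound for the nominal plan
v_max = u_max - abs(K) * tube_radius   # tightened input bound for the nominal plan

rng = np.random.default_rng(0)
x = z = 0.8                        # true and nominal state start together
worst_error = worst_x = worst_u = 0.0
for _ in range(200):
    v = float(np.clip(-0.8 * z, -v_max, v_max))  # placeholder for the nominal MPC input
    u = v + K * (x - z)                          # ancillary feedback around the plan
    w = rng.uniform(-w_max, w_max)               # bounded disturbance
    worst_x, worst_u = max(worst_x, abs(x)), max(worst_u, abs(u))
    x = a * x + b * u + w                        # true plant
    z = a * z + b * v                            # nominal model, no disturbance
    worst_error = max(worst_error, abs(x - z))
```

Because the nominal trajectory respects the tightened bounds, the true state and input respect the original ones for every disturbance sequence within the assumed bound.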

Explicit MPC, introduced by Alberto Bemporad and Manfred Morari around 2000, exploits the fact that the solution of a multi parametric QP is a piecewise affine function over a polyhedral partition of the state space.^[6] The partition and the affine law on each region are precomputed offline and stored as a lookup table, so online operation reduces to identifying the region that contains the current state and evaluating the affine control law stored for it.^[6] This allows megahertz rates on tiny embedded controllers, but table size grows combinatorially with the number of constraints and the horizon, so explicit MPC is restricted to small problems.^[6]
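
A one-dimensional toy problem shows the piecewise affine structure. The sketch assumes a scalar plant x+ = a x + b u, a one-step horizon with cost q (a x + b u)^2 + r u^2, and the bound |u| <= u_max; all numbers and function names are made up for illustration. For a scalar input, the constrained minimizer is the clipped unconstrained one, which yields three regions.

```python
import numpy as np

a, b, q, r, u_max = 1.2, 1.0, 1.0, 0.5, 1.0   # assumed toy problem data

# Offline: unconstrained minimizer is u = -g*x; it saturates when |x| > u_max/g.
g = q * a * b / (r + q * b**2)
x_sat = u_max / g
# Lookup table rows: (x_low, x_high, gain, offset) meaning u = gain*x + offset.
regions = [
    (-np.inf, -x_sat, 0.0, u_max),    # upper input bound active
    (-x_sat, x_sat, -g, 0.0),         # no constraint active
    (x_sat, np.inf, 0.0, -u_max),     # lower input bound active
]

def explicit_mpc(x):
    """Online part: locate the region, then evaluate its affine law."""
    for lo, hi, gain, offset in regions:
        if lo <= x <= hi:
            return gain * x + offset
    raise ValueError("state outside the stored partition")

def online_mpc(x, grid=np.linspace(-u_max, u_max, 20001)):
    """Brute-force reference: minimize the same cost over a fine input grid."""
    cost = q * (a * x + b * grid) ** 2 + r * grid ** 2
    return grid[np.argmin(cost)]
```

In higher dimensions the intervals become polyhedra and the region search dominates the online cost, which is where the combinatorial growth of the table becomes the limiting factor.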

Implementation and Numerical Methods

A real MPC implementation must build the prediction model, formulate the QP or NLP, run a solver under hard real time deadlines, and recover from infeasibility, sensor failures, and model drift. The underlying numerical machinery has matured into a mini industry of its own.

For linear quadratic problems, the dominant solver families are interior point methods, active set methods, and first order methods such as ADMM (used in OSQP).^[15] Active set solvers are typically fastest on small problems because consecutive QPs differ only in the measured initial state, so the optimal active set from the previous sample is usually an excellent warm start. For nonlinear problems, the real time iteration scheme of Diehl, Bock, and Schloeder performs only one SQP step per sample, exploiting the fact that consecutive MPC problems differ only slightly so that a warm started Newton step delivers a good enough solution within the sample period.^[7] Multiple shooting partitions the horizon into intervals and treats each interval boundary state as a decision variable, producing well structured sparse NLPs.^[7]
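
The structure that multiple shooting hands to the NLP solver can be seen in its equality constraints. The following sketch assumes a torque-driven pendulum model and a fixed-step RK4 integrator, both chosen only for illustration; `shooting_defects` is a hypothetical helper that returns the residuals a solver would drive to zero.

```python
import numpy as np

def pendulum(x, u):
    """Assumed example dynamics: unit pendulum with torque input u."""
    theta, omega = x
    return np.array([omega, -9.81 * np.sin(theta) + u])

def rk4_step(f, x, u, dt):
    """One explicit Runge-Kutta 4 step with the input held constant."""
    k1 = f(x, u)
    k2 = f(x + 0.5 * dt * k1, u)
    k3 = f(x + 0.5 * dt * k2, u)
    k4 = f(x + dt * k3, u)
    return x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

def shooting_defects(X, U, x_meas, dt):
    """X: (N+1, n) node states, U: (N,) inputs. Block 0 pins the initial state;
    block i+1 is the gap between node i+1 and the integration from node i."""
    blocks = [X[0] - x_meas]
    for i in range(len(U)):
        blocks.append(X[i + 1] - rk4_step(pendulum, X[i], U[i], dt))
    return np.concatenate(blocks)

# A rollout satisfies every defect; perturbing one node breaks only its two neighbours.
N, dt = 10, 0.05
x0 = np.array([0.5, 0.0])
U = np.zeros(N)
X = [x0]
for u in U:
    X.append(rk4_step(pendulum, X[-1], u, dt))
X = np.array(X)

X_perturbed = X.copy()
X_perturbed[3] += np.array([0.1, 0.0])
d_rollout = shooting_defects(X, U, x0, dt)
d_perturbed = shooting_defects(X_perturbed, U, x0, dt).reshape(N + 1, 2)
touched = np.flatnonzero(np.abs(d_perturbed).max(axis=1) > 1e-9).tolist()
```

Each node state appears in only two defect blocks, which is the source of the banded sparsity that structure-exploiting solvers rely on.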

State estimation is an inseparable companion to MPC because the controller starts every prediction from an estimate. The Kalman filter and its nonlinear extensions are the default tool, while moving horizon estimation (MHE), the dual of MPC for estimation, is increasingly used because it handles constraints on the estimated state.^[9]

What software is used to implement MPC?

A generation of open source toolchains has made MPC implementation dramatically more accessible than in the 1990s. The dominant pattern pairs a high level symbolic frontend for model description with a fast embedded solver and code generation backend.

Tool	Type	Strengths	Notes
CasADi	Symbolic AD framework	Algorithmic differentiation, model description, ties in NLP solvers	Foundation for many higher level MPC packages
acados	Embedded NLP solver	Fast SQP and interior point for OCP structured problems, C code generation	Successor to ACADO Toolkit, used in racing drones, vehicles
ACADO Toolkit	NMPC code generator	Real time iteration, exported plain C, no runtime dependencies	Pioneering embedded NMPC tool, still in use
do-mpc	Python NMPC framework	Multi stage robust MPC, MHE, easy prototyping	Built on top of CasADi, popular in academia
OpEn	Embedded NLP solver	Proximal averaged Newton type method (PANOC), pure Rust	Targets embedded, no external dependencies
OSQP	Operator splitting QP solver	ADMM, very fast warm started linear MPC	Widely used in vehicles, robotics, finance
qpOASES	Active set QP solver	Online active set strategy designed for MPC	Strong on small to medium QPs with warm starting
HPIPM	Interior point QP solver	Tailored to MPC structure, dense and sparse	Often used as a backend for acados
MPT3 Toolbox	Explicit MPC	Multi parametric QP, polyhedral computations	MATLAB based, classic explicit MPC platform
Drake	Robotics planning and control	Mixed integer convex MPC, contact aware planning	Used for Atlas and other humanoids

Real time MPC on embedded hardware has matured to the point where 100 Hz to 1 kHz update rates are routine for nonlinear problems with tens of states, and explicit MPC can reach megahertz rates for small linear problems.^[13] This computational headroom is what made MPC viable as the inner loop of dynamic robots and aggressive flight controllers.

What is MPC used for? Applications by domain

MPC has spread from its chemical engineering origins into nearly every field involving constrained dynamic optimization. The table below summarizes representative deployment domains, dominant variant, and timescales.

Domain	Typical variant	Sample rate	Representative use
Refining and petrochemicals	Linear or QDMC	seconds to minutes	Distillation columns, fluid catalytic crackers, ethylene plants
Pulp and paper	Linear MPC	seconds	Basis weight, moisture, machine direction control
Semiconductor manufacturing	Linear MPC, run to run	per wafer	CMP, etch, lithography overlay
Power electronics	Finite control set MPC	tens of kHz	Inverters, motor drives, grid connected converters
Building climate	Stochastic / economic MPC	minutes	HVAC scheduling, demand response, comfort vs. energy
Power grid	Distributed economic MPC	seconds to minutes	Voltage control, unit commitment, microgrid energy management
Automotive driver assistance	Linear or NMPC	20 to 100 Hz	Adaptive cruise control, lane keeping, automated lane change
Autonomous driving	NMPC	10 to 50 Hz	Lateral and longitudinal trajectory tracking, collision avoidance
Aerospace	NMPC	50 to 200 Hz	Quadrotor agile flight, satellite attitude, reentry guidance
Quadruped locomotion	Convex / NMPC	30 to 1000 Hz	Ground reaction force planning, footstep selection, gait switching
Humanoid locomotion	Whole body NMPC	50 to 500 Hz	Walking, running, manipulation while standing
Surgical robotics	NMPC	100 to 1000 Hz	Beating heart compensation, soft tissue interaction
Finance	Linear / stochastic MPC	minutes to days	Portfolio optimization with transaction costs

Process Industries

The original and still numerically dominant home of MPC is the process industries. Crude oil distillation columns, fluid catalytic cracking units, ethylene crackers, polymer reactors, and ammonia synthesis loops typically have hundreds of measurements and manipulated variables, strong cross coupling, hard quality constraints, and economically critical operating points near constraint boundaries. MPC handles all of this in one supervisory layer atop regulatory PID loops. The 2003 survey by Qin and Badgwell estimated more than 4,600 large MPC deployments worldwide, dominated by refining and petrochemicals, a figure that has multiplied since.^[5]

Automotive and Autonomous Driving

MPC is now standard in the path tracking and motion control modules of advanced driver assistance systems and self driving prototypes. Lateral controllers use a bicycle model and NMPC to compute steering commands respecting tire grip, lane boundaries, and steering rate limits. Longitudinal controllers use NMPC over a longer horizon to choose throttle and brake commands that track speed targets, maintain safe headways, and respect torque envelopes. Coupled lateral and longitudinal MPC enables maneuvers such as automated lane changes and emergency obstacle avoidance. Production adaptive cruise control modules using MPC have shipped in millions of vehicles since the early 2010s. Waymo, Cruise, and Mobileye combine MPC with learned perception, and Tesla has described an architecture in which a high level planner uses an MPC with a one to two second horizon while end to end neural components handle higher level decisions.

Drones and Aerial Robotics

Quadrotors are unstable, fast, and underactuated, an exacting benchmark for MPC. Modern academic and commercial autopilots run NMPC at 100 to 200 Hz, computing thrust and body rate commands from the rigid body dynamics. Work at the University of Zurich and ETH Zurich has demonstrated NMPC controlled quadrotors flying acrobatic loops and racing through gates at over 20 m/s. Adaptive variants couple NMPC with L1 adaptive controllers or Gaussian process model corrections to reject wind and unmodeled aerodynamic effects.

How is MPC used in legged robots?

Legged locomotion is one of the highest profile recent showcases for MPC. Quadrupeds and humanoids face a hybrid control problem in which ground contact changes discretely while continuous body dynamics evolve between contacts. MPC suits this because it can plan over horizons spanning multiple contact phases and adjust on the fly.^[19]

The MIT Cheetah series, particularly the 2018 Cheetah 3 work by Di Carlo, Wensing, Katz, Bledt, and Kim, demonstrated convex MPC for quadrupedal locomotion.^[17] By approximating the robot as a single rigid body with massless legs, ground reaction force planning over horizons of up to about 0.5 seconds became a convex QP solved to optimality in under one millisecond at roughly 20 to 30 Hz on commodity hardware.^[17] This made convex MPC the default architecture for dynamic quadruped locomotion. The ETH Zurich team led by Marco Hutter extended these ideas with nonlinear MPC for the ANYmal series, including perceptive locomotion that integrates terrain elevation maps directly into the NMPC for traversing rough terrain.^[18] Reinforcement learning policies trained in simulation now share the locomotion stack with MPC, with MPC supplying structured dynamic priors and the learned policy providing robustness to sim to real gaps.

For humanoids, Boston Dynamics's Atlas has long combined model predictive trajectory optimization, mixed integer footstep planning, and whole body quadratic programming, with the MPC layer planning long horizon center of mass trajectories while a fast inverse dynamics layer translates them into joint torques.^[19] Recent demonstrations of large behavior models on Atlas couple this MPC backbone with learned policies for perception heavy manipulation. Tesla Optimus employs a hierarchical stack: a high level MPC plans footfalls and overall trajectories, a mid level balance layer enforces dynamic stability with a QP solver, and low level joint controllers track torque commands at one to two kilohertz over an EtherCAT fieldbus.

How does MPC relate to reinforcement learning and world models?

The relationship between MPC and machine learning has evolved through three overlapping waves. The first treats learning as a tool to identify or correct the prediction model used by an otherwise classical MPC. The second uses MPC as a building block inside a learning architecture, often as a structured policy class in reinforcement learning. The third trains a neural network to approximate the MPC mapping from state to optimal input, amortizing the online optimization cost.^[22] As Hewing and colleagues put it in their 2020 Annual Review, learning based MPC seeks to "exploit the abundance of data in a reliable manner" while "taking safety constraints into account."^[22]

Learning based MPC in the first sense replaces or augments first principles models with Gaussian process regression, neural network models, sparse identification of nonlinear dynamics (SINDy), or Koopman operator approximations. Hard to model effects (friction stiction, aerodynamic interaction, soft tissue contact) can be captured directly from operating data without abandoning the stability machinery of MPC. Practical implementations propagate uncertainty estimates from the learned model into the MPC formulation.^[22]

Differentiable MPC, introduced in a 2018 NeurIPS paper by Brandon Amos, Ivan Jimenez, Jacob Sacks, Byron Boots, and J. Zico Kolter, treats the entire MPC controller as a differentiable layer whose inputs are cost weights and dynamics parameters and whose output is the optimal first input.^[20] Differentiation is performed through the KKT conditions, so MPC parameters can be trained end to end by backpropagating from a downstream task loss, mixing the inductive bias of model based control with the flexibility of machine learning.^[20]

MPC also serves as the planner inside model based reinforcement learning and world model agents. The Model Predictive Path Integral approach (MPPI), introduced by Williams and colleagues in their 2017 ICRA paper on information theoretic MPC, samples many random input trajectories, scores them with a learned dynamics model, and weights them softmax style to obtain a control update, giving a derivative free MPC variant that parallelizes on GPUs and was demonstrated for aggressive autonomous driving on the AutoRally platform.^[21] DeepMind's MuZero, presented in the 2020 Nature paper "Mastering Atari, Go, chess and shogi by planning with a learned model" (Nature 588, 604-609), is usually described as a reinforcement learning algorithm but can be read as MPC in which a learned latent model is unrolled by Monte Carlo tree search rather than by gradient based optimization; it matched superhuman performance in Go, chess, and shogi and set a new state of the art on the 57 game Atari benchmark without being told the rules.^[23] Other hybrid architectures use MPC as a short horizon safety filter that overrides an RL policy whenever it would violate constraints.^[22]
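
The sampling and softmax weighting step of MPPI fits in a short function. This is a simplified sketch, not the published algorithm: it omits the control cost and importance sampling terms of the original formulation, uses a hand-written single integrator in place of a learned model, and all parameter values (`samples`, `sigma`, `lam`) are arbitrary assumptions.

```python
import numpy as np

def mppi_update(x0, U, dynamics, cost, rng, samples=256, sigma=0.5, lam=0.1):
    """One simplified MPPI step: perturb the plan, roll out, reweight, average."""
    N = len(U)
    eps = rng.normal(0.0, sigma, size=(samples, N))     # input perturbations
    costs = np.zeros(samples)
    for s in range(samples):
        x = x0
        for i in range(N):
            x = dynamics(x, U[i] + eps[s, i])           # roll out with the model
            costs[s] += cost(x)
    weights = np.exp(-(costs - costs.min()) / lam)      # softmax over negative cost
    weights /= weights.sum()
    return U + weights @ eps                            # cost-weighted perturbation

# Assumed toy task: drive a 1-D single integrator from 0 to a target of 1.
dt, target = 0.1, 1.0
dynamics = lambda x, u: x + dt * u     # stand-in for a learned dynamics model
cost = lambda x: (x - target) ** 2

rng = np.random.default_rng(0)
x, U = 0.0, np.zeros(15)
for _ in range(100):
    U = mppi_update(x, U, dynamics, cost, rng)
    x = dynamics(x, U[0])              # apply the first input only
    U = np.append(U[1:], 0.0)          # shift the plan one step forward
```

The two nested loops are independent across samples, which is why practical implementations evaluate all rollouts in parallel on a GPU.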

Strengths and Limitations

MPC's appeal stems from structural advantages over classical control. It handles MIMO systems and constraints in one unified design without ad hoc gain scheduling or anti windup logic. It incorporates feedforward information automatically because future references and known disturbances enter the optimization directly. It can optimize an essentially arbitrary cost function, letting engineers express trade offs explicitly. And it has a clear conceptual model that maps onto how humans plan.^[3]

The price is computational, modeling, and analytical complexity. MPC requires a model accurate enough over the prediction horizon. It demands an online optimization at every sample, straining real time computing resources for nonlinear and large scale problems. Stability and recursive feasibility require careful design of terminal cost, terminal set, and horizon length, with formal guarantees fragile under model mismatch.^[4] Tuning cost weights and constraint slack remains more art than science. In practice, deployment risk is managed by extensive simulation, hardware in the loop testing, conservative constraint margins, fallback PID controllers for solver failure, and monitoring of solve times on the running system.

How does MPC differ from LQR and reinforcement learning?

MPC sits in a family of overlapping techniques. Linear quadratic regulator (LQR) is the special case where the system is linear, the cost quadratic, the horizon infinite, and there are no constraints; MPC then reduces to a state feedback gain.^[10] Adding constraints is what makes MPC strictly more expressive than LQR. Dynamic programming and the Hamilton Jacobi Bellman (HJB) equation provide the broader optimal control framework that MPC tractably approximates; solving HJB directly suffers the curse of dimensionality, while MPC sidesteps this by recomputing locally at each sample.^[9] Reinforcement learning, particularly model based RL, can be viewed as a stochastic approximation of dynamic programming that learns a value function or policy from interaction. The two families are increasingly fused: a 2025 survey on the synthesis of MPC and reinforcement learning catalogs how MPC supplies a structured, constraint aware planner while RL supplies learned models, cost shaping, and value functions.^[24] Practical systems often run MPC in the fast inner loop under tight constraints and learned policies for high level decisions.
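
The LQR special case can be checked numerically: without constraints, the first input of a horizon N problem is a linear feedback obtained from a backward Riccati recursion, and that gain approaches the infinite horizon LQR gain as N grows. The sketch below assumes a double integrator and arbitrary weights; `finite_horizon_gain` is a name invented for this example, and the sign convention is u = -K x.

```python
import numpy as np

def finite_horizon_gain(A, B, Q, R, P_terminal, N):
    """Backward Riccati recursion over N stages; returns the stage-0 gain K
    (u_0 = -K x_0) and the corresponding cost-to-go matrix."""
    P = P_terminal
    for _ in range(N):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K, P

# Assumed example: double integrator sampled at 0.1 s.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q, R = np.diag([1.0, 0.1]), np.array([[0.1]])

K_lqr, P_lqr = finite_horizon_gain(A, B, Q, R, Q, 5000)   # effectively infinite horizon
gap = {N: np.linalg.norm(finite_horizon_gain(A, B, Q, R, Q, N)[0] - K_lqr)
       for N in (5, 200)}

# Using the LQR cost-to-go as terminal cost reproduces the LQR gain at any horizon.
K_short, _ = finite_horizon_gain(A, B, Q, R, P_lqr, 3)
```

This is also why the infinite horizon cost-to-go is a popular choice of terminal cost: with it, a short horizon MPC behaves exactly like LQR wherever no constraint is active.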

Recent Trends

Research on MPC remains exceptionally active in the mid 2020s, driven by the appetite for embodied AI in robotics, the compute that lets neural surrogates contribute meaningfully to control, and safety pressure pushing formal methods into applications previously owned by handcrafted controllers. Noteworthy current trends include real time NMPC at kilohertz rates on embedded GPUs, contact implicit MPC for legged locomotion that does not require a precomputed contact schedule, large vision language models as planners supplying targets to MPC controllers, neural Lyapunov design that learns terminal cost ingredients automatically, and safety filtering frameworks wrapping learned policies in MPC for certification.^[22] Standard textbooks by Camacho and Bordons, Rawlings, Mayne and Diehl, and Borrelli, Bemporad and Morari remain the canonical references.^[8]^[9]^[10]

References

1. Richalet, J., Rault, A., Testud, J. L., and Papon, J. (1978). "Model predictive heuristic control: Applications to industrial processes." *Automatica*, 14(5), 413-428.
2. Cutler, C. R. and Ramaker, B. L. (1980). "Dynamic matrix control: a computer control algorithm." *Joint Automatic Control Conference*, San Francisco, CA.
3. Garcia, C. E., Prett, D. M., and Morari, M. (1989). "Model predictive control: theory and practice, a survey." *Automatica*, 25(3), 335-348.
4. Mayne, D. Q., Rawlings, J. B., Rao, C. V., and Scokaert, P. O. M. (2000). "Constrained model predictive control: stability and optimality." *Automatica*, 36(6), 789-814.
5. Qin, S. J. and Badgwell, T. A. (2003). "A survey of industrial model predictive control technology." *Control Engineering Practice*, 11(7), 733-764.
6. Bemporad, A., Morari, M., Dua, V., and Pistikopoulos, E. N. (2002). "The explicit linear quadratic regulator for constrained systems." *Automatica*, 38(1), 3-20.
7. Diehl, M., Bock, H. G., and Schloeder, J. P. (2005). "A real time iteration scheme for nonlinear optimization in optimal feedback control." *SIAM Journal on Control and Optimization*, 43(5), 1714-1736.
8. Camacho, E. F. and Bordons, C. (2007). *Model Predictive Control* (2nd edition). Springer, London.
9. Rawlings, J. B., Mayne, D. Q., and Diehl, M. M. (2017). *Model Predictive Control: Theory, Computation, and Design* (2nd edition). Nob Hill Publishing.
10. Borrelli, F., Bemporad, A., and Morari, M. (2017). *Predictive Control for Linear and Hybrid Systems*. Cambridge University Press.
11. Mayne, D. Q. (2014). "Model predictive control: Recent developments and future promise." *Automatica*, 50(12), 2967-2986.
12. Andersson, J. A. E., Gillis, J., Horn, G., Rawlings, J. B., and Diehl, M. (2019). "CasADi: a software framework for nonlinear optimization and optimal control." *Mathematical Programming Computation*, 11(1), 1-36.
13. Verschueren, R., Frison, G., Kouzoupis, D., Frey, J., van Duijkeren, N., Zanelli, A., Novoselnik, B., Albin, T., Quirynen, R., and Diehl, M. (2022). "acados: a modular open source framework for fast embedded optimal control." *Mathematical Programming Computation*, 14(1), 147-183.
14. Lucia, S., Tatulea-Codrean, A., Schoppmeyer, C., and Engell, S. (2017). "Rapid development of modular and sustainable nonlinear model predictive control solutions using do-mpc and CasADi." *Control Engineering Practice*, 60, 51-62.
15. Stellato, B., Banjac, G., Goulart, P., Bemporad, A., and Boyd, S. (2020). "OSQP: an operator splitting solver for quadratic programs." *Mathematical Programming Computation*, 12(4), 637-672.
16. Sopasakis, P., Fresk, E., and Patrinos, P. (2020). "OpEn: code generation for embedded nonconvex optimization." *IFAC World Congress*.
17. Di Carlo, J., Wensing, P. M., Katz, B., Bledt, G., and Kim, S. (2018). "Dynamic locomotion in the MIT Cheetah 3 through convex model predictive control." *IEEE/RSJ IROS*.
18. Grandia, R., Jenelten, F., Yang, S., Farshidian, F., and Hutter, M. (2023). "Perceptive locomotion through nonlinear model predictive control." *IEEE Transactions on Robotics*, 39(5), 3402-3421.
19. Wieber, P. B., Tedrake, R., and Kuindersma, S. (2016). "Modeling and control of legged robots." In *Springer Handbook of Robotics* (2nd edition).
20. Amos, B., Jimenez, I., Sacks, J., Boots, B., and Kolter, J. Z. (2018). "Differentiable MPC for end to end planning and control." *Advances in Neural Information Processing Systems (NeurIPS)*.
21. Williams, G., Wagener, N., Goldfain, B., Drews, P., Rehg, J. M., Boots, B., and Theodorou, E. A. (2017). "Information theoretic MPC for model based reinforcement learning." *IEEE ICRA*.
22. Hewing, L., Wabersich, K. P., Menner, M., and Zeilinger, M. N. (2020). "Learning based model predictive control: toward safe learning in control." *Annual Review of Control, Robotics, and Autonomous Systems*, 3, 269-296.
23. Schrittwieser, J., Antonoglou, I., Hubert, T., Simonyan, K., Sifre, L., Schmitt, S., Guez, A., Lockhart, E., Hassabis, D., Graepel, T., Lillicrap, T., and Silver, D. (2020). "Mastering Atari, Go, chess and shogi by planning with a learned model." *Nature*, 588, 604-609.
24. Reiter, R., Ghezzi, A., Baumgaertner, K., Hoffmann, J., McAllister, R. T., and Diehl, M. (2025). "Synthesis of model predictive control and reinforcement learning: survey and classification." *arXiv:2502.02133*.
