DPM-Solver
Last reviewed
Jun 8, 2026
Sources
7 citations
Review status
Source-backed
Revision
v1 · 1,644 words
DPM-Solver is a dedicated high-order numerical solver for the ordinary differential equations that arise when sampling from diffusion models. Introduced in 2022 by Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, and Jun Zhu of Tsinghua University's TSAIL group, it allows a trained diffusion model to produce high-quality samples in roughly 10 to 20 evaluations of the underlying neural network, rather than the hundreds or thousands required by the original samplers [1]. The method is training-free: it accelerates any pretrained diffusion model without modifying the network or retraining it.
The core idea is to exploit the semi-linear structure of the diffusion ODE. The equation has a part that is linear in the data and a part given by the neural network. DPM-Solver computes the linear part analytically rather than handing the whole equation to a generic black-box solver, then approximates only the remaining nonlinear part with high-order schemes. This dramatically reduces the discretization error per step and therefore the number of steps needed for a given quality [1].
DPM-Solver and its follow-up, DPM-Solver++, became some of the most widely deployed fast samplers in practice. They are implemented in the Hugging Face Diffusers library and are exposed in popular Stable Diffusion interfaces under names such as "DPM++ 2M" and "DPM++ 2M Karras," where they are common default or recommended choices [2][3].
A diffusion model is trained by gradually adding Gaussian noise to data and learning to reverse that process. The forward process at time t produces a noisy sample x_t = alpha_t * x_0 + sigma_t * noise, where alpha_t and sigma_t are fixed noise-schedule coefficients. Sampling reverses this, transforming pure noise back into data. In the framework of score-based generative models, this reverse process can be written either as a stochastic differential equation (SDE) or as an equivalent deterministic probability-flow ordinary differential equation (ODE) that shares the same marginal distributions [1].
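The forward noising rule above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the cosine-style schedule in `alpha_sigma` and the helper name `forward_sample` are assumptions chosen for readability, and real models use their own trained schedules.

```python
import math
import random


def alpha_sigma(t):
    """Illustrative schedule with alpha_t**2 + sigma_t**2 == 1 (an assumption)."""
    angle = 0.5 * math.pi * t
    return math.cos(angle), math.sin(angle)


def forward_sample(x0, t, rng):
    """Draw x_t = alpha_t * x_0 + sigma_t * noise, elementwise over a list."""
    alpha_t, sigma_t = alpha_sigma(t)
    return [alpha_t * v + sigma_t * rng.gauss(0.0, 1.0) for v in x0]


rng = random.Random(0)
x0 = [1.0, -0.5, 0.25]
x_early = forward_sample(x0, 0.0, rng)   # t = 0: sigma_t is 0, so x_t equals x_0
x_late = forward_sample(x0, 0.999, rng)  # t near 1: alpha_t is tiny, almost pure noise
```

Sampling runs this process in reverse: it starts from a draw like `x_late` and moves toward something like `x0`.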
Solving the reverse SDE or ODE naively requires many small steps. The original DDPM ancestral sampler often used hundreds to one thousand network evaluations, and each evaluation of the neural network (counted as the number of function evaluations, or NFE) is the dominant cost of sampling. Generic off-the-shelf ODE integrators such as the Euler method or higher-order Runge-Kutta schemes like Heun's method treat the entire right-hand side as an opaque function, so they converge slowly and waste evaluations on the part of the dynamics that is actually known in closed form.
Several faster samplers preceded DPM-Solver. DDIM (Denoising Diffusion Implicit Models) reinterpreted sampling as a deterministic non-Markovian process and cut the count to tens of steps. PNDM (Pseudo Numerical Methods for Diffusion Models, arXiv 2202.09778) applied pseudo-numerical multistep schemes on the data manifold [4]. Concurrently with DPM-Solver, DEIS (Diffusion Exponential Integrator Sampler, arXiv 2204.13902, ICLR 2023) independently proposed using exponential integrators that leverage the same semi-linear structure [5]. DPM-Solver formalized this approach with explicit convergence-order guarantees for diffusion ODEs.
DPM-Solver is built on the observation that the diffusion ODE is semi-linear. Written for the noise-prediction parameterization, the probability-flow ODE takes the form
dx_t / dt = f(t) * x_t + g(t) * (network output),
where the term f(t) * x_t is linear in the current state and the network term is the nonlinear part. A naive solver discretizes this whole expression. The variation-of-constants formula from the theory of linear ODEs gives the exact contribution of the linear part, leaving only the nonlinear integral to approximate [1].
The key technical step is a change of variables to lambda_t = log(alpha_t / sigma_t), which is half of the log signal-to-noise ratio (log-SNR). Because lambda_t is a strictly monotonic function of time, it can serve as the new integration variable. Under this substitution, the exact solution of the diffusion ODE between two times s and t becomes an exponentially weighted integral of the network output over lambda [1]. The linear factors collapse into closed-form exponential coefficients, and what remains is a single integral that DPM-Solver approximates by Taylor-expanding the network output in lambda.
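Written out, the exact solution and its simplest approximation take roughly the following form. This is a sketch for the noise-prediction setting; the symbols \hat{\epsilon}_\theta (the network output re-indexed by lambda), \hat{x}_\lambda (the state re-indexed by lambda), and h (the step size in lambda) are notation introduced here for illustration.

```latex
\begin{align}
x_t &= \frac{\alpha_t}{\alpha_s}\, x_s
      - \alpha_t \int_{\lambda_s}^{\lambda_t} e^{-\lambda}\,
        \hat{\epsilon}_\theta(\hat{x}_\lambda, \lambda)\, d\lambda
      && \text{(exact)} \\
x_t &\approx \frac{\alpha_t}{\alpha_s}\, x_s
      - \sigma_t \left(e^{h} - 1\right) \epsilon_\theta(x_s, s),
      \qquad h = \lambda_t - \lambda_s
      && \text{(network output held constant)}
\end{align}
```

The second line follows from the first because \int_{\lambda_s}^{\lambda_t} e^{-\lambda}\, d\lambda = e^{-\lambda_s} - e^{-\lambda_t} and \alpha_t e^{-\lambda_t} = \sigma_t. Keeping further Taylor terms of the network output in lambda gives the higher-order members of the family.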
Approximating that integral to different orders yields a family of solvers:
| Solver | Order | Network evaluations per step | Notes |
|---|---|---|---|
| DPM-Solver-1 | 1st | 1 | Mathematically equivalent to DDIM |
| DPM-Solver-2 | 2nd | 2 | One intermediate (midpoint-style) evaluation |
| DPM-Solver-3 | 3rd | 3 | Two intermediate evaluations |
Increasing the order reduces discretization error, so the higher-order variants reach a target quality in fewer total steps. Notably, the first-order member of the family is exactly DDIM, which the paper proves is a special case of DPM-Solver [1][4]. This approach belongs to the class of exponential integrators, numerical methods designed for semi-linear problems that treat the linear (often stiff) part exactly via an integrating factor and only approximate the nonlinear remainder.
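The equivalence between the first-order solver and DDIM can be checked numerically. The sketch below assumes a scalar state, an illustrative cosine-style schedule, and the first-order update x_t = (alpha_t / alpha_s) * x_s - sigma_t * (exp(h) - 1) * eps with h = lambda_t - lambda_s. The function names are hypothetical, and `eps` stands in for a network output.

```python
import math


def alpha_sigma(t):
    """Illustrative schedule (an assumption): alpha = cos, sigma = sin."""
    angle = 0.5 * math.pi * t
    return math.cos(angle), math.sin(angle)


def dpm_solver_1_step(x_s, eps, s, t):
    """First-order exponential-integrator update from time s to time t."""
    alpha_s, sigma_s = alpha_sigma(s)
    alpha_t, sigma_t = alpha_sigma(t)
    h = math.log(alpha_t / sigma_t) - math.log(alpha_s / sigma_s)
    return (alpha_t / alpha_s) * x_s - sigma_t * math.expm1(h) * eps


def ddim_step(x_s, eps, s, t):
    """Deterministic DDIM update: predict x_0, then re-noise to level t."""
    alpha_s, sigma_s = alpha_sigma(s)
    alpha_t, sigma_t = alpha_sigma(t)
    x0_pred = (x_s - sigma_s * eps) / alpha_s
    return alpha_t * x0_pred + sigma_t * eps


# Same inputs, two formulas; the difference is floating-point noise only.
gap = abs(dpm_solver_1_step(0.7, 0.3, 0.8, 0.6) - ddim_step(0.7, 0.3, 0.8, 0.6))
```

Expanding sigma_t * (exp(h) - 1) = alpha_t * sigma_s / alpha_s - sigma_t shows that the two expressions agree algebraically, not just numerically.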
On CIFAR-10, DPM-Solver reported a Frechet Inception Distance (FID) of 4.70 using only 10 function evaluations and 2.87 using 20, and a 4x to 16x speedup over previous training-free samplers across several datasets and models [1]. The method applies to both discrete-time and continuous-time diffusion models without retraining.
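The effect of solver order at a fixed evaluation budget can be illustrated on a toy problem, under strong assumptions. The data are one-dimensional standard Gaussian, the schedule is alpha_t = cos(pi t / 2) and sigma_t = sin(pi t / 2), and the network is replaced by the exact noise prediction sigma_t * x. In this setting the probability-flow ODE leaves x unchanged, so any drift from the starting value is pure discretization error. All helper names are hypothetical, and the second-order step is a sketch of the midpoint-in-lambda scheme rather than the authors' code.

```python
import math

HALF_PI = 0.5 * math.pi


def alpha_sigma(t):
    return math.cos(HALF_PI * t), math.sin(HALF_PI * t)


def lam(t):
    alpha_t, sigma_t = alpha_sigma(t)
    return math.log(alpha_t / sigma_t)


def t_of_lam(value):
    """Inverse of lam() for this particular schedule."""
    return math.atan(math.exp(-value)) / HALF_PI


def eps_model(x, t):
    """Exact noise prediction for N(0, 1) data under this schedule."""
    return alpha_sigma(t)[1] * x


def step1(x, s, t, eps=None):
    """First-order step; eps defaults to the model evaluated at (x, s)."""
    if eps is None:
        eps = eps_model(x, s)
    alpha_s, _ = alpha_sigma(s)
    alpha_t, sigma_t = alpha_sigma(t)
    return (alpha_t / alpha_s) * x - sigma_t * math.expm1(lam(t) - lam(s)) * eps


def step2(x, s, t):
    """Second-order step: one extra evaluation at the midpoint in lambda."""
    mid = t_of_lam(0.5 * (lam(s) + lam(t)))
    u = step1(x, s, mid)                       # first evaluation, at s
    return step1(x, s, t, eps_model(u, mid))   # second evaluation, at mid


def solve(step, n_steps, x=1.0, t_start=0.9, t_end=0.1):
    """Integrate from t_start down to t_end on a grid uniform in lambda."""
    l0, l1 = lam(t_start), lam(t_end)
    ts = [t_of_lam(l0 + (l1 - l0) * i / n_steps) for i in range(n_steps + 1)]
    for s, t in zip(ts, ts[1:]):
        x = step(x, s, t)
    return x


# Both runs use 20 model evaluations; the exact answer is 1.0.
err1 = abs(solve(step1, 20) - 1.0)
err2 = abs(solve(step2, 10) - 1.0)
```

In this toy setting the second-order run has the smaller error even though it takes half as many steps. This shows how order trades against step count; it says nothing about image quality.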
The original DPM-Solver was validated mainly on unconditional and lightly guided generation. Real text-to-image systems rely on strong classifier-free guidance, which amplifies the difference between conditional and unconditional predictions by a large guidance scale. In a follow-up paper, DPM-Solver++ (arXiv 2211.01095, November 2022), the same authors showed that under large guidance scales the high-order solvers can become unstable and even slower than first-order DDIM. The large scale narrows the convergence radius of the high-order expansion and pushes intermediate iterates outside the valid data range, causing a train-test mismatch [6].
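The amplification can be made concrete with a small sketch. It assumes one common convention for classifier-free guidance, in which the guided prediction is the unconditional prediction plus the scaled difference. The function name and the sample numbers are hypothetical.

```python
def guided_prediction(pred_uncond, pred_cond, scale):
    """Combine predictions as uncond + scale * (cond - uncond), elementwise."""
    return [u + scale * (c - u) for u, c in zip(pred_uncond, pred_cond)]


uncond = [0.5, -0.25]
cond = [1.0, -0.5]

plain = guided_prediction(uncond, cond, 1.0)   # scale 1 recovers the conditional prediction
strong = guided_prediction(uncond, cond, 7.5)  # large scale pushes far past it
```

With these numbers the first entry moves from 1.0 at scale 1 to 4.25 at scale 7.5, which suggests how guided predictions can leave the range the model saw in training.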
DPM-Solver++ addresses this with three changes [6]:
These produce the variants commonly labeled 2S (second-order singlestep), 2M (second-order multistep), and 3M (third-order multistep) [6]. With these fixes, DPM-Solver++ generates high-quality guided samples in roughly 15 to 20 steps for both pixel-space and latent-space diffusion models [6].
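A second-order multistep (2M) update can be sketched as follows, assuming the data-prediction form of the step, x_t = (sigma_t / sigma_s) * x_s - alpha_t * (exp(-h) - 1) * D, where D extrapolates from the current and previous data predictions. The toy setting is an assumption made so the exact answer is known: one-dimensional standard Gaussian data, a cosine-style schedule, and the exact data prediction alpha_t * x standing in for the network. Helper names are hypothetical.

```python
import math

HALF_PI = 0.5 * math.pi


def alpha_sigma(t):
    return math.cos(HALF_PI * t), math.sin(HALF_PI * t)


def lam(t):
    alpha_t, sigma_t = alpha_sigma(t)
    return math.log(alpha_t / sigma_t)


def t_of_lam(value):
    return math.atan(math.exp(-value)) / HALF_PI


def x0_model(x, t):
    """Exact data prediction E[x_0 | x_t] for N(0, 1) data under this schedule."""
    return alpha_sigma(t)[0] * x


def update(x, s, t, d):
    """Exponential-integrator update in data-prediction form."""
    _, sigma_s = alpha_sigma(s)
    alpha_t, sigma_t = alpha_sigma(t)
    h = lam(t) - lam(s)
    return (sigma_t / sigma_s) * x - alpha_t * math.expm1(-h) * d


def sample(n_steps, multistep, x=1.0, t_start=0.9, t_end=0.1):
    l0, l1 = lam(t_start), lam(t_end)
    ts = [t_of_lam(l0 + (l1 - l0) * i / n_steps) for i in range(n_steps + 1)]
    prev = None  # (data prediction, step size) cached from the previous step
    for s, t in zip(ts, ts[1:]):
        x0 = x0_model(x, s)  # the only model evaluation in this step
        h = lam(t) - lam(s)
        if multistep and prev is not None:
            x0_old, h_old = prev
            r = h_old / h
            d = (1 + 1 / (2 * r)) * x0 - (1 / (2 * r)) * x0_old
        else:
            d = x0  # first step, or first-order mode
        x = update(x, s, t, d)
        prev = (x0, h)
    return x


# Same 20 model evaluations in both runs; the exact answer is 1.0.
err_first_order = abs(sample(20, multistep=False) - 1.0)
err_2m = abs(sample(20, multistep=True) - 1.0)
```

The multistep variant reuses the previous step's prediction instead of spending an extra evaluation inside the step, which is why it costs one evaluation per step while still gaining accuracy in this toy case.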
DPM-Solver and DPM-Solver++ are now standard fast samplers in the diffusion ecosystem. In Hugging Face Diffusers, the multistep variant is implemented as the DPMSolverMultistepScheduler, and a singlestep version exists as DPMSolverSinglestepScheduler; setting the option use_karras_sigmas=True selects the noise-schedule discretization proposed by Karras et al., producing the popular "DPM++ 2M Karras" configuration [2][3]. In Stable Diffusion interfaces such as AUTOMATIC1111 and ComfyUI, samplers like "DPM++ 2M Karras," "DPM++ SDE Karras," and "DPM++ 3M SDE" are frequently recommended for their balance of speed and quality, typically run at 20 to 30 steps [3]. Stability AI's early Stable Diffusion demo used DPM-Solver to roughly double sampling speed by cutting steps from 50 to about 25 [1].
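The Karras et al. discretization mentioned above spaces noise levels by interpolating linearly in sigma^(1/rho) and then raising the result back to the power rho. The sketch below is a standalone illustration of that spacing rule, not the Diffusers implementation. The function name is hypothetical, the endpoint values are placeholders, and sigma here means the noise level in the Karras et al. parameterization rather than the sigma_t used earlier in this article.

```python
def karras_sigmas(n, sigma_min, sigma_max, rho=7.0):
    """Return n noise levels from sigma_max down to sigma_min."""
    if n < 2:
        raise ValueError("need at least two noise levels")
    hi = sigma_max ** (1.0 / rho)
    lo = sigma_min ** (1.0 / rho)
    # Linear ramp in sigma**(1/rho), mapped back by raising to the power rho.
    return [(hi + (i / (n - 1)) * (lo - hi)) ** rho for i in range(n)]


# Placeholder endpoints; real values depend on the model's noise schedule.
sigmas = karras_sigmas(20, sigma_min=0.03, sigma_max=14.6)
```

With rho greater than 1, the grid is denser near `sigma_min`, so a fixed step budget spends more of its steps at low noise levels.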
DPM-Solver sits within a lineage of fast deterministic samplers and is closely related to several of them:
The authors later released DPM-Solver-v3 (arXiv 2310.13268, NeurIPS 2023), which improves the solver using precomputed "empirical model statistics" of the network to reduce error further in the 5-to-10-step regime [7].
DPM-Solver demonstrated that much of the slowness of diffusion sampling was a property of the solver, not an inherent cost of the model. By recognizing the diffusion ODE as a semi-linear problem and solving its linear part exactly in the log-SNR variable, it brought sampling down to on the order of ten network evaluations while keeping the model fixed and providing formal convergence guarantees. This made high-quality diffusion sampling fast enough for interactive use and large-scale deployment, and DPM-Solver++ extended that benefit to the strongly guided text-to-image setting that dominates practical use.
Together with DDIM, PNDM, DEIS, and later work such as UniPC and consistency-style distillation, DPM-Solver helped shift the bottleneck of diffusion generation away from sampling speed. Its descendants, especially the DPM++ multistep solvers, remain default or near-default samplers in widely used diffusion toolchains as of 2026 [2][3].