DPM-Solver

Updated Jun 28, 2026

What is DPM-Solver?

DPM-Solver is a fast, training-free, high-order numerical solver for the ordinary differential equations (ODEs) that arise when sampling from diffusion models. It lets a pretrained model generate high-quality samples in roughly 10 to 20 evaluations of the underlying neural network, instead of the hundreds or thousands that the original samplers required ^[1]. It was introduced in 2022 by Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, and Jun Zhu of Tsinghua University's TSAIL group in the paper "DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps," presented as an oral at NeurIPS 2022 ^[1]. On CIFAR-10 the method reported a Fréchet Inception Distance (FID) of 4.70 using only 10 function evaluations and 2.87 using 20, and a 4x to 16x speedup over previous training-free samplers, all without modifying or retraining the model ^[1].

The core idea is to exploit the semi-linear structure of the diffusion ODE. The equation has a part that is linear in the data and a part given by the neural network. DPM-Solver computes the linear part analytically rather than handing the whole equation to a generic black-box solver, then approximates only the remaining nonlinear part with high-order schemes. This dramatically reduces the discretization error per step and therefore the number of steps needed for a given quality ^[1]. As the authors put it, the formulation "analytically computes the linear part of the solution, rather than leaving all terms to black-box ODE solvers as adopted in previous works" ^[1].

DPM-Solver and its follow-up, DPM-Solver++, became some of the most widely deployed fast samplers in practice. They are implemented in the Hugging Face Diffusers library and are exposed in popular Stable Diffusion interfaces under names such as "DPM++ 2M" and "DPM++ 2M Karras," where they are common default or recommended choices ^[2]^[3].

Why is diffusion sampling slow?

A diffusion model is trained by gradually adding Gaussian noise to data and learning to reverse that process. The forward process at time t produces a noisy sample x_t = alpha_t * x_0 + sigma_t * noise, where alpha_t and sigma_t are fixed noise-schedule coefficients. Sampling reverses this, transforming pure noise back into data. In the framework of score-based generative models, this reverse process can be written either as a stochastic differential equation (SDE) or as an equivalent deterministic probability-flow ordinary differential equation (ODE) that shares the same marginal distributions ^[1].
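The forward process can be sketched in a few lines. This is a minimal illustration that assumes a variance-preserving schedule with alpha_t = cos(pi * t / 2) and sigma_t = sin(pi * t / 2); that schedule and the helper names `vp_schedule`, `forward_sample`, and `snr` are assumptions made for the example, not something the DPM-Solver paper prescribes.

```python
import math
import random


def vp_schedule(t):
    """Assumed schedule for illustration: alpha_t = cos(pi t / 2), sigma_t = sin(pi t / 2)."""
    return math.cos(0.5 * math.pi * t), math.sin(0.5 * math.pi * t)


def forward_sample(x0, t, rng):
    """Draw x_t = alpha_t * x_0 + sigma_t * noise for a list of scalar data values."""
    alpha, sigma = vp_schedule(t)
    return [alpha * v + sigma * rng.gauss(0.0, 1.0) for v in x0]


def snr(t):
    """Signal-to-noise ratio alpha_t^2 / sigma_t^2 (undefined at t = 0, where sigma_t = 0)."""
    alpha, sigma = vp_schedule(t)
    return (alpha / sigma) ** 2


if __name__ == "__main__":
    rng = random.Random(0)
    x0 = [0.5, -1.0, 0.25]
    for t in (0.1, 0.5, 0.9):
        print(t, forward_sample(x0, t, rng), snr(t))
```

As t grows, the signal coefficient shrinks and the noise coefficient grows, so the signal-to-noise ratio falls monotonically; sampling has to undo that degradation.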

Solving the reverse SDE or ODE naively requires many small steps. As the DPM-Solver paper notes, diffusion models "generally need hundreds or thousands of sequential function evaluations (steps) of large neural networks to draw a sample" ^[1]. The original DDPM ancestral sampler often used hundreds to one thousand network evaluations. Because each evaluation of the neural network (a "function evaluation"; the total count is abbreviated NFE) is the dominant cost, sampling time grows roughly in proportion to the number of steps. Generic off-the-shelf ODE integrators such as the Euler method or higher-order Runge-Kutta schemes like Heun's method treat the entire right-hand side as an opaque function, so they converge slowly and waste evaluations on the part of the dynamics that is actually known in closed form.
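The cost of ignoring the closed-form part can be seen on the linear term alone. The sketch below integrates only dx/dt = f(t) * x with f(t) = d log(alpha_t) / dt, once with the Euler method and once with the exact factor alpha_t / alpha_s. The cosine schedule alpha_t = cos(pi * t / 2) and all function names are assumptions chosen for illustration.

```python
import math


def alpha(t):
    """Assumed signal coefficient: alpha_t = cos(pi t / 2)."""
    return math.cos(0.5 * math.pi * t)


def f(t):
    """f(t) = d log(alpha_t) / dt for the assumed schedule."""
    return -0.5 * math.pi * math.tan(0.5 * math.pi * t)


def euler_linear(x, t_start, t_end, n_steps):
    """Integrate dx/dt = f(t) * x with explicit Euler steps (dt is negative when going backward in time)."""
    dt = (t_end - t_start) / n_steps
    for i in range(n_steps):
        t = t_start + i * dt
        x = x + dt * f(t) * x
    return x


def exact_linear(x, t_start, t_end):
    """Exact solution of the same linear ODE: multiply by alpha_t / alpha_s."""
    return x * alpha(t_end) / alpha(t_start)


def relative_error(n_steps, t_start=0.9, t_end=0.1):
    exact = exact_linear(1.0, t_start, t_end)
    return abs(euler_linear(1.0, t_start, t_end, n_steps) - exact) / exact


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, relative_error(n))
```

Euler needs many steps just to track a term whose solution is available for free, which is the waste that an exponential-integrator approach avoids.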

Several faster samplers preceded DPM-Solver. DDIM (Denoising Diffusion Implicit Models) reinterpreted sampling as a deterministic non-Markovian process and cut the count to tens of steps. PNDM (Pseudo Numerical Methods for Diffusion Models, arXiv 2202.09778) applied pseudo-numerical multistep schemes on the data manifold ^[4]. Concurrently with DPM-Solver, DEIS (Diffusion Exponential Integrator Sampler, arXiv 2204.13902, ICLR 2023) independently proposed using exponential integrators that leverage the same semi-linear structure ^[5]. DPM-Solver formalized this approach with explicit convergence-order guarantees for diffusion ODEs.

How does DPM-Solver sample in around 10 steps?

DPM-Solver is built on the observation that the diffusion ODE is semi-linear. Written for the noise-prediction parameterization, the probability-flow ODE takes the form

dx_t / dt = f(t) * x_t + (g(t)^2 / (2 * sigma_t)) * eps_theta(x_t, t),

where f(t) = d log(alpha_t) / dt, g(t) is the diffusion coefficient of the forward process, and eps_theta is the noise-prediction network. The term f(t) * x_t is linear in the current state and the network term is the nonlinear part. A naive solver discretizes this whole expression. The variation-of-constants formula from the theory of linear ODEs instead gives the exact contribution of the linear part, leaving only the integral of the nonlinear term to approximate ^[1].
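For reference, the variation-of-constants step can be written out explicitly. The following is a sketch in the notation of the DPM-Solver paper ^[1], where f(t) = d log(alpha_t) / dt, the network term carries the coefficient g(t)^2 / (2 sigma_t), and \epsilon_\theta denotes the noise-prediction network.

```latex
\begin{align}
\frac{\mathrm{d}x_t}{\mathrm{d}t}
  &= f(t)\,x_t + \frac{g^2(t)}{2\sigma_t}\,\epsilon_\theta(x_t, t),
  \qquad f(t) = \frac{\mathrm{d}\log\alpha_t}{\mathrm{d}t}, \\
x_t
  &= e^{\int_s^t f(\tau)\,\mathrm{d}\tau}\,x_s
   + \int_s^t e^{\int_\tau^t f(r)\,\mathrm{d}r}\,
     \frac{g^2(\tau)}{2\sigma_\tau}\,\epsilon_\theta(x_\tau, \tau)\,\mathrm{d}\tau, \\
e^{\int_s^t f(\tau)\,\mathrm{d}\tau}
  &= \frac{\alpha_t}{\alpha_s}.
\end{align}
```

The first term of the solution is exact and costs no network evaluations; all approximation error is confined to the integral in the second term.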

The key technical step is a change of variables to lambda_t = log(alpha_t / sigma_t), which is half of the log signal-to-noise ratio (log-SNR), since the SNR is alpha_t^2 / sigma_t^2. Because lambda_t is strictly decreasing in t, it can serve as the new integration variable. Under this substitution, the exact solution of the diffusion ODE between two times s and t becomes an exponentially weighted integral of the network output over lambda ^[1]. The linear factors collapse into closed-form exponential coefficients, and what remains is a clean integral of the network output that DPM-Solver approximates with a local Taylor expansion in lambda.
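Written out, the change of variables and the lowest-order approximation look as follows. This is a sketch following the paper's notation ^[1]; h denotes the step size in lambda, and hats mark quantities re-expressed as functions of lambda.

```latex
\begin{align}
\lambda_t &= \log\frac{\alpha_t}{\sigma_t}, \qquad h = \lambda_t - \lambda_s, \\
x_t &= \frac{\alpha_t}{\alpha_s}\,x_s
      - \alpha_t \int_{\lambda_s}^{\lambda_t} e^{-\lambda}\,
        \hat\epsilon_\theta(\hat x_\lambda, \lambda)\,\mathrm{d}\lambda, \\
\int_{\lambda_s}^{\lambda_t} e^{-\lambda}\,\mathrm{d}\lambda
    &= e^{-\lambda_s} - e^{-\lambda_t}
     = \frac{\sigma_t}{\alpha_t}\,\bigl(e^{h} - 1\bigr), \\
x_t &\approx \frac{\alpha_t}{\alpha_s}\,x_s
      - \sigma_t\,\bigl(e^{h} - 1\bigr)\,\epsilon_\theta(x_s, s).
\end{align}
```

The last line keeps only the zeroth-order Taylor term of the network output and is the first-order update; keeping further Taylor terms gives the higher-order solvers.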

Approximating that integral to different orders yields a family of solvers:

Solver	Order	Network evaluations per step	Notes
DPM-Solver-1	1st	1	Mathematically equivalent to DDIM
DPM-Solver-2	2nd	2	One intermediate (midpoint-style) evaluation
DPM-Solver-3	3rd	3	Two intermediate evaluations

Increasing the order reduces discretization error, so the higher-order variants reach a target quality in fewer total steps. Notably, the first-order member of the family is exactly DDIM, which the paper proves is a special case of DPM-Solver ^[1]^[4]. This approach belongs to the class of exponential integrators, numerical methods designed for semi-linear problems that treat the linear (often stiff) part exactly via an integrating factor and only approximate the nonlinear remainder.
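The first two members of the family can be sketched on a toy problem where the exact answer is known. The example assumes scalar data drawn from N(0, DATA_STD^2) and a variance-preserving schedule, so that alpha and sigma are functions of lambda alone and the ideal noise predictor has a closed form. The names `DATA_STD`, `alpha_sigma`, `eps_model`, and `sample` are hypothetical, and the closed-form predictor stands in for a trained network.

```python
import math

DATA_STD = 0.5  # toy assumption: scalar data drawn from N(0, DATA_STD**2)


def alpha_sigma(lam):
    """Variance-preserving schedule written in lambda = log(alpha / sigma)."""
    alpha = math.sqrt(1.0 / (1.0 + math.exp(-2.0 * lam)))
    sigma = math.sqrt(1.0 / (1.0 + math.exp(2.0 * lam)))
    return alpha, sigma


def eps_model(x, lam):
    """Closed-form noise prediction for Gaussian data; stands in for the network."""
    alpha, sigma = alpha_sigma(lam)
    return sigma * x / (alpha ** 2 * DATA_STD ** 2 + sigma ** 2)


def ddim_step(x, lam_s, lam_t):
    """Deterministic DDIM update: predict x_0, then re-noise to the target level."""
    alpha_s, sigma_s = alpha_sigma(lam_s)
    alpha_t, sigma_t = alpha_sigma(lam_t)
    eps = eps_model(x, lam_s)
    x0_pred = (x - sigma_s * eps) / alpha_s
    return alpha_t * x0_pred + sigma_t * eps


def dpm_solver_1_step(x, lam_s, lam_t):
    """First-order update: exact linear part, network output frozen at lam_s."""
    alpha_s, _ = alpha_sigma(lam_s)
    alpha_t, sigma_t = alpha_sigma(lam_t)
    h = lam_t - lam_s
    return (alpha_t / alpha_s) * x - sigma_t * math.expm1(h) * eps_model(x, lam_s)


def dpm_solver_2_step(x, lam_s, lam_t):
    """Second-order update with one intermediate evaluation at the lambda midpoint."""
    alpha_s, _ = alpha_sigma(lam_s)
    alpha_t, sigma_t = alpha_sigma(lam_t)
    h = lam_t - lam_s
    lam_mid = lam_s + 0.5 * h
    alpha_m, sigma_m = alpha_sigma(lam_mid)
    u = (alpha_m / alpha_s) * x - sigma_m * math.expm1(0.5 * h) * eps_model(x, lam_s)
    return (alpha_t / alpha_s) * x - sigma_t * math.expm1(h) * eps_model(u, lam_mid)


def sample(step_fn, x, lam_start, lam_end, n_steps):
    """Apply step_fn over a grid that is uniform in lambda."""
    lams = [lam_start + (lam_end - lam_start) * i / n_steps for i in range(n_steps + 1)]
    for lam_s, lam_t in zip(lams, lams[1:]):
        x = step_fn(x, lam_s, lam_t)
    return x


def exact_solution(x, lam_start, lam_end):
    """Exact probability-flow solution for the Gaussian toy problem."""
    def marginal_var(lam):
        alpha, sigma = alpha_sigma(lam)
        return alpha ** 2 * DATA_STD ** 2 + sigma ** 2
    return x * math.sqrt(marginal_var(lam_end) / marginal_var(lam_start))


if __name__ == "__main__":
    target = exact_solution(1.3, -3.0, 3.0)
    # Both runs below use 20 evaluations of eps_model in total.
    print("DPM-Solver-1 error:", abs(sample(dpm_solver_1_step, 1.3, -3.0, 3.0, 20) - target))
    print("DPM-Solver-2 error:", abs(sample(dpm_solver_2_step, 1.3, -3.0, 3.0, 10) - target))
```

On this toy problem the DDIM and DPM-Solver-1 updates agree to floating-point precision, which mirrors the equivalence stated above, and the second-order solver is more accurate at the same evaluation budget. A real network is not linear in x, so the size of the gap here should not be read as a benchmark.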

The paper reports that DPM-Solver "can generate high-quality samples in only 10 to 20 function evaluations on various datasets" ^[1]. Concretely, on CIFAR-10 it achieved an FID of 4.70 in 10 function evaluations and 2.87 in 20, alongside a 4x to 16x speedup over previous state-of-the-art training-free samplers across several datasets and models ^[1]. The method applies to both discrete-time and continuous-time diffusion models without retraining.

What is DPM-Solver++ and how does it handle guided sampling?

The original DPM-Solver was validated mainly on unconditional and lightly guided generation. Real text-to-image systems rely on strong classifier-free guidance, which amplifies the difference between conditional and unconditional predictions by a large guidance scale. In a follow-up paper, DPM-Solver++ (arXiv 2211.01095, November 2022), the same authors showed that under large guidance scales the high-order solvers can become unstable and even slower than first-order DDIM, because the large scale narrows the convergence radius of the high-order expansion and pushes intermediate iterates outside the valid data range, causing a train-test mismatch ^[6]. The paper notes that DDIM, the common first-order baseline, "generally needs 100 to 250 steps for high-quality samples" in the guided setting ^[6].

DPM-Solver++ addresses this with three changes ^[6]:

Data-prediction parameterization. Instead of solving the ODE in terms of the noise-prediction network, DPM-Solver++ reformulates it around the data-prediction model x_theta = (x_t - sigma_t * (noise prediction)) / alpha_t, which directly predicts the clean sample. This parameterization is better behaved under strong guidance.
Thresholding. It applies dynamic thresholding (introduced for the Imagen model by Saharia et al.) to clip predicted samples back into the valid data range at each step, mitigating the train-test mismatch.
Multistep schemes. It introduces multistep variants that reuse network outputs from previous steps (in the style of Adams-Bashforth methods) to achieve high order while keeping the effective step size small, which restores stability.
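The first two changes can be sketched directly. The code below converts a noise prediction into a data prediction using the formula above, then applies a simple form of dynamic thresholding: take a high percentile s of the absolute predicted values, clip to [-s, s] when s exceeds 1, and rescale by s. The function names, the nearest-rank percentile, and the default of 0.995 are illustrative assumptions rather than details taken from the paper.

```python
import math


def to_data_prediction(x_t, eps, alpha_t, sigma_t):
    """x_theta = (x_t - sigma_t * eps) / alpha_t, applied elementwise."""
    return [(x - sigma_t * e) / alpha_t for x, e in zip(x_t, eps)]


def dynamic_threshold(x0_pred, percentile=0.995):
    """Clip to [-s, s] and divide by s, where s = max(percentile of |x0|, 1)."""
    magnitudes = sorted(abs(v) for v in x0_pred)
    # Nearest-rank percentile; a simplification chosen for this sketch.
    rank = max(0, math.ceil(percentile * len(magnitudes)) - 1)
    s = max(magnitudes[rank], 1.0)
    return [min(max(v, -s), s) / s for v in x0_pred]


if __name__ == "__main__":
    alpha_t, sigma_t = 0.8, 0.6
    x_t = [0.58, -0.68, 2.9]
    eps = [0.3, 0.2, -0.5]
    x0_pred = to_data_prediction(x_t, eps, alpha_t, sigma_t)
    print(x0_pred)
    print(dynamic_threshold(x0_pred))
```

Predictions that already lie inside [-1, 1] pass through unchanged, while out-of-range predictions are pulled back into the range the model saw during training.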

The paper presents the resulting solvers as DPM-Solver++(2S), a second-order singlestep solver, and DPM-Solver++(2M), a second-order multistep solver ^[6]. Sampler menus also list a third-order multistep variant labeled 3M. With these fixes, DPM-Solver++ "can generate high-quality samples within only 15 to 20 steps for guided sampling by pixel-space and latent-space DPMs" ^[6].
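A second-order multistep update in the data-prediction form can be sketched as follows. Each step makes one call to the data predictor and reuses the previous step's output, with the very first step falling back to first order. The toy setup (scalar Gaussian data, a variance-preserving schedule in lambda, and a closed-form predictor in place of a network) and the names `data_model` and `sample_dpmpp` are assumptions for illustration; no thresholding or guidance is included.

```python
import math

DATA_STD = 0.5  # toy assumption: scalar data drawn from N(0, DATA_STD**2)


def alpha_sigma(lam):
    """Variance-preserving schedule written in lambda = log(alpha / sigma)."""
    alpha = math.sqrt(1.0 / (1.0 + math.exp(-2.0 * lam)))
    sigma = math.sqrt(1.0 / (1.0 + math.exp(2.0 * lam)))
    return alpha, sigma


def data_model(x, lam):
    """Closed-form data prediction for Gaussian data; stands in for the network."""
    alpha, sigma = alpha_sigma(lam)
    return alpha * DATA_STD ** 2 * x / (alpha ** 2 * DATA_STD ** 2 + sigma ** 2)


def sample_dpmpp(x, lams, order=2):
    """Multistep sampler in data-prediction form over a lambda grid (order 1 or 2)."""
    prev_x0, prev_h = None, None
    for lam_s, lam_t in zip(lams, lams[1:]):
        h = lam_t - lam_s
        _, sigma_s = alpha_sigma(lam_s)
        alpha_t, sigma_t = alpha_sigma(lam_t)
        x0 = data_model(x, lam_s)  # the only model call in this step
        if order == 2 and prev_x0 is not None:
            r = prev_h / h
            # Linear extrapolation from the two most recent predictions.
            d = (1.0 + 0.5 / r) * x0 - (0.5 / r) * prev_x0
        else:
            d = x0  # first-order update, also used to start the multistep scheme
        x = (sigma_t / sigma_s) * x - alpha_t * math.expm1(-h) * d
        prev_x0, prev_h = x0, h
    return x


def exact_solution(x, lam_start, lam_end):
    """Exact probability-flow solution for the Gaussian toy problem."""
    def marginal_var(lam):
        alpha, sigma = alpha_sigma(lam)
        return alpha ** 2 * DATA_STD ** 2 + sigma ** 2
    return x * math.sqrt(marginal_var(lam_end) / marginal_var(lam_start))


if __name__ == "__main__":
    lams = [-3.0 + 6.0 * i / 20 for i in range(21)]
    target = exact_solution(1.3, lams[0], lams[-1])
    print("first-order error:", abs(sample_dpmpp(1.3, lams, order=1) - target))
    print("multistep error:  ", abs(sample_dpmpp(1.3, lams, order=2) - target))
```

Both runs cost 20 model calls, so the multistep scheme gains accuracy without extra evaluations. That property is why multistep variants are attractive when each call already includes the overhead of guidance.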

Where is DPM-Solver used, and how does it relate to other samplers?

DPM-Solver and DPM-Solver++ are now standard fast samplers in the diffusion ecosystem. In Hugging Face Diffusers, the multistep variant is implemented as the DPMSolverMultistepScheduler, and a singlestep version exists as DPMSolverSinglestepScheduler; setting the option use_karras_sigmas=True selects the noise-schedule discretization proposed by Karras et al., producing the popular "DPM++ 2M Karras" configuration ^[2]^[3]. In Stable Diffusion interfaces such as AUTOMATIC1111 and ComfyUI, samplers like "DPM++ 2M Karras," "DPM++ SDE Karras," and "DPM++ 3M SDE" are frequently recommended for their balance of speed and quality, typically run at 20 to 30 steps ^[3]. Stability AI's early Stable Diffusion demo used DPM-Solver to roughly double sampling speed by cutting steps from 50 to about 25 ^[1].
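The Karras et al. discretization mentioned above spaces noise levels by interpolating linearly in sigma^(1/rho). The sketch below reproduces that spacing rule in plain Python. The default sigma_min, sigma_max, and step count are illustrative placeholders, and the function is a standalone illustration rather than the Diffusers implementation.

```python
import math


def karras_sigmas(n, sigma_min=0.03, sigma_max=14.6, rho=7.0):
    """Noise levels from sigma_max down to sigma_min, linear in sigma**(1/rho). Requires n >= 2."""
    lo = sigma_min ** (1.0 / rho)
    hi = sigma_max ** (1.0 / rho)
    return [(hi + i / (n - 1) * (lo - hi)) ** rho for i in range(n)]


if __name__ == "__main__":
    sigmas = karras_sigmas(10)
    # In the sigma parameterization (alpha = 1), the half log-SNR is -log(sigma).
    lambdas = [-math.log(s) for s in sigmas]
    for s, lam in zip(sigmas, lambdas):
        print(round(s, 4), round(lam, 4))
```

With rho greater than 1 the spacing takes finer steps at low noise levels than a grid uniform in sigma would, which is the design choice behind the "Karras" suffix in sampler names.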

DPM-Solver sits within a lineage of fast deterministic samplers and is closely related to several of them:

DDIM is its first-order special case ^[1]^[4].
PNDM also uses multistep numerical methods but works in the time domain on the data manifold, whereas DPM-Solver works in the log-SNR domain using exponential integrators ^[4].
DEIS independently derived an exponential-integrator solver from the same semi-linear structure at nearly the same time and is implemented as DEISMultistepScheduler ^[5].
UniPC (Unified Predictor-Corrector, arXiv 2302.04867) generalized these ideas into a predictor-corrector framework that can wrap DPM-Solver-style steps for further gains at very low step counts.

The authors later released DPM-Solver-v3 (arXiv 2310.13268, NeurIPS 2023), which improves the solver using precomputed "empirical model statistics" of the network to reduce error further in the 5-to-10-step regime ^[7].

Why does DPM-Solver matter?

DPM-Solver demonstrated that much of the slowness of diffusion sampling was a property of the solver, not an inherent cost of the model. By recognizing the diffusion ODE as a semi-linear problem and solving its linear part exactly in the log-SNR variable, it brought sampling down to roughly ten network evaluations while keeping the model fixed and providing formal convergence guarantees. This made high-quality diffusion sampling fast enough for interactive use and large-scale deployment, and DPM-Solver++ extended that benefit to the strongly guided text-to-image setting that dominates practical use.

Together with DDIM, PNDM, DEIS, and later work such as UniPC and consistency-style distillation, DPM-Solver helped shift the bottleneck of diffusion generation away from sampling speed. Its descendants, especially the DPM++ multistep solvers, remain default or near-default samplers in widely used diffusion toolchains as of 2026 ^[2]^[3].

References

1. Lu, C., Zhou, Y., Bao, F., Chen, J., Li, C., Zhu, J. "DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps." NeurIPS 2022 (Oral). arXiv:2206.00927. https://arxiv.org/abs/2206.00927
2. Hugging Face Diffusers documentation, schedulers overview (DPMSolverMultistepScheduler, DPMSolverSinglestepScheduler, use_karras_sigmas). https://github.com/huggingface/diffusers/blob/main/docs/source/en/api/schedulers/overview.md
3. "Stable Diffusion Samplers: A Comprehensive Guide." Stable Diffusion Art. https://stable-diffusion-art.com/samplers/
4. Liu, L., Ren, Y., Lin, Z., Zhao, Z. "Pseudo Numerical Methods for Diffusion Models on Manifolds." ICLR 2022. arXiv:2202.09778. https://arxiv.org/abs/2202.09778
5. Zhang, Q., Chen, Y. "Fast Sampling of Diffusion Models with Exponential Integrator (DEIS)." ICLR 2023. arXiv:2204.13902. https://arxiv.org/abs/2204.13902
6. Lu, C., Zhou, Y., Bao, F., Chen, J., Li, C., Zhu, J. "DPM-Solver++: Fast Solver for Guided Sampling of Diffusion Probabilistic Models." 2022. arXiv:2211.01095. https://arxiv.org/abs/2211.01095
7. Zheng, K., Lu, C., Chen, J., Zhu, J. "DPM-Solver-v3: Improved Diffusion ODE Solver with Empirical Model Statistics." NeurIPS 2023. arXiv:2310.13268. https://arxiv.org/abs/2310.13268
