# SmoothGrad

> Source: https://aiwiki.ai/wiki/smoothgrad
> Updated: 2026-07-07
> Categories: Deep Learning, Interpretability
> License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
> From AI Wiki (https://aiwiki.ai), the free encyclopedia of artificial intelligence. Reuse freely with attribution to "AI Wiki (aiwiki.ai)".

**SmoothGrad** is a [saliency map](/wiki/saliency_map) technique that reduces visual noise in [gradient](/wiki/gradient)-based explanations of [neural network](/wiki/neural_network) predictions by averaging gradients over many noisy copies of the input. The method was introduced by Daniel Smilkov, Nikhil Thorat, [Been Kim](/wiki/been_kim), [Fernanda Viégas](/wiki/fernanda_viegas), and [Martin Wattenberg](/wiki/martin_wattenberg) of the [Google Brain](/wiki/google_brain) PAIR (People + AI Research) group in the paper "SmoothGrad: removing noise by adding noise," presented at the [ICML](/wiki/icml) 2017 Workshop on Visualization for Deep Learning. [1]

The core idea is simple enough to implement in a few lines of code. Vanilla gradient saliency maps, which display the partial derivative of a class score with respect to each input pixel, often look speckled and hard to read. SmoothGrad adds [Gaussian noise](/wiki/gaussian_noise) to the input several times, computes a saliency map for each noisy copy, and averages the results. The original paper recommends averaging roughly 10 to 50 noisy samples, with the standard deviation of the added Gaussian noise set to about 10% to 20% of the input's value range [1]. The resulting heatmap is much cleaner and tends to highlight contiguous regions of the object rather than isolated pixels.

SmoothGrad sits inside the broader field of [explainable AI](/wiki/explainable_ai) and [interpretability](/wiki/interpretability). It is closely related to other attribution methods such as [Integrated Gradients](/wiki/integrated_gradients), [Grad-CAM](/wiki/grad_cam), [SHAP](/wiki/shap), and [Layer-wise Relevance Propagation](/wiki/lrp), not all of which are purely gradient-based. It is also one of the methods most heavily scrutinized by the literature on whether saliency maps actually explain anything, particularly the 2018 "Sanity Checks" study and the 2017 "unreliability" paper covered later in this article.

## Background

Visualizing what a [convolutional neural network](/wiki/convolutional_neural_network) cares about in an image is one of the oldest problems in [computer vision](/wiki/computer_vision) interpretability. The simplest approach, the *vanilla gradient* or *sensitivity map*, was popularized by Karen Simonyan, Andrea Vedaldi, and Andrew Zisserman in 2013. [2] Given a trained classifier `S(x)`, pick the class of interest `c` and compute

```
M(x) = dS_c(x) / dx
```

The absolute value of `M(x)` at each pixel tells you how much the score `S_c` would change if you nudged that pixel. The map is fast: one [backpropagation](/wiki/backpropagation) pass. It is also, in practice, very noisy. Even on clean [ImageNet](/wiki/imagenet) photographs, the heatmap looks like confetti, with bright pixels appearing far outside the object the network supposedly recognized.
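The definition of `M(x)` can be made concrete with a small sketch. The code below is an illustration under stated assumptions, not the setup of Simonyan and colleagues: it uses a random two-layer ReLU network as a stand-in for a trained classifier, and the helper names `make_toy_classifier`, `class_score`, and `vanilla_gradient` are hypothetical.

```python
import numpy as np

def make_toy_classifier(d_in=8, d_hidden=16, n_classes=3, seed=0):
    """Random two-layer ReLU network; a stand-in for a trained classifier S(x)."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(size=(d_hidden, d_in))
    b1 = rng.normal(size=d_hidden)
    W2 = rng.normal(size=(n_classes, d_hidden))
    return W1, b1, W2

def class_score(params, x, c):
    """S_c(x): the score of class c."""
    W1, b1, W2 = params
    hidden = np.maximum(W1 @ x + b1, 0.0)
    return float(W2[c] @ hidden)

def vanilla_gradient(params, x, c):
    """M(x) = dS_c(x)/dx, computed with one manual backward pass."""
    W1, b1, W2 = params
    active = (W1 @ x + b1) > 0.0      # ReLU derivative: 1 where the unit fired
    return W1.T @ (W2[c] * active)

params = make_toy_classifier()
x = np.random.default_rng(1).normal(size=8)
saliency = np.abs(vanilla_gradient(params, x, c=2))  # per-input sensitivity
print(saliency.round(3))
```

In a real framework the backward pass would come from automatic differentiation; the manual version is used here only to keep the sketch dependency-free.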

A few alternative methods tried to clean this up before SmoothGrad arrived. *DeconvNet* (Matthew Zeiler and Rob Fergus, 2014) [8] and *Guided Backpropagation* (Springenberg and colleagues, 2014) [7] modify how gradients flow through [ReLU](/wiki/relu) layers to produce visually crisper maps. *Class Activation Mapping* (CAM) and its successor [Grad-CAM](/wiki/grad_cam) (Selvaraju and colleagues, 2017) [6] project class-specific activations from the last convolutional layer back onto the image, producing coarse heatmaps that highlight the right region but lose pixel detail. *[Integrated Gradients](/wiki/integrated_gradients)* (Sundararajan, Taly, and Yan, 2017) [5] integrates the gradient along a path from a [baseline](/wiki/baseline) to the input, which fixes saturation problems but still produces fairly noisy maps for [image classification](/wiki/image_classification).

Smilkov and colleagues asked a different question. What if you keep the vanilla gradient but evaluate it at many points near the input and average? The intuition was that noise in vanilla saliency maps comes from `S_c(x)` being highly nonlinear, so its derivative fluctuates wildly from pixel to pixel even when the network's overall decision is stable. Averaging over a small neighborhood should suppress these fluctuations while preserving the gradient direction that matters for the prediction.

## How does SmoothGrad work?

SmoothGrad replaces the vanilla saliency map with a Monte Carlo estimate of a smoothed gradient. The authors describe the procedure plainly: "take an image of interest, sample similar images by adding noise to the image, then take the average of the resulting sensitivity maps for each sampled image." [1] Given an input `x`, the method draws `n` noise samples from a Gaussian distribution `N(0, sigma^2)` with the same shape as `x`, adds each one to the input, computes a vanilla saliency map for each noisy version, and averages:

```
M_hat(x) = (1 / n) * sum_{i=1}^{n} M(x + g_i),  where g_i ~ N(0, sigma^2)
```

That is the entire algorithm. Two hyperparameters control its behavior:

- `n`, the sample count. The original paper recommends roughly 10 to 50 samples. The authors "empirically found a diminishing return," reporting "little apparent change in the visualizations for n>50." [1]
- `sigma`, the noise standard deviation. The paper expresses this as a fraction of the input dynamic range, so `sigma / (x_max - x_min)`. Values of roughly 10% to 20% of the input range work well for natural images [1]. Too little noise leaves the original speckle pattern intact. Too much noise blurs the explanation past the object boundary.
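The averaging formula and its two hyperparameters can be sketched as follows. This is a minimal illustration, not the authors' reference code: the function name `smoothgrad`, the parameter name `noise_level` (standing for `sigma / (x_max - x_min)`), and the toy score function are all assumptions made for the example.

```python
import numpy as np

def smoothgrad(grad_fn, x, n=50, noise_level=0.15, rng=None):
    """Average grad_fn over n Gaussian-perturbed copies of x.

    noise_level is sigma / (x_max - x_min), the fraction used in the text.
    """
    rng = np.random.default_rng() if rng is None else rng
    sigma = noise_level * (x.max() - x.min())
    total = np.zeros(x.shape, dtype=float)
    for _ in range(n):
        g = rng.normal(0.0, sigma, size=x.shape)  # g_i ~ N(0, sigma^2)
        total += grad_fn(x + g)                   # M(x + g_i)
    return total / n

# Toy score S(x) = sum(sin(5 * x)), so M(x) = 5 * cos(5 * x).
grad_fn = lambda z: 5.0 * np.cos(5.0 * z)
x = np.linspace(0.0, 1.0, 6)
m_hat = smoothgrad(grad_fn, x, n=50, noise_level=0.15, rng=np.random.default_rng(0))
print(m_hat.round(3))
```

Passing `grad_fn` as an argument keeps the sketch independent of any framework: any callable that returns a gradient with the same shape as its input can be plugged in.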

Later work built a few useful variants on the same recipe; neither of the following two appears in the original SmoothGrad paper, and both were evaluated in the benchmark by Hooker and colleagues [10]. *SmoothGrad squared*, written `SmoothGrad^2`, squares each noisy map, taking `M(x + g_i)^2` before averaging, which biases the heatmap toward features that consistently produce a large gradient regardless of sign. *VarGrad* replaces the mean with the variance of `M(x + g_i)` across samples. VarGrad highlights pixels where the gradient changes a lot under perturbation, which intuitively flags regions where the model's sensitivity is unstable. The original paper does note that the same averaging trick can be combined with other base methods. Replacing `M(x)` with the Integrated Gradients attribution gives *Smooth IG*, sometimes called *Smoothed Integrated Gradients*, which several interpretability libraries support.
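The three aggregation rules differ only in how the stack of noisy maps is reduced, which a short sketch makes visible. The helper names `noisy_maps` and `aggregate` and the toy gradient function are hypothetical, chosen for illustration.

```python
import numpy as np

def noisy_maps(grad_fn, x, n, sigma, rng):
    """Stack M(x + g_i) for n Gaussian noise draws; shape (n, *x.shape)."""
    return np.stack([grad_fn(x + rng.normal(0.0, sigma, size=x.shape)) for _ in range(n)])

def aggregate(maps):
    """Reduce the same samples three ways."""
    return {
        "smoothgrad": maps.mean(axis=0),                 # mean of M(x + g_i)
        "smoothgrad_squared": (maps ** 2).mean(axis=0),  # mean of M(x + g_i)^2
        "vargrad": maps.var(axis=0),                     # variance of M(x + g_i)
    }

grad_fn = lambda z: 5.0 * np.cos(5.0 * z)   # toy gradient of S(x) = sum(sin(5x))
x = np.linspace(0.0, 1.0, 6)
maps = noisy_maps(grad_fn, x, n=50, sigma=0.15, rng=np.random.default_rng(0))
variants = aggregate(maps)
for name, value in variants.items():
    print(name, value.round(3))
```

Because all three are computed from the same samples, they satisfy the usual identity: the squared variant equals VarGrad plus the square of plain SmoothGrad, so the extra variants add no extra backward passes.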

A practical detail: SmoothGrad multiplies explanation cost by `n`. SmoothGrad with 50 samples takes 50 forward and backward passes per image, although the noisy copies can be processed as a batch. For interactive visualization on small networks this is fine. For large [transformer](/wiki/transformer) models the cost adds up, which is one reason practitioners often settle for a smaller `n`.

### Why does adding noise help?

The paper offers an intuition rather than a tight theoretical proof. Its explanation is that "the derivative of the function S_c may fluctuate sharply at small scales," so "the apparent noise one sees in a sensitivity map may be due to essentially meaningless local variations in partial derivatives." [1] Treat `S_c(x)` as a function with two components: a smooth signal that captures what the model is doing, and a high-frequency component left over from the network's nonlinearities. Vanilla gradients pick up both. Local averaging acts like a low-pass filter on the gradient field; the smooth signal survives, and the high-frequency noise mostly cancels out. With a Gaussian kernel, SmoothGrad effectively computes the gradient of a function that has been convolved with a Gaussian, a standard image-processing trick repurposed for saliency.
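The low-pass reading can be checked numerically on a one-dimensional toy function. The sketch below is an illustration constructed for this article, not an experiment from the paper: it assumes a score `S(x) = x + (A / K) * sin(K * x)`, whose derivative is a constant slope of 1 plus a fast oscillation, and the constants `K`, `A`, and `sigma` are arbitrary. For Gaussian noise, the expected value of `cos(K * (x + g))` is `cos(K * x) * exp(-(K * sigma)^2 / 2)`, so the oscillation is damped exponentially in its frequency while the constant slope passes through unchanged.

```python
import numpy as np

K, A = 50.0, 2.0   # frequency and amplitude of the wiggle (arbitrary toy values)

def grad_score(x):
    """Derivative of S(x) = x + (A / K) * sin(K * x): slope 1 plus a fast wiggle."""
    return 1.0 + A * np.cos(K * x)

def smoothed_gradient(x, sigma, n, rng):
    """Monte Carlo average of the gradient over n Gaussian perturbations of x."""
    return grad_score(x + rng.normal(0.0, sigma, size=n)).mean()

x0, sigma = 0.0, 0.1
raw = grad_score(x0)   # dominated by the wiggle at this point
smooth = smoothed_gradient(x0, sigma, n=2000, rng=np.random.default_rng(0))
# Closed form for the Gaussian-smoothed gradient at x0.
expected = 1.0 + A * np.cos(K * x0) * np.exp(-0.5 * (K * sigma) ** 2)
print(raw, round(smooth, 3), round(expected, 3))
```

The raw gradient at `x0` reflects the local wiggle, while the averaged gradient lands close to the underlying slope of 1, up to Monte Carlo error.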

A second reading is that adding noise during explanation mimics what [data augmentation](/wiki/data_augmentation) does during training. If the model was trained to be robust to small perturbations, averaging over similar perturbations during explanation gives you a feature attribution that aligns with the part of the input the model actually depends on, rather than artifacts of how the loss surface curves at one specific point.

## Variants and extensions

The SmoothGrad recipe has been combined with most other attribution methods. Common variants:

| Variant | Definition | What it emphasizes |
|---|---|---|
| SmoothGrad | Mean of `M(x + g)` over `n` samples | Consistent gradient direction near `x` |
| SmoothGrad squared | Mean of `M(x + g)^2` | Magnitude of gradient regardless of sign |
| VarGrad | Variance of `M(x + g)` | Regions of high gradient instability |
| Smooth IG | Mean of Integrated Gradients over noisy inputs | Path-integrated attribution with noise reduction |
| Smooth Grad-CAM++ | Mean of Grad-CAM++ maps over noisy inputs | Cleaner class activation heatmaps |

VarGrad is sometimes underrated. Adebayo and colleagues found that VarGrad behaved more sensibly than vanilla SmoothGrad on several sanity checks (covered below), and a later benchmark by Hooker and colleagues found that VarGrad and SmoothGrad squared were the only tested methods that reliably beat random pixel ranking [10]. That is a notable result given how rarely VarGrad shows up in tutorials.

## How does SmoothGrad compare with other saliency methods?

SmoothGrad sits in the gradient-based attribution family alongside several other methods. Some of them predate it and take a different route: Layer-wise Relevance Propagation (Bach and colleagues, 2015) decomposes a prediction backward through the network layer by layer rather than reading a single raw gradient [9].

| Method | Year | Mechanism | Output type |
|---|---|---|---|
| Vanilla gradient (Simonyan et al.) | 2013 | Single backward pass | Pixel-level sensitivity |
| DeconvNet (Zeiler & Fergus) | 2014 | Modified ReLU backward pass | Pixel-level reconstruction |
| Guided Backprop (Springenberg et al.) | 2014 | Vanilla + DeconvNet rules | Sharp pixel-level maps |
| LRP (Bach et al.) | 2015 | Layer-wise relevance propagation | Pixel-level attribution |
| Grad-CAM (Selvaraju et al.) | 2017 | Gradient-weighted CAM | Coarse class-specific heatmap |
| Integrated Gradients (Sundararajan et al.) | 2017 | Path integral from baseline to input | Pixel-level attribution |
| SmoothGrad (Smilkov et al.) | 2017 | Average gradient over noised inputs | Pixel-level sensitivity, denoised |
| SHAP (Lundberg & Lee) | 2017 | Shapley value approximation | Feature-level attribution |

A point that is easy to miss: SmoothGrad and Grad-CAM answer different questions. Grad-CAM tells you which coarse spatial region the network used. SmoothGrad tells you which individual pixels, at input resolution, the class score is most sensitive to. The two are often combined.

## How reliable is SmoothGrad? Critiques and limitations

SmoothGrad is one of the most-tested saliency methods, partly because it is simple to reproduce and partly because it is widely cited. Three studies in particular shaped the conversation.

The first is *Sanity Checks for Saliency Maps* by Julius Adebayo, Justin Gilmer, Michael Muelly, Ian Goodfellow, Moritz Hardt, and Been Kim, [NeurIPS](/wiki/neurips) 2018. [3] The authors proposed two tests: a model parameter randomization test (saliency maps should change if you randomize the model's weights) and a data randomization test (maps should change if you retrain the model on randomly permuted labels). Methods that fail behave like edge detectors: their output depends mostly on the input image, regardless of what the model learned. Vanilla gradient, Integrated Gradients, and SmoothGrad all passed the parameter randomization test, while Guided Backprop and Guided Grad-CAM produced visually similar maps even with a randomized model. VarGrad performed especially well, which is why it has gained a quiet following.
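The parameter randomization test can be sketched on a toy network. This is a simplified illustration rather than the authors' protocol: it assumes a random two-layer ReLU network in place of a trained model, re-draws all weights at once, and uses Spearman rank correlation as the similarity measure, which is only one of several measures one could choose. All function names are hypothetical.

```python
import numpy as np

def init_params(seed, d_in=64, d_hidden=32):
    rng = np.random.default_rng(seed)
    return rng.normal(size=(d_hidden, d_in)), rng.normal(size=d_hidden), rng.normal(size=d_hidden)

def gradient(params, x):
    """Gradient of the score w2 . relu(W1 x + b1) with respect to x."""
    W1, b1, w2 = params
    return W1.T @ (w2 * ((W1 @ x + b1) > 0.0))

def smoothgrad(params, x, n=25, sigma=0.2, seed=0):
    rng = np.random.default_rng(seed)
    return np.mean([gradient(params, x + rng.normal(0.0, sigma, size=x.shape)) for _ in range(n)], axis=0)

def rank_correlation(a, b):
    """Spearman correlation via ranks (assumes no ties, true for continuous values)."""
    rank = lambda v: np.argsort(np.argsort(v))
    return float(np.corrcoef(rank(a), rank(b))[0, 1])

x = np.random.default_rng(42).normal(size=64)
original = init_params(seed=1)     # stands in for the trained model
randomized = init_params(seed=2)   # same architecture, re-drawn weights
map_original = np.abs(smoothgrad(original, x))
map_randomized = np.abs(smoothgrad(randomized, x))
similarity = rank_correlation(map_original, map_randomized)
print(round(similarity, 3))
```

A method passes the check when the similarity between the two maps is low. A value near 1 would mean the map is driven by the input rather than by the weights.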

The second study is *The (Un)reliability of Saliency Methods* by Pieter-Jan Kindermans, Sara Hooker, Julius Adebayo, Maximilian Alber, Kristof Schütt, Sven Dähne, Dumitru Erhan, and Been Kim (2017, arXiv:1711.00867). [4] The paper showed that several methods are not invariant under a simple input transformation: adding a constant to every pixel, with the shift compensated in the first layer's bias so that the model's predictions are unchanged, still changes the explanation. Gradient times input was the clearest offender. SmoothGrad inherits the invariance properties of the attribution method it wraps: applied to plain gradients it is unaffected by the shift, but applied to a base method that lacks input invariance it is not immune.
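The constant-shift construction can be reproduced in a few lines. The sketch below is a toy version under assumptions, not the paper's experiment: it uses a random two-layer ReLU network, a shift of 3.0 on every input dimension, and compares only the plain gradient with gradient times input. Variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1, w2 = rng.normal(size=(16, 8)), rng.normal(size=16), rng.normal(size=16)
shift = np.full(8, 3.0)                  # constant added to every input dimension

def score(x, bias):
    return float(w2 @ np.maximum(W1 @ x + bias, 0.0))

def gradient(x, bias):
    return W1.T @ (w2 * ((W1 @ x + bias) > 0.0))

x1, bias1 = rng.normal(size=8), b1       # network 1 on the original input
x2, bias2 = x1 + shift, b1 - W1 @ shift  # network 2: shifted input, compensating bias

grad1, grad2 = gradient(x1, bias1), gradient(x2, bias2)
gxi1, gxi2 = grad1 * x1, grad2 * x2      # gradient times input

print("same score:        ", np.isclose(score(x1, bias1), score(x2, bias2)))
print("same gradient:     ", np.allclose(grad1, grad2))
print("same grad * input: ", np.allclose(gxi1, gxi2))
```

Both networks compute the same function of the underlying data, so the plain gradient is identical, while gradient times input changes because it multiplies by the shifted input values.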

The third is *A Benchmark for Interpretability Methods in Deep Neural Networks* by Sara Hooker, Dumitru Erhan, Pieter-Jan Kindermans, and Been Kim (NeurIPS 2019), which introduced the ROAR (Remove And Retrain) protocol: rank pixels by importance, erase the top-ranked ones, retrain, and measure how much accuracy falls. Hooker and colleagues reported that "only certain ensemble based approaches ... outperform such a random assignment of importance," and identified those approaches as VarGrad and SmoothGrad squared, while base gradients, Integrated Gradients, and Guided Backprop were on par with or worse than a random ranking [10]. The finding is a double-edged endorsement: it validates the squared and variance forms of the SmoothGrad idea while cautioning that the plain averaged gradient is not, on its own, a reliable importance estimate.
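The "rank and erase" step of ROAR can be sketched as below. This covers only the pixel-removal stage: the retraining and accuracy measurement that define the protocol are omitted, and the function name `remove_top_fraction` and the choice of fill value are assumptions made for illustration.

```python
import numpy as np

def remove_top_fraction(image, attribution, fraction, fill_value=0.0):
    """Replace the top `fraction` of pixels, ranked by attribution, with fill_value."""
    k = int(round(fraction * image.size))
    top = np.argsort(attribution, axis=None)[::-1][:k]  # flat indices, most important first
    degraded = image.astype(float).ravel().copy()
    degraded[top] = fill_value
    return degraded.reshape(image.shape)

image = np.arange(1.0, 9.0).reshape(2, 4)
attribution = np.array([[0.1, 0.9, 0.2, 0.8],
                        [0.3, 0.0, 0.7, 0.4]])
degraded = remove_top_fraction(image, attribution, fraction=0.25, fill_value=image.mean())
print(degraded)
```

Here the two highest-attribution pixels (values 2 and 4) are replaced by the image mean of 4.5. In the full protocol a model would then be retrained on the degraded dataset, and a better attribution method would cause a larger accuracy drop than a random ranking.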

A further recurring critique is computational cost. Running an explanation that requires 25 to 50 backward passes per image is fine for offline analysis but uncomfortable for interactive UIs and infeasible for very large models.

Finally, SmoothGrad inherits a deeper limitation of all gradient-based methods: it answers "how would the score change if I perturbed the input?" rather than "why did the model decide this is a dog?" Those are related but not identical. For medical imaging, sensitivity is a reasonable proxy for explanation. For model debugging, it can highlight unimportant features that the model has not yet learned to ignore.

## Which libraries implement SmoothGrad?

SmoothGrad is small enough that any deep learning framework can implement it in a few dozen lines, and most interpretability libraries include it as a built-in option. The original authors released a reference implementation in the PAIR-code/saliency repository, which still ships SmoothGrad, VarGrad, and Smooth IG [12].

| Library | Framework | Notes |
|---|---|---|
| PAIR-code/saliency | [TensorFlow](/wiki/tensorflow) | Original Google PAIR reference; includes SmoothGrad, VarGrad, Smooth IG |
| Captum | [PyTorch](/wiki/pytorch) | NoiseTunnel wrapper applies SmoothGrad to any underlying attribution method |
| tf-keras-vis | TensorFlow / Keras | Includes SmoothGrad alongside Grad-CAM and Score-CAM |
| iNNvestigate | TensorFlow / Keras | Supports SmoothGrad and many alternatives |
| Quantus | Framework-agnostic | Benchmarks SmoothGrad on faithfulness and robustness metrics |
| Alibi | TensorFlow / PyTorch | Production-oriented explanation library |

The Captum approach is worth singling out. Rather than implement SmoothGrad as a standalone method, Captum exposes a `NoiseTunnel` class that takes any base attribution method and applies SmoothGrad-style noise averaging on top of it [11]. `NoiseTunnel(IntegratedGradients(model))` gives Smooth IG. `NoiseTunnel(Saliency(model))` gives classic SmoothGrad. This composability matches the spirit of the original paper, which presented SmoothGrad as a transformation rather than a method.
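The "transformation rather than a method" idea can be illustrated without any library. The sketch below is not Captum's API: `noise_tunnel`, `saliency`, and `integrated_gradients` are hypothetical NumPy functions written for this example, applied to a toy score `S(x) = sum(sin(x))` whose gradient is known in closed form.

```python
import numpy as np

def score(x):
    return float(np.sum(np.sin(x)))   # toy differentiable score S(x)

def saliency(x):
    return np.cos(x)                  # dS/dx for the toy score

def integrated_gradients(x, baseline=None, steps=200):
    """Midpoint Riemann approximation of the path integral from baseline to x."""
    baseline = np.zeros_like(x) if baseline is None else baseline
    alphas = (np.arange(steps) + 0.5) / steps
    avg_grad = np.mean([saliency(baseline + a * (x - baseline)) for a in alphas], axis=0)
    return (x - baseline) * avg_grad

def noise_tunnel(attribution_fn, x, n=25, sigma=0.1, rng=None):
    """Average any attribution function over n Gaussian-perturbed inputs."""
    rng = np.random.default_rng() if rng is None else rng
    return np.mean([attribution_fn(x + rng.normal(0.0, sigma, size=x.shape)) for _ in range(n)], axis=0)

x = np.linspace(0.5, 2.0, 5)
classic_smoothgrad = noise_tunnel(saliency, x, rng=np.random.default_rng(0))
smooth_ig = noise_tunnel(integrated_gradients, x, rng=np.random.default_rng(0))
print(classic_smoothgrad.round(3))
print(smooth_ig.round(3))
```

The wrapper never inspects the base method, so swapping `saliency` for `integrated_gradients` is the only change needed to move from classic SmoothGrad to Smooth IG.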

## What is SmoothGrad used for?

SmoothGrad has been applied wherever gradient saliency is used:

- *Medical imaging*: Radiology and pathology models use SmoothGrad to highlight which pixels of an X-ray, MRI, or histology slide drove a diagnosis. Denoised maps are easier for clinicians to read than vanilla gradients, though the Adebayo critique applies here too.
- *Model debugging*: Developers use SmoothGrad to catch models that latched onto background features. The husky-vs-wolf classifier that turned out to be a snow-vs-grass detector is the canonical example.
- *Adversarial robustness*: SmoothGrad-style noise averaging has also been explored in the context of gradient-based adversarial attacks, though smoothing the gradient used for an explanation does not by itself make the underlying model robust.
- *Comparative research*: SmoothGrad is a standard baseline in any paper proposing a new saliency method.
- *Beyond images*: The math does not care about modality. SmoothGrad has been applied to text (with noise in embedding space) and scientific data like genomics.

## How influential is SmoothGrad?

The original ICML 2017 workshop paper had accumulated more than 2,000 citations indexed by Semantic Scholar as of 2026, putting it among the most widely cited saliency-method papers of its era [13]. Part of the appeal is the simplicity: SmoothGrad is a change of only a few lines to any existing gradient saliency pipeline. Part is the visual impact, since the side-by-side comparisons in the paper are striking even to readers without an ML background.

The method became a workhorse in interpretability evaluation. When researchers propose a new explanation technique, they typically benchmark against SmoothGrad as the denoised vanilla gradient and against Integrated Gradients as a path-based alternative. [Been Kim](/wiki/been_kim) went on to develop Testing with Concept Activation Vectors (TCAV) and other concept-based interpretability methods. [Fernanda Viégas](/wiki/fernanda_viegas) and [Martin Wattenberg](/wiki/martin_wattenberg) led Google's PAIR group, which produced TensorFlow Playground, the What-If Tool, and the Embedding Projector. SmoothGrad fits inside that broader program of making model behavior visible to humans without overclaiming what the resulting visualizations actually mean.

## See also

- [Saliency map](/wiki/saliency_map)
- [Integrated Gradients](/wiki/integrated_gradients)
- [Grad-CAM](/wiki/grad_cam)
- [SHAP](/wiki/shap)
- [LRP](/wiki/lrp)
- [Explainable AI](/wiki/explainable_ai)
- [Interpretability](/wiki/interpretability)
- [Captum](/wiki/captum)

## References

1. Smilkov, D., Thorat, N., Kim, B., Viégas, F., & Wattenberg, M. (2017). "SmoothGrad: removing noise by adding noise." Workshop on Visualization for Deep Learning, ICML 2017. arXiv:1706.03825.
2. Simonyan, K., Vedaldi, A., & Zisserman, A. (2013). "Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps." arXiv:1312.6034.
3. Adebayo, J., Gilmer, J., Muelly, M., Goodfellow, I., Hardt, M., & Kim, B. (2018). "Sanity Checks for Saliency Maps." Advances in Neural Information Processing Systems 31 (NeurIPS 2018). arXiv:1810.03292.
4. Kindermans, P-J., Hooker, S., Adebayo, J., Alber, M., Schütt, K. T., Dähne, S., Erhan, D., & Kim, B. (2017). "The (Un)reliability of Saliency Methods." arXiv:1711.00867.
5. Sundararajan, M., Taly, A., & Yan, Q. (2017). "Axiomatic Attribution for Deep Networks." Proceedings of the 34th International Conference on Machine Learning (ICML 2017). arXiv:1703.01365.
6. Selvaraju, R. R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., & Batra, D. (2017). "Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization." Proceedings of the IEEE International Conference on Computer Vision (ICCV 2017). arXiv:1610.02391.
7. Springenberg, J. T., Dosovitskiy, A., Brox, T., & Riedmiller, M. (2014). "Striving for Simplicity: The All Convolutional Net." arXiv:1412.6806.
8. Zeiler, M. D., & Fergus, R. (2014). "Visualizing and Understanding Convolutional Networks." European Conference on Computer Vision (ECCV 2014). arXiv:1311.2901.
9. Bach, S., Binder, A., Montavon, G., Klauschen, F., Müller, K-R., & Samek, W. (2015). "On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation." PLOS ONE, 10(7), e0130140.
10. Hooker, S., Erhan, D., Kindermans, P-J., & Kim, B. (2019). "A Benchmark for Interpretability Methods in Deep Neural Networks." Advances in Neural Information Processing Systems 32 (NeurIPS 2019). arXiv:1806.10758.
11. Kokhlikyan, N., Miglani, V., Martin, M., Wang, E., Alsallakh, B., Reynolds, J., et al. (2020). "Captum: A unified and generic model interpretability library for PyTorch." arXiv:2009.07896.
12. PAIR-code/saliency GitHub repository, https://github.com/PAIR-code/saliency, accessed 2026.
13. Semantic Scholar. "SmoothGrad: removing noise by adding noise." https://www.semanticscholar.org/paper/f538dca4def5167a32fbc12107b69a05f0c9d832, accessed 2026.