Integrated Gradients

Integrated Gradients (IG) is a model interpretability technique introduced by Mukund Sundararajan, Ankur Taly, and Qiqi Yan in their 2017 ICML paper "Axiomatic Attribution for Deep Networks." ^[1] The method assigns an attribution score to every input feature of a neural network prediction, indicating how much that feature contributed to the model's output relative to a chosen reference input known as a baseline. Unlike earlier saliency map techniques that rely on raw gradients evaluated at a single point, IG averages gradients along a straight-line path from the baseline to the actual input. This integration step is what gives the method its name.

IG sits within the broader field of explainable AI and attribution methods, and it has become one of the most widely cited gradient-based attribution techniques for deep learning models. The paper's main contribution is an axiomatic framework: the authors identified two desirable properties for any attribution method, Sensitivity and Implementation Invariance, argued that most earlier methods violate at least one of them, and showed that IG satisfies both. They further showed that path-integral methods are the only attribution methods that satisfy Implementation Invariance, Sensitivity (b), Linearity, and Completeness together, and that IG is the unique path method that also preserves symmetry. This grounding is closely tied to the Aumann-Shapley value from cooperative game theory, which generalizes the discrete Shapley value to continuous settings.

IG has been adopted by major libraries including Captum for PyTorch, the Saliency library for TensorFlow, Alibi, and SHAP. It has been applied to image classification on networks like Inception and ResNet trained on ImageNet, to natural language tasks with BERT and other transformer models, to tabular data in finance and healthcare, and to genomics for identifying regulatory motifs in DNA. ^[2]^[3]

Mathematical formulation

Let F: R^n -> R be a differentiable function representing a neural network output (for example, a class probability or logit). Let x be the input of interest and x' be a chosen baseline. The Integrated Gradient for input feature i is defined as the path integral of the partial derivative of F with respect to feature i, taken along the straight-line interpolation between x' and x:

IG_i(x) = (x_i - x'_i) * integral from alpha=0 to 1 of dF(x' + alpha * (x - x')) / dx_i  d(alpha)

In words: scale each input dimension by the difference between the input and baseline values, and multiply by the average partial derivative of the model along the linear path between them. The factor (x_i - x'_i) ensures the attribution is zero when an input feature equals the baseline, and the integral averages out local fluctuations in the gradient caused by saturated activations or sharp nonlinearities. ^[1]

Numerical approximation

The integral is rarely computed analytically. In practice, IG is approximated by a Riemann sum with m discrete steps:

IG_i(x) ~ (x_i - x'_i) / m * sum from k=1 to m of dF(x' + (k/m) * (x - x')) / dx_i

Each step requires one forward pass and one backward pass through the network. The original paper recommends 20 to 300 steps depending on how much the model output changes between the baseline and input. ^[1] For convolutional networks like Inception V3 evaluated on ImageNet, 50 steps is a common default that converges within a few percent of completeness.
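The Riemann sum above can be sketched in a few lines of plain Python. This is a minimal illustration, not a library implementation: the toy model `F` (a logistic unit with made-up weights `W` and bias `B`) and its hand-written gradient `grad_F` are assumptions introduced for the example, standing in for a real network and its autograd.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical toy model: F(x) = sigmoid(w . x + b) with made-up weights.
W = [2.0, -1.0, 0.5]
B = -0.25

def F(x):
    return sigmoid(sum(w * xi for w, xi in zip(W, x)) + B)

def grad_F(x):
    # Analytic gradient of the logistic unit: dF/dx_i = s * (1 - s) * w_i.
    s = F(x)
    return [s * (1.0 - s) * w for w in W]

def integrated_gradients(x, baseline, m=50):
    """Right-endpoint Riemann sum with m steps, alpha = k/m for k = 1..m."""
    n = len(x)
    grad_sum = [0.0] * n
    for k in range(1, m + 1):
        alpha = k / m
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(grad_F(point)):
            grad_sum[i] += g
    # Scale the averaged gradient by (x_i - x'_i).
    return [(xi - b) * s / m for xi, b, s in zip(x, baseline, grad_sum)]

x = [1.0, 2.0, -1.0]
baseline = [0.0, 0.0, 0.0]
attributions = integrated_gradients(x, baseline, m=300)
print(attributions, sum(attributions), F(x) - F(baseline))
```

Printing the sum of the attributions next to F(x) - F(x') gives a quick check that the chosen m is large enough for this model.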

A simple worked example

Consider a one-dimensional ReLU network F(x) = max(0, x - 0.5) evaluated at x = 1.0 with baseline x' = 0.0. The model output is F(1.0) = 0.5 and F(0.0) = 0.0. The gradient dF/dx is 0 for x < 0.5 and 1 for x > 0.5. Along the path from 0 to 1 the gradient equals 1 for half the path and 0 for the other half, so the path-averaged gradient is 0.5. Multiplying by (x - x') = 1 gives an attribution of 0.5, exactly equal to F(x) - F(x'). Vanilla saliency, by contrast, would report a gradient of 1 evaluated at x = 1.0, ignoring the region between 0 and 0.5 where the function is flat.
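The arithmetic of this example can be reproduced directly. The sketch below assumes a right-endpoint Riemann sum with m = 100 steps and treats the gradient at the kink x = 0.5 as 0, which is a convention chosen for the illustration.

```python
def F(x):
    return max(0.0, x - 0.5)

def dF(x):
    # Convention for this sketch: the gradient at the kink x = 0.5 is taken as 0.
    return 1.0 if x > 0.5 else 0.0

x, baseline, m = 1.0, 0.0, 100

# Average the gradient at alpha = k/m for k = 1..m along the straight path.
avg_grad = sum(dF(baseline + (k / m) * (x - baseline)) for k in range(1, m + 1)) / m
ig = (x - baseline) * avg_grad   # path-averaged gradient times (x - x')
vanilla = dF(x)                  # gradient evaluated only at the input

print(ig, vanilla, F(x) - F(baseline))  # 0.5 1.0 0.5
```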

The axiomatic foundation

The central contribution of the IG paper is the introduction of two axioms that any attribution method should satisfy. The authors argue that prior literature had judged methods primarily by visual appeal rather than by formal properties, making it hard to know whether a saliency map captures model behavior or merely produces plausible-looking heatmaps. ^[1]

Sensitivity

The Sensitivity (a) axiom states: if an input x and baseline x' differ in a single feature, and the model output differs between them, then that feature must receive a non-zero attribution. The Sensitivity (b) axiom adds a complementary condition: if a function does not mathematically depend on a variable, then the attribution to that variable should be zero.

Vanilla saliency map methods, which compute dF/dx at the input point alone, can fail Sensitivity (a) because of saturated activations. In deep networks with many ReLU units, the gradient of a class score with respect to an input pixel can be exactly zero even when changing the pixel from black to its observed value would change the prediction. IG circumvents this by averaging gradients along the interpolation path. Because IG attributions sum to F(x) - F(x'), a single differing feature that changes the output must receive non-zero credit. ^[1]
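A one-variable function that saturates makes the failure concrete. The sketch below uses F(x) = 1 - ReLU(1 - x) with baseline 0 and input 2, modelled on the one-variable example discussed in the original paper; the step count and the right-endpoint rule are choices made for this illustration.

```python
def relu(z):
    return max(0.0, z)

def F(x):
    return 1.0 - relu(1.0 - x)

def dF(x):
    # F has slope 1 for x < 1 and is flat (saturated) for x >= 1.
    return 1.0 if x < 1.0 else 0.0

x, baseline, m = 2.0, 0.0, 200

vanilla_gradient = dF(x)  # evaluated in the saturated region, so it is zero
avg_grad = sum(dF(baseline + (k / m) * (x - baseline)) for k in range(1, m + 1)) / m
ig = (x - baseline) * avg_grad
output_change = F(x) - F(baseline)

# The output changes by 1, the local gradient is 0, and IG is close to 1.
print(vanilla_gradient, ig, output_change)
```

The local gradient assigns no credit to a feature that alone accounts for the whole output change, while the path average recovers it up to discretization error.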

Implementation invariance

The Implementation Invariance axiom requires that two networks computing the exact same input-output function F should produce identical attributions, even if they have different internal architectures. Two networks are functionally equivalent if F_1(x) = F_2(x) for every input x.

This axiom is satisfied automatically by methods that depend only on input gradients, since the gradient of F is a property of the function, not of its implementation. Vanilla gradients and IG both satisfy it. DeepLIFT, proposed by Shrikumar, Greenside, and Kundaje, replaces the partial derivative with a reference difference quotient and propagates modified gradients backward through the network. ^[4] Because DeepLIFT rules depend on the choice of nonlinearity and on how layers are decomposed, two functionally equivalent networks with different layer structures can yield different attributions. The same critique applies to LRP (Bach et al. 2015). ^[5]

IG is therefore distinguished by satisfying both axioms simultaneously. Sundararajan and colleagues prove that, among the family of path integration methods that integrate gradients along some path from baseline to input, the linear path is the unique choice that satisfies a stronger set of axioms including Symmetry-Preserving (symmetric inputs receive equal attribution). ^[1]

Completeness and connection to Shapley values

IG satisfies a useful accounting property called Completeness: the per-feature attributions sum exactly to the difference in model output between the input and the baseline.

sum over i of IG_i(x) = F(x) - F(x')

This is a direct consequence of the fundamental theorem of calculus applied to the path integral, and it means the attributions can be interpreted as a decomposition of the prediction change into per-feature contributions. Completeness also makes IG comparable to SHAP and other attribution methods that produce a budgeted distribution of credit. ^[6]
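Completeness is easy to check numerically on a small function. The two-feature polynomial below and its hand-written gradient are assumptions made up for this sketch; any differentiable function would do.

```python
def F(x):
    return x[0] ** 2 * x[1] + x[1]

def grad_F(x):
    # Analytic partial derivatives of the hypothetical function above.
    return [2.0 * x[0] * x[1], x[0] ** 2 + 1.0]

def integrated_gradients(x, baseline, m=1000):
    n = len(x)
    grad_sum = [0.0] * n
    for k in range(1, m + 1):
        alpha = k / m
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(grad_F(point)):
            grad_sum[i] += g
    return [(xi - b) * s / m for xi, b, s in zip(x, baseline, grad_sum)]

x = [2.0, 3.0]
baseline = [1.0, 1.0]
attributions = integrated_gradients(x, baseline)

total = sum(attributions)
expected = F(x) - F(baseline)  # 15 - 2 = 13
gap = abs(total - expected)    # small, and shrinks as m grows
print(attributions, total, expected, gap)
```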

The connection to game theory runs deeper than the surface analogy. IG is mathematically equivalent to the Aumann-Shapley value for the linear path between baseline and input. Aumann and Shapley developed this in 1974 as a continuous extension of the discrete Shapley value for cost allocation where players contribute fractional amounts. ^[7] In game-theoretic terms, IG distributes the payout F(x) - F(x') among features in a way that satisfies Efficiency, Linearity, Symmetry, and the Dummy axioms.

Sundararajan and Najmi's 2020 paper "The many Shapley values for model explanation" later clarified the relationship between IG, KernelSHAP, baseline Shapley, and other Shapley-derived methods, showing they correspond to different choices of game formulation. ^[8] IG's specific choice of a deterministic linear path makes it computationally tractable in continuous settings where a true expectation over feature subsets would require exponential evaluation.
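That different game formulations give different numbers can be seen on a two-feature toy function. The sketch below compares IG with an exact permutation-based Shapley value in which an absent feature takes its baseline value (baseline Shapley, in the terminology above). The function F(x) = x_0 * x_1^2 and all helper names are assumptions chosen for illustration.

```python
from itertools import permutations
from math import factorial

def F(x):
    return x[0] * x[1] ** 2

def grad_F(x):
    return [x[1] ** 2, 2.0 * x[0] * x[1]]

def integrated_gradients(x, baseline, m=1000):
    n = len(x)
    grad_sum = [0.0] * n
    for k in range(1, m + 1):
        alpha = k / m
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(grad_F(point)):
            grad_sum[i] += g
    return [(xi - b) * s / m for xi, b, s in zip(x, baseline, grad_sum)]

def baseline_shapley(x, baseline):
    """Exact Shapley values; an absent feature is set to its baseline value."""
    n = len(x)
    phi = [0.0] * n
    for order in permutations(range(n)):
        current = list(baseline)
        prev = F(current)
        for i in order:
            current[i] = x[i]        # switch feature i on
            new = F(current)
            phi[i] += new - prev     # marginal contribution in this ordering
            prev = new
    return [p / factorial(n) for p in phi]

x = [2.0, 3.0]
baseline = [0.0, 0.0]
ig = integrated_gradients(x, baseline)
shap = baseline_shapley(x, baseline)
print(ig, shap)  # both sum to about F(x) - F(baseline) = 18, split differently
```

Both attributions distribute the same total of 18, but baseline Shapley splits it evenly while IG gives roughly twice as much credit to the squared feature.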

Choice of baseline

The choice of baseline x' is the most consequential modeling decision in IG and the most discussed practical issue in the literature. The baseline defines what "absent" or "neutral" features mean. Different baselines produce different attributions, sometimes dramatically so.

The original paper recommends that the baseline should produce a near-zero output: F(x') ~ 0. For an image classifier this typically means that a baseline image should not be predicted as the target class. Common baseline choices include:

Baseline type	Description	Typical use case
Zero / black image	Input replaced with all zeros (or zero after normalization)	Default for image models, often used with Inception, ResNet
Random noise	Pixels sampled from Gaussian or uniform noise	Avoids systematic bias of black baseline; common alternative
Gaussian blur	Heavily blurred version of the input	Removes high-frequency content but preserves overall structure
Mean image	Average of training set	Represents an "average" example
Multiple baselines	Average IG over many baselines	Reduces baseline sensitivity, used in Expected Gradients
Padding token	Embedding of a special PAD or [MASK] token	Standard for transformer NLP models
Empty string embedding	Zero vector in embedding space	Alternative for text classification

The black-image baseline has been criticized because it is itself informative: a black region in a natural image is unusual, and the gradient with respect to a black pixel may emphasize edge detectors that fire on dark regions rather than truly indicating absence of the feature. ^[9] Many practitioners now use noise or multi-baseline approaches.

Erion and colleagues introduced Expected Gradients in 2021, which marginalizes over a distribution of baselines (typically the training distribution) by averaging IG attributions across sampled baselines. ^[10] This approximates the Aumann-Shapley value with respect to the data distribution and removes the need to pick a single arbitrary baseline.
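The averaging idea can be sketched as follows. This is a simplified illustration that computes a full IG attribution per baseline and then averages, following the description above; it is not the sampling scheme of any particular library. The toy function and the list of baselines (standing in for training examples) are made up for the example.

```python
def F(x):
    return x[0] ** 2 + x[0] * x[1]

def grad_F(x):
    return [2.0 * x[0] + x[1], x[0]]

def integrated_gradients(x, baseline, m=500):
    n = len(x)
    grad_sum = [0.0] * n
    for k in range(1, m + 1):
        alpha = k / m
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(grad_F(point)):
            grad_sum[i] += g
    return [(xi - b) * s / m for xi, b, s in zip(x, baseline, grad_sum)]

x = [1.0, 2.0]
# Hypothetical reference set standing in for samples from the training data.
baselines = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]

per_baseline = [integrated_gradients(x, b) for b in baselines]
expected_grads = [
    sum(attr[i] for attr in per_baseline) / len(baselines) for i in range(len(x))
]

# Completeness now holds against the average baseline output.
mean_baseline_output = sum(F(b) for b in baselines) / len(baselines)
print(per_baseline)
print(expected_grads, F(x) - mean_baseline_output)
```

The per-baseline attributions differ noticeably from one another, which is the baseline sensitivity the averaging is meant to reduce.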

For text models, the baseline is usually the embedding of a PAD or zero token, since the model itself only sees continuous embeddings. Attribution is computed with respect to the embedding vectors and then summed across embedding dimensions to give a scalar score per token.
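The reduction from embedding-level to token-level scores is a plain sum over the embedding axis. In the sketch below the tokens and the attribution values are made-up placeholders, not output from a real model.

```python
# Hypothetical IG attributions with shape (num_tokens, embedding_dim).
tokens = ["the", "movie", "was", "great"]
embedding_attributions = [
    [0.01, -0.02, 0.00],
    [0.10, 0.05, -0.03],
    [0.00, 0.01, -0.01],
    [0.40, 0.25, 0.15],
]

# One scalar per token: sum the attribution over embedding dimensions.
token_scores = [sum(row) for row in embedding_attributions]

# Optional L1 normalization, often used only for display purposes.
l1 = sum(abs(s) for s in token_scores)
normalized = [s / l1 for s in token_scores]

top_token = max(zip(tokens, token_scores), key=lambda pair: pair[1])[0]
print(list(zip(tokens, token_scores)), top_token)
```

Summing (rather than taking a norm) keeps the sign of each token's contribution and preserves the completeness total across tokens.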

Practical computation

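A minimal reference implementation of IG in PyTorch might look like this:

import torch

def integrated_gradients(model, x, x_prime, target_class, m=50):
    # Right-endpoint Riemann sum: alpha = k/m for k = 1..m.
    alphas = torch.arange(1, m + 1, dtype=x.dtype, device=x.device) / m
    # Reshape to (m, 1, ..., 1) so alphas broadcast over the feature dimensions.
    alphas = alphas.view(-1, *([1] * x.dim()))
    interpolated = x_prime + alphas * (x - x_prime)
    grads = []
    for x_alpha in interpolated:
        x_alpha = x_alpha.detach().requires_grad_(True)
        # Assumes model maps one unbatched input to a 1-D vector of class scores.
        output = model(x_alpha)[target_class]
        grad = torch.autograd.grad(output, x_alpha)[0]
        grads.append(grad)
    # Average gradient along the path, then scale by (x - x').
    avg_grad = torch.stack(grads).mean(dim=0)
    attributions = (x - x_prime) * avg_grad
    return attributions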

In production the interpolated inputs are batched, so a single forward pass computes outputs for all m points at once. Memory becomes a constraint at large m, and implementations often chunk the integration into mini-batches.

A common diagnostic is to verify completeness numerically: sum the attributions and compare against F(x) - F(x'). Captum exposes this via the return_convergence_delta=True argument of the attribute method of its IntegratedGradients class, which returns the gap between the two quantities. Deltas larger than a few percent of F(x) - F(x') indicate that the model contains discontinuities or that more steps are required.
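The same diagnostic can be computed by hand. The sketch below measures the completeness gap for several step counts on a toy model with a fairly sharp nonlinearity. The model, its gradient, and the helper name `convergence_delta` are assumptions made for this illustration and are not Captum code.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def F(x):
    # Hypothetical model with a steep transition around x0 + x1 = 1.
    return sigmoid(10.0 * (x[0] + x[1] - 1.0))

def grad_F(x):
    s = F(x)
    return [10.0 * s * (1.0 - s), 10.0 * s * (1.0 - s)]

def integrated_gradients(x, baseline, m):
    n = len(x)
    grad_sum = [0.0] * n
    for k in range(1, m + 1):
        alpha = k / m
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(grad_F(point)):
            grad_sum[i] += g
    return [(xi - b) * s / m for xi, b, s in zip(x, baseline, grad_sum)]

def convergence_delta(x, baseline, m):
    """Absolute gap between summed attributions and F(x) - F(baseline)."""
    attributions = integrated_gradients(x, baseline, m)
    return abs(sum(attributions) - (F(x) - F(baseline)))

x = [1.0, 0.5]
baseline = [0.0, 0.0]
step_counts = [5, 20, 50, 300]
deltas = [convergence_delta(x, baseline, m) for m in step_counts]
for m, d in zip(step_counts, deltas):
    print(m, d)
```

If the gap is still large at a few hundred steps, that is a hint to inspect the model for discontinuities rather than to keep increasing m.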

IG requires the model to be differentiable along the path. For piecewise-linear networks (ReLU, max-pooling) the function is differentiable almost everywhere along the path, and the Riemann sum converges. For non-differentiable operations like argmax, gradient smoothing tricks are needed.

Applications

IG has been deployed across many domains. The table below summarizes representative use cases.

Domain	Model architecture	Use of IG	Reference
Image classification	Inception V3, ResNet, VGG on ImageNet	Pixel-level saliency for class predictions	Sundararajan et al. 2017 ^[1]
NLP / question answering	BERT, T5, RoBERTa	Token-level attribution for answer spans and classifications	Mudrakarta et al. 2018 ^[11]
Tabular finance	MLPs and other differentiable models	Per-feature credit risk explanation	Erion et al. 2021 ^[10]
Healthcare	CNNs for medical imaging, MLPs for EHR	Clinical decision support and bias auditing	Lundberg et al. 2018 ^[12]
Genomics	DeepBind, DeepSEA, BPNet	Identifying regulatory motifs and DNA-binding sites	Avsec et al. 2021 ^[13]
Recommendation systems	Two-tower neural networks	Feature importance for item ranking	Sundararajan et al. 2019 ^[14]
Drug discovery	Graph neural networks	Atom-level attribution for molecular property prediction	Jimenez-Luna et al. 2020 ^[15]
Speech recognition	Wav2Vec, Conformer	Time-frequency attribution on spectrograms	Becker et al. 2018 ^[16]

In image work, IG is usually visualized as a heatmap overlay, with positive attributions in red and negative in blue. The maps tend to be smoother than vanilla saliency because integration averages out local gradient noise. Common practice is to take absolute attributions or sum across color channels.

For NLP tasks, IG attributions to token embeddings are summed across embedding dimensions to give a scalar score per token, visualized as heatmap-colored text. Researchers have used IG to identify spurious correlations in BERT-based models, such as overreliance on stopwords. ^[11]

In genomics, IG and DeepLIFT are used to extract sequence motifs from convolutional networks trained on regulatory tasks. The BPNet paper, for example, uses DeepLIFT contribution scores, a close relative of IG, to identify transcription factor binding motifs in DNA. ^[13]

Variants and extensions

Several extensions of IG have been proposed to address specific limitations.

Expected Gradients (Erion et al. 2021) replaces the single baseline with a distribution sampled from a reference set, often the training data. Averaging IG across these baselines yields the Aumann-Shapley value with respect to the data distribution. The approach reduces baseline sensitivity and can be used as an attribution-guided regularizer during training. ^[10]

Blur Integrated Gradients (Xu et al. 2020) replaces the linear path with a path through progressively blurred versions of the input. The intuition is that a Gaussian-blurred image is a more semantically meaningful "absence of features" baseline than a black image. The resulting attributions are smoother and align better with human perceptual judgments on natural images. ^[9]

Guided Integrated Gradients (Kapishnikov et al. 2021) chooses the integration path adaptively rather than using a straight line. At each step, the path moves only the subset of features with the smallest absolute partial derivatives, rather than all features uniformly, so that it steers away from regions where gradients are large and noisy. Empirically, Guided IG produces sharper and less noisy attributions on image classifiers, at the cost of breaking the strict Aumann-Shapley interpretation. ^[17]

SmoothGrad-Integrated Gradients combines IG with SmoothGrad by averaging IG over multiple noisy versions of the input. ^[18] This reduces visual noise in the saliency map at the cost of additional compute. The combined method is sometimes called Smooth IG or SmoothGrad-IG.

Integrated Hessians (Janizek, Sturmfels, Lee 2021) extends IG to capture pairwise feature interactions by integrating second derivatives along the path. The result is a matrix of pairwise attributions that decomposes the prediction not just into per-feature credit but into per-feature-pair interactions, useful for diagnosing how a model combines features. ^[19]

XRAI (Kapishnikov et al. 2019) combines IG with image segmentation. After computing pixel-level IG, XRAI aggregates attributions over superpixel regions to produce region-level explanations that are more interpretable on natural images than pixel-level heatmaps. ^[20]
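Of the variants above, the SmoothGrad combination is the simplest to sketch: compute IG for several noisy copies of the input and average the results. The toy function, noise level, and sample count below are assumptions chosen for illustration, and the helper name `smooth_ig` is hypothetical.

```python
import random

def F(x):
    return x[0] ** 2 + x[0] * x[1]

def grad_F(x):
    return [2.0 * x[0] + x[1], x[0]]

def integrated_gradients(x, baseline, m=200):
    n = len(x)
    grad_sum = [0.0] * n
    for k in range(1, m + 1):
        alpha = k / m
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(grad_F(point)):
            grad_sum[i] += g
    return [(xi - b) * s / m for xi, b, s in zip(x, baseline, grad_sum)]

def smooth_ig(x, baseline, sigma=0.1, n_samples=25, m=200, seed=0):
    """Average IG over n_samples copies of x perturbed with Gaussian noise."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    total = [0.0] * len(x)
    for _ in range(n_samples):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        for i, a in enumerate(integrated_gradients(noisy, baseline, m)):
            total[i] += a
    return [t / n_samples for t in total]

x = [1.0, 2.0]
baseline = [0.0, 0.0]
plain = integrated_gradients(x, baseline)
smoothed = smooth_ig(x, baseline)
no_noise = smooth_ig(x, baseline, sigma=0.0)  # reduces to plain IG
print(plain, smoothed)
```

The cost scales with n_samples times m gradient evaluations, which is the additional compute mentioned above.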

Limitations and critiques

Despite its theoretical grounding, IG has documented limitations. The most influential critique is the 2018 NeurIPS paper "Sanity Checks for Saliency Maps" by Adebayo and colleagues. ^[21] They proposed two tests: a parameter randomization test, where weights are randomized layer by layer, and a data randomization test, where labels are shuffled. A reliable method should produce attributions that change substantially when the model is randomized. They found that several popular methods, including some IG configurations depending on the baseline, produced visually similar attributions on randomly initialized and trained networks. The conclusion was that visual similarity to edge maps does not imply faithfulness to the model. This sparked more rigorous evaluation methodology across the saliency literature.

Other limitations include:

Path dependence. The straight-line path is one of many. Attributions from path methods depend on the chosen path, and although the linear path is the unique symmetry-preserving one, alternatives (Blur IG, Guided IG) yield different attributions. The choice rests on axiomatic arguments and visual quality, not pure empirics.

Computational cost. A single IG attribution requires 20 to 300 forward and backward passes. For a large transformer with billions of parameters, this can be prohibitive. Approximate methods like sampling-based IG reduce cost.

Baseline sensitivity. As discussed above, attributions can change significantly with the choice of baseline. There is no universally correct baseline, and different fields have developed domain-specific conventions.

Discrete inputs. For categorical or token inputs the partial derivative is undefined. The workaround is to compute IG with respect to the continuous embedding and sum across embedding dimensions. The attribution is then over the embedding, not over the original token.

Limited feature interactions. Standard IG produces an additive decomposition into per-feature contributions. Real models combine features non-additively, and this structure is invisible in standard IG. Integrated Hessians and other extensions address this gap.

Adversarial fragility. Small, visually imperceptible input perturbations that leave the prediction unchanged can produce substantially different IG attributions, suggesting that the maps are not always a stable reflection of what the model responds to. ^[22]

Faithfulness vs intuitiveness. Visually appealing saliency maps are not always faithful to the model. IG occupies a middle ground that is more faithful than vanilla saliency but less so than deletion-based methods.
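The path dependence listed above can be made concrete with a two-feature product. The sketch below integrates gradients along two piecewise-linear paths from the same baseline to the same input: the straight line used by IG, and an axis-aligned path that moves one feature at a time. The function and the helper `path_attributions` are assumptions introduced for this illustration.

```python
def F(x):
    return x[0] * x[1]

def grad_F(x):
    return [x[1], x[0]]

def path_attributions(waypoints, m=200):
    """Integrate gradients along a piecewise-linear path through the waypoints."""
    n = len(waypoints[0])
    attr = [0.0] * n
    for start, end in zip(waypoints, waypoints[1:]):
        for k in range(1, m + 1):
            alpha = k / m  # right-endpoint rule on each segment
            point = [s + alpha * (e - s) for s, e in zip(start, end)]
            g = grad_F(point)
            for i in range(n):
                # Gradient times the displacement of feature i on this step.
                attr[i] += g[i] * (end[i] - start[i]) / m
    return attr

baseline = [0.0, 0.0]
x = [2.0, 3.0]

straight = path_attributions([baseline, x])                 # IG path
axis_first = path_attributions([baseline, [2.0, 0.0], x])   # x0 first, then x1

print(straight, axis_first, F(x) - F(baseline))
```

Both paths account for the full output change of 6, but the straight line splits it roughly evenly while the axis-aligned path assigns all of it to the second feature.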

Comparison with other attribution methods

The following table compares IG with other widely used attribution methods. All of them produce per-feature attributions but differ in how they obtain those scores.

Method	Year	Approach	Implementation invariant	Completeness	Computational cost
Vanilla Saliency	2014	Gradient at input point	Yes	No	1 backward pass
Guided Backprop	2014	Gradient with negative values clipped	No	No	1 backward pass
Class Activation Map (CAM)	2016	Weighted sum of last-layer activations	No	No	1 forward pass
Grad-CAM	2017	Gradient-weighted activation map	Partially	No	1 backward pass
LRP	2015	Layer-wise relevance propagation rules	No	Yes (with constraints)	1 backward pass
DeepLIFT	2017	Reference-based modified backprop	No	Yes	1 backward pass
Integrated Gradients	2017	Path integral of gradients	Yes	Yes	20 to 300 passes
SmoothGrad	2017	Average gradients over noisy inputs	Yes	No	n forward+backward passes
LIME	2016	Local linear model around input	Model agnostic	Approximately	Many forward passes
SHAP (KernelSHAP)	2017	Sampled Shapley values	Model agnostic	Yes	Many forward passes
GradientSHAP	2017	IG averaged over baselines and noise	Yes	Yes	Many passes
Expected Gradients	2021	IG averaged over baseline distribution	Yes	Yes	Many passes
Blur IG	2020	IG along Gaussian blur path	Yes	Yes	20 to 300 passes
Guided IG	2021	IG along adaptively chosen path	Yes	Yes	20 to 300 passes

Saliency map methods (Simonyan et al. 2014) compute the absolute value of dF/dx at the input. They are cheap but suffer from gradient saturation and noise. ^[23]

DeepLIFT (Shrikumar et al. 2017) computes attributions by propagating reference activations backward through the network using rules that depend on the specific layer types. DeepLIFT is faster than IG (one backward pass vs many), but it sacrifices Implementation Invariance. ^[4]

LRP (Bach et al. 2015) propagates relevance backward through the network using conservation rules. Like DeepLIFT, it depends on the specific layer types and so is not implementation invariant. ^[5]

SHAP (Lundberg and Lee 2017) is a unifying framework that estimates the Shapley value of each input feature using sampling or kernel-based techniques. SHAP is model-agnostic, but its sampling cost grows quickly with the number of features. SHAP can also use IG-like techniques internally; the GradientExplainer in the SHAP library implements an IG variant. ^[6]

LIME (Ribeiro et al. 2016) fits a sparse linear model in the neighborhood of the input and uses its coefficients as attributions. LIME is also model-agnostic and works for any classifier, but is more sensitive to neighborhood definition and sampling. ^[24]

SmoothGrad (Smilkov et al. 2017) reduces gradient noise by averaging gradients over many noisy versions of the input. It is often combined with other gradient methods including IG to denoise the attribution maps. ^[18]

Implementations

IG is supported by many widely used interpretability libraries.

Captum is the official PyTorch interpretability library from Meta. Its IntegratedGradients class supports batched integration, multiple baselines, and convergence diagnostics. Captum also includes LayerIntegratedGradients for attributing to intermediate layer activations. ^[25]

TensorFlow Saliency library (Google's PAIR team) provides a NumPy-based implementation that interoperates with TensorFlow models. It also includes XRAI, Blur IG, and Guided IG. ^[26]

Alibi is a Python library focused on machine learning explanation, with IntegratedGradients built on top of TensorFlow and Keras. ^[27]

SHAP library includes a GradientExplainer class that approximates IG by averaging gradients along a path between input and a sampled baseline from a background dataset. It is similar to Expected Gradients in spirit. ^[6]

InterpretML (Microsoft) wraps several interpretability methods including IG-like approaches. ^[28]

All implementations share the same algorithm: take an input and a baseline, generate m interpolated inputs along the linear path, compute gradients at each, and combine them via a Riemann sum scaled by (x - x'). Differences arise in batching, layer-level attribution, and convergence diagnostics. For large transformers, layer-level IG is often more useful than input-level IG because token IDs have no natural baseline. Layer IG attributes to embedding outputs or attention activations, which admit natural zero baselines.

References

Sundararajan, M., Taly, A., & Yan, Q. (2017). "Axiomatic Attribution for Deep Networks." *Proceedings of the 34th International Conference on Machine Learning (ICML)*. https://arxiv.org/abs/1703.01365
Captum documentation. "Integrated Gradients." Meta AI. https://captum.ai/api/integrated_gradients.html
TensorFlow tutorials. "Integrated Gradients." https://www.tensorflow.org/tutorials/interpretability/integrated_gradients
Shrikumar, A., Greenside, P., & Kundaje, A. (2017). "Learning Important Features Through Propagating Activation Differences." *ICML 2017*. https://arxiv.org/abs/1704.02685
Bach, S., Binder, A., Montavon, G., Klauschen, F., Muller, K. R., & Samek, W. (2015). "On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation." *PLOS ONE*. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0130140
Lundberg, S. M., & Lee, S. I. (2017). "A Unified Approach to Interpreting Model Predictions." *NeurIPS 2017*. https://arxiv.org/abs/1705.07874
Aumann, R. J., & Shapley, L. S. (1974). *Values of Non-Atomic Games*. Princeton University Press. https://press.princeton.edu/books/hardcover/9780691645469/values-of-non-atomic-games
Sundararajan, M., & Najmi, A. (2020). "The Many Shapley Values for Model Explanation." *ICML 2020*. https://arxiv.org/abs/1908.08474
Xu, S., Venugopalan, S., & Sundararajan, M. (2020). "Attribution in Scale and Space." *CVPR 2020*. https://arxiv.org/abs/2004.03383
Erion, G., Janizek, J. D., Sturmfels, P., Lundberg, S. M., & Lee, S. I. (2021). "Improving Performance of Deep Learning Models with Axiomatic Attribution Priors and Expected Gradients." *Nature Machine Intelligence*. https://arxiv.org/abs/1906.10670
Mudrakarta, P. K., Taly, A., Sundararajan, M., & Dhamdhere, K. (2018). "Did the Model Understand the Question?" *ACL 2018*. https://arxiv.org/abs/1805.05492
Lundberg, S. M., Nair, B., Vavilala, M. S., et al. (2018). "Explainable Machine-Learning Predictions for the Prevention of Hypoxaemia During Surgery." *Nature Biomedical Engineering*. https://www.nature.com/articles/s41551-018-0304-0
Avsec, Z., Weilert, M., Shrikumar, A., et al. (2021). "Base-Resolution Models of Transcription-Factor Binding Reveal Soft Motif Syntax." *Nature Genetics*. https://www.nature.com/articles/s41588-021-00782-6
Sundararajan, M., Xu, J., Taly, A., Mukund, S., & Najmi, A. (2019). "Exploring Principled Visualizations for Deep Network Attributions." *IUI Workshops*. https://research.google/pubs/pub47800/
Jimenez-Luna, J., Grisoni, F., & Schneider, G. (2020). "Drug Discovery with Explainable Artificial Intelligence." *Nature Machine Intelligence*. https://www.nature.com/articles/s42256-020-00236-4
Becker, S., Ackermann, M., Lapuschkin, S., Muller, K. R., & Samek, W. (2018). "Interpreting and Explaining Deep Neural Networks for Classification of Audio Signals." arXiv preprint. https://arxiv.org/abs/1807.03418
Kapishnikov, A., Venugopalan, S., Avci, B., Wedin, B., Terry, M., & Bolukbasi, T. (2021). "Guided Integrated Gradients: An Adaptive Path Method for Removing Noise." *CVPR 2021*. https://arxiv.org/abs/2106.09788
Smilkov, D., Thorat, N., Kim, B., Viegas, F., & Wattenberg, M. (2017). "SmoothGrad: Removing Noise by Adding Noise." arXiv preprint. https://arxiv.org/abs/1706.03825
Janizek, J. D., Sturmfels, P., & Lee, S. I. (2021). "Explaining Explanations: Axiomatic Feature Interactions for Deep Networks." *Journal of Machine Learning Research*. https://arxiv.org/abs/2002.04138
Kapishnikov, A., Bolukbasi, T., Viegas, F., & Terry, M. (2019). "XRAI: Better Attributions Through Regions." *ICCV 2019*. https://arxiv.org/abs/1906.02825
Adebayo, J., Gilmer, J., Muelly, M., Goodfellow, I., Hardt, M., & Kim, B. (2018). "Sanity Checks for Saliency Maps." *NeurIPS 2018*. https://arxiv.org/abs/1810.03292
Ghorbani, A., Abid, A., & Zou, J. (2019). "Interpretation of Neural Networks is Fragile." *AAAI 2019*. https://arxiv.org/abs/1710.10547
Simonyan, K., Vedaldi, A., & Zisserman, A. (2014). "Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps." *ICLR Workshop*. https://arxiv.org/abs/1312.6034
Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). "Why Should I Trust You? Explaining the Predictions of Any Classifier." *KDD 2016*. https://arxiv.org/abs/1602.04938
Captum library on GitHub. https://github.com/pytorch/captum
PAIR-code Saliency library on GitHub. https://github.com/PAIR-code/saliency
Alibi library documentation. "Integrated Gradients." Seldon. https://docs.seldon.io/projects/alibi/en/stable/methods/IntegratedGradients.html
InterpretML on GitHub. https://github.com/interpretml/interpret
Molnar, C. (2022). *Interpretable Machine Learning: A Guide for Making Black Box Models Explainable*. Chapter on gradient-based attribution. https://christophm.github.io/interpretable-ml-book/
Ancona, M., Ceolini, E., Oztireli, C., & Gross, M. (2018). "Towards Better Understanding of Gradient-Based Attribution Methods for Deep Neural Networks." *ICLR 2018*. https://arxiv.org/abs/1711.06104
