# DeepLIFT

> Source: https://aiwiki.ai/wiki/deeplift
> Updated: 2026-04-30
> Categories: Deep Learning, Interpretability, Machine Learning
> License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
> From AI Wiki (https://aiwiki.ai), the free encyclopedia of artificial intelligence. Reuse freely with attribution to "AI Wiki (aiwiki.ai)".

**DeepLIFT** (Deep Learning Important FeaTures) is a [feature attribution](/wiki/feature_attribution) method for deep neural networks introduced by [Avanti Shrikumar](/wiki/avanti_shrikumar), [Peyton Greenside](/wiki/peyton_greenside), and [Anshul Kundaje](/wiki/anshul_kundaje) at [Stanford University](/wiki/stanford_university) in 2017 [1]. It assigns each input feature a contribution score by comparing every neuron's activation to a reference activation from a baseline input, then propagating these "differences from reference" backward through the network using modified chain rules. DeepLIFT addresses two failure modes of gradient-based saliency methods, namely gradient saturation and the discontinuities introduced by piecewise-linear activations, and it became a core influence on the [SHAP](/wiki/shap) library's `DeepSHAP` variant and on the broader [interpretability](/wiki/interpretability) literature [2].

DeepLIFT belongs to a 2017 wave of [attribution methods](/wiki/attribution_methods) that put deep network interpretability on more principled footing. Its single-backward-pass formulation makes it efficient compared to path-integral methods such as [Integrated Gradients](/wiki/integrated_gradients), and its reference baseline made it especially well suited to [genomics](/wiki/genomics) and other domains where a "null" input has clear scientific meaning. The method was presented at [ICML](/wiki/icml) 2017 with the reference implementation hosted in the Kundaje lab GitHub repository [3].

## Background

Before DeepLIFT, the dominant tools for explaining predictions of deep [convolutional neural networks](/wiki/convolutional_neural_network) were gradient-based saliency techniques. Simonyan, Vedaldi, and Zisserman introduced the [saliency map](/wiki/saliency_map) in 2014 by visualizing the gradient of a class score with respect to input pixels [4]. The intuition is reasonable: if a small perturbation to a pixel changes the output a lot, that pixel is locally important. Vanilla gradients have two well-documented problems.

The first is **gradient saturation**. Sigmoid and tanh at large input magnitudes, and [ReLU](/wiki/relu) for negative inputs, have flat regions where the local gradient is near zero ([sigmoid](/wiki/sigmoid_function) is the classic case). A firmly saturated neuron still carries information about the prediction, yet its gradient assigns no importance to the inputs driving it. The second is **thresholding discontinuities**. Piecewise-linear functions such as ReLU have gradients that jump abruptly at the threshold, so attribution scores can swing sharply between nearly identical inputs.
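The saturation problem is easy to reproduce on a toy function. The sketch below is illustrative only: `toy_net`, the all-zero reference, and the helper names are assumptions, not part of any library. It estimates the gradient numerically at a saturated input and compares it with the output change relative to the reference.

```python
def relu(z):
    return max(0.0, z)

def toy_net(x1, x2):
    """y = 1 - relu(1 - (x1 + x2)): grows with x1 + x2, then saturates at 1."""
    return 1.0 - relu(1.0 - (x1 + x2))

def numerical_gradient(f, x1, x2, eps=1e-6):
    """Central-difference estimate of the gradient of f at (x1, x2)."""
    g1 = (f(x1 + eps, x2) - f(x1 - eps, x2)) / (2 * eps)
    g2 = (f(x1, x2 + eps) - f(x1, x2 - eps)) / (2 * eps)
    return g1, g2

x = (1.0, 1.0)       # x1 + x2 = 2, well inside the saturated region
x_ref = (0.0, 0.0)   # hypothetical all-zero reference

grad = numerical_gradient(toy_net, *x)
delta_out = toy_net(*x) - toy_net(*x_ref)

print("gradient at x:", grad)
print("output change relative to reference:", delta_out)
```

The gradient is zero for both inputs even though moving from the reference to `x` changes the output by a full unit, which is the gap that difference-from-reference methods are meant to close.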

A different lineage, exemplified by **Layer-wise Relevance Propagation** ([LRP](/wiki/lrp)) from Bach et al. 2015, treated attribution as a backward flow of conserved relevance through the network [5]. LRP's per-layer rules yielded more faithful explanations than raw gradients but lacked a clean link to a baseline-relative contribution. DeepLIFT combined the propagation viewpoint of LRP with an explicit reference activation, giving it a clean conservation property analogous to the completeness axiom Sundararajan, Taly, and Yan formalized for Integrated Gradients in the same year [6].

## Core idea

Let `f` be a neural network and let `x` be an input whose prediction we want to explain. DeepLIFT requires the analyst to choose a **reference input** `x'`, also called the baseline, meant to encode the absence of meaningful signal: an all-zero image, the training set mean, an all-padding token sequence, or a shuffled DNA sequence.

For any neuron `t`, write `t(x)` for its activation on the input and `t(x')` for its activation on the reference. The activation difference is

`Delta_t = t(x) - t(x')`.

DeepLIFT's central object is the **contribution score** `C(Delta_x_i, Delta_t)`, representing how much the input feature change `Delta_x_i = x_i - x'_i` contributes to the change in target `Delta_t`. The defining property is **summation-to-delta**, also called completeness:

`sum_i C(Delta_x_i, Delta_t) = Delta_t`.

Contribution scores assigned to input features add up exactly to the change in output activation between input and reference. This is the same conservation idea that LRP imposes by construction and that Integrated Gradients states as its completeness axiom [6].

Closely related is the **multiplier**

`m(Delta_x, Delta_t) = C(Delta_x, Delta_t) / Delta_x`,

the difference-from-reference analog of a partial derivative. Where the gradient measures `partial t / partial x` at a single point, the multiplier measures the average rate of change of `t` along the displacement from `x'` to `x`. Because it is defined over a finite difference, it does not vanish in saturated regions. DeepLIFT exploits a discrete chain rule

`m(Delta_x, Delta_y) = sum_h m(Delta_x, Delta_h) * m(Delta_h, Delta_y)`

to propagate multipliers backward through hidden units `h`. A single backward pass recovers contribution scores for every input feature, which makes DeepLIFT cheap relative to path-integral methods [1].
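To make the multiplier chain rule concrete, here is a minimal sketch on a hypothetical three-step network (`z = 1 - x1 - x2`, `h = relu(z)`, `y = 1 - h`). The network, the reference, and the function names are assumptions chosen for illustration. It multiplies the per-layer multipliers together and checks summation-to-delta.

```python
def relu(z):
    return max(0.0, z)

def forward(x1, x2):
    """Return every intermediate activation of the toy network."""
    z = 1.0 - x1 - x2      # linear layer
    h = relu(z)            # nonlinearity
    y = 1.0 - h            # linear output
    return z, h, y

def deeplift_contributions(x, x_ref):
    z, h, y = forward(*x)
    z0, h0, y0 = forward(*x_ref)

    # Linear layers: the multiplier is the weight.
    m_x_z = [-1.0, -1.0]
    m_h_y = -1.0

    # Nonlinearity: secant slope Delta_h / Delta_z, with the local
    # ReLU derivative at the reference as a fallback when Delta_z == 0.
    dz, dh = z - z0, h - h0
    m_z_h = dh / dz if dz != 0 else (1.0 if z0 > 0 else 0.0)

    # Chain rule for multipliers, then scale by each input difference.
    multipliers = [m * m_z_h * m_h_y for m in m_x_z]
    contribs = [m * (xi - ri) for m, xi, ri in zip(multipliers, x, x_ref)]
    return contribs, y - y0

contribs, delta_y = deeplift_contributions((1.0, 1.0), (0.0, 0.0))
print("contributions:", contribs)
print("sum:", sum(contribs), "Delta_y:", delta_y)
```

Each input receives a contribution of 0.5, and the two scores sum to the output difference of 1, even though the ordinary gradient at this input is zero.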

## Propagation rules

DeepLIFT specifies three propagation rules that together cover layer types in standard convolutional and feedforward architectures.

### Linear rule

For linear layers, including fully connected layers and convolutions, the pre-activation difference is `Delta_z = sum_i w_i * Delta_x_i`. The contribution of input difference `Delta_x_i` is exactly `w_i * Delta_x_i`, and the multiplier is simply the weight `w_i`. Linearity guarantees both completeness and a unique decomposition, so the linear rule reduces DeepLIFT to ordinary backpropagation along weight matrices. The full formulation in [1] also tracks the positive and negative parts of `Delta_x` separately, so activation suppression can be followed alongside activation increase.
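As a small worked instance of the linear rule, the following sketch uses made-up weights, bias, input, and reference. It shows that the bias cancels in the difference and that the per-input contributions sum to the pre-activation difference.

```python
def linear_rule(weights, x, x_ref):
    """Contribution of each input difference to the pre-activation difference."""
    return [w * (xi - ri) for w, xi, ri in zip(weights, x, x_ref)]

# Hypothetical layer parameters and inputs, chosen only for illustration.
weights, bias = [2.0, -1.0, 0.5], 0.25
x, x_ref = [1.0, 3.0, 4.0], [0.0, 1.0, 2.0]

z = sum(w * xi for w, xi in zip(weights, x)) + bias
z_ref = sum(w * ri for w, ri in zip(weights, x_ref)) + bias

contribs = linear_rule(weights, x, x_ref)
print("contributions:", contribs)
print("sum:", sum(contribs), "Delta_z:", z - z_ref)
```

The bias appears in both `z` and `z_ref`, so it drops out of `Delta_z` and receives no contribution.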

### Rescale rule

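The rescale rule applies to elementwise nonlinear activations such as ReLU, sigmoid, and tanh. A [softmax](/wiki/softmax) output is not elementwise; for it the paper recommends computing contributions to the logits that feed the softmax rather than to the softmax output itself [1]. For a nonlinearity `y = g(x)`, the multiplier is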

`m(Delta_x, Delta_y) = Delta_y / Delta_x`,

the slope of the secant line through `(x', g(x'))` and `(x, g(x))`. This finite-difference slope is well defined whenever `Delta_x` is nonzero, and it is precisely the quantity vanilla gradients fail to capture in saturated regions. When `Delta_x` is zero, the rule falls back to the local derivative `g'(x')`.
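A minimal sketch of the rescale multiplier for a sigmoid is shown below. The function names and the small tolerance `eps` are assumptions: the text above specifies the fallback only for `Delta_x` equal to zero, and the tolerance is added here purely for numerical stability.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def rescale_multiplier(g, g_prime, x, x_ref, eps=1e-7):
    """Secant slope Delta_y / Delta_x, with a derivative fallback near Delta_x = 0."""
    dx = x - x_ref
    if abs(dx) < eps:
        return g_prime(x_ref)
    return (g(x) - g(x_ref)) / dx

x, x_ref = 10.0, 0.0   # the sigmoid is strongly saturated at x = 10
m = rescale_multiplier(sigmoid, sigmoid_grad, x, x_ref)

print("local gradient at x:", sigmoid_grad(x))
print("rescale multiplier: ", m)
print("contribution m * Delta_x:", m * (x - x_ref))
```

At the saturated input the local gradient is tiny, while the secant slope stays of order `0.05`, and multiplying it by `Delta_x` recovers the full output difference `sigmoid(10) - sigmoid(0)`.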

### RevealCancel rule

The rescale rule applies a single multiplier to the whole input difference, which can hide cancellations. Consider a ReLU whose input is the sum of a strong positive contribution `+5` and a strong negative contribution `-4`. The net input difference is `+1`, and rescale attributes the output change to that net flow alone. RevealCancel instead splits `Delta_x` into positive and negative parts and assigns each part the average effect it has on the output when it is added before and after the other part, so apparent insensitivity to a feature can be revealed as a cancellation of two competing effects. This matters most for networks with large positive-negative interactions and for genomics models with long-range additive structure [1].
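The `+5` / `-4` example can be worked through numerically. The sketch below follows the averaging formula for RevealCancel as described in [1], applied to a single ReLU with a reference pre-activation of zero; the reference value and the function name `reveal_cancel` are assumptions made for illustration.

```python
def relu(z):
    return max(0.0, z)

def reveal_cancel(f, x_ref, dx_pos, dx_neg):
    """Split the output difference into parts caused by dx_pos and dx_neg.

    Each part is the average of its effect when added first (on top of the
    reference) and when added second (on top of the other part).
    """
    dy_pos = 0.5 * ((f(x_ref + dx_pos) - f(x_ref))
                    + (f(x_ref + dx_neg + dx_pos) - f(x_ref + dx_neg)))
    dy_neg = 0.5 * ((f(x_ref + dx_neg) - f(x_ref))
                    + (f(x_ref + dx_pos + dx_neg) - f(x_ref + dx_pos)))
    return dy_pos, dy_neg

x_ref, dx_pos, dx_neg = 0.0, 5.0, -4.0
dy_pos, dy_neg = reveal_cancel(relu, x_ref, dx_pos, dx_neg)

# Separate multipliers for the positive and negative parts.
m_pos, m_neg = dy_pos / dx_pos, dy_neg / dx_neg

print("Delta_y from positive part:", dy_pos)
print("Delta_y from negative part:", dy_neg)
print("total:", dy_pos + dy_neg)
```

Here the positive part is credited with `3.0` and the negative part with `-2.0`, which still sum to the net output change of `1.0`. Under the rescale rule both parts would share the single multiplier `1.0`.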

## Properties

DeepLIFT was designed to satisfy a small set of properties the authors argue are minimal requirements for a faithful attribution method.

**Completeness (summation-to-delta).** Contribution scores sum exactly to the change in output activation between input and reference. This is the same property that Sundararajan, Taly, and Yan proved for Integrated Gradients [6] and that LRP enforces by construction.

**Sensitivity.** If the network's output differs between `x` and `x'` only because of a single feature, that feature should receive nonzero contribution. Vanilla gradients can fail this in saturated regions, while DeepLIFT's finite-difference formulation preserves it.

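**Implementation invariance (not guaranteed).** An attribution method is implementation invariant if two networks computing the same function always receive the same attributions. DeepLIFT does not guarantee this: Sundararajan, Taly, and Yan showed that the chain rule does not hold in general for finite-difference multipliers, so functionally equivalent networks with different internal structure can receive different DeepLIFT scores [6].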

**Efficiency.** A single backward pass computes attributions, in contrast to path-integral methods that require dozens to hundreds of gradient evaluations.

**Reference dependence.** DeepLIFT scores are explicitly relative to a baseline. This is a feature when the analyst has a clear notion of "absence," and a problem when no obvious baseline exists.
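Whether attributions depend on a network's internal structure can be checked by hand. The sketch below applies the rescale rule to two small ReLU networks, `net_f` and `net_g`, that compute the same function but factor it differently. The networks are modeled on the kind of counterexample discussed in [6]; the helper names and the chosen input are assumptions.

```python
def relu(z):
    return max(0.0, z)

def secant(pre, pre_ref):
    """Rescale multiplier for a ReLU, with a local-slope fallback."""
    d = pre - pre_ref
    if d == 0.0:
        return 1.0 if pre_ref > 0 else 0.0
    return (relu(pre) - relu(pre_ref)) / d

def net_f(x1, x2):
    return relu(relu(x1) - 1.0 - relu(x2))

def net_g(x1, x2):
    return relu(relu(relu(x1) - 1.0) - relu(x2))

def deeplift_f(x, ref):
    m1, m2 = secant(x[0], ref[0]), secant(x[1], ref[1])
    pre = relu(x[0]) - 1.0 - relu(x[1])
    pre_ref = relu(ref[0]) - 1.0 - relu(ref[1])
    m_out = secant(pre, pre_ref)
    return [m1 * m_out * (x[0] - ref[0]),
            m2 * (-1.0) * m_out * (x[1] - ref[1])]

def deeplift_g(x, ref):
    m1, m2 = secant(x[0], ref[0]), secant(x[1], ref[1])
    a, a_ref = relu(x[0]) - 1.0, relu(ref[0]) - 1.0
    m_h = secant(a, a_ref)                      # extra ReLU inside net_g
    pre = relu(a) - relu(x[1])
    pre_ref = relu(a_ref) - relu(ref[1])
    m_out = secant(pre, pre_ref)
    return [m1 * m_h * m_out * (x[0] - ref[0]),
            m2 * (-1.0) * m_out * (x[1] - ref[1])]

x, ref = [3.0, 1.0], [0.0, 0.0]
scores_f = deeplift_f(x, ref)
scores_g = deeplift_g(x, ref)
print("net_f scores:", scores_f)
print("net_g scores:", scores_g)
```

Both score vectors sum to the same output difference of 1, so completeness holds in each case, yet `net_f` yields roughly `[1.5, -0.5]` and `net_g` roughly `[2.0, -1.0]` for the same input and reference.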

## Choice of reference

The reference is the most consequential choice in a DeepLIFT analysis. Because scores are defined as differences from baseline, the same model can produce very different attributions depending on what the analyst calls "absence" of signal.

- **Zero reference.** The all-zero vector works when zero genuinely means "nothing here," such as one-hot encoded categorical features. It is a poor choice for natural images, where an all-black image is itself out-of-distribution.
- **Mean reference.** Replacing each feature with its training-set mean produces a baseline close to the data manifold. For images, this is typically a blurry, low-contrast average image.
- **Random reference.** Noise drawn from a fixed distribution, useful when no canonical baseline exists.
- **Multiple references.** Averaging DeepLIFT scores over a distribution of references gives an expected-attribution variant related to [Expected Gradients](/wiki/expected_gradients) [7]. This is the core of `DeepSHAP`, where the reference distribution is chosen so averaged scores approximate Shapley values.
- **Domain-specific references.** In NLP, padding or `[MASK]` tokens often serve as natural baselines. In genomics, dinucleotide-shuffled sequences preserve local composition while destroying motif structure, providing a meaningful null. The Kundaje lab's tooling, including [TF-MoDISco](/wiki/tf_modisco), averages attributions over many shuffled references to stabilize motif discovery [8].
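Averaging over several references, as in the multiple-reference and shuffled-sequence strategies above, can be sketched on a toy model. Everything here is an assumption made for illustration: the model `1 - relu(1 - sum(x))`, the four hand-picked references, and the function names. It is not the DeepSHAP or TF-MoDISco implementation.

```python
def relu(z):
    return max(0.0, z)

def model(x):
    return 1.0 - relu(1.0 - sum(x))

def contributions(x, x_ref):
    """Rescale-rule DeepLIFT scores for `model` against one reference."""
    z, z_ref = 1.0 - sum(x), 1.0 - sum(x_ref)
    dz = z - z_ref
    if dz == 0.0:
        slope = 1.0 if z_ref > 0 else 0.0
    else:
        slope = (relu(z) - relu(z_ref)) / dz
    # Multipliers: -1 for x -> z, slope for z -> h, -1 for h -> y.
    return [(-1.0) * slope * (-1.0) * (xi - ri) for xi, ri in zip(x, x_ref)]

def averaged_contributions(x, references):
    """Mean of the per-reference contribution vectors."""
    per_ref = [contributions(x, r) for r in references]
    n = len(per_ref)
    return [sum(c[i] for c in per_ref) / n for i in range(len(x))]

x = [1.0, 1.0]
references = [[0.0, 0.0], [0.5, 0.0], [0.0, 0.5], [0.25, 0.25]]

avg = averaged_contributions(x, references)
expected_delta = model(x) - sum(model(r) for r in references) / len(references)

print("averaged contributions:", avg)
print("sum:", sum(avg), "f(x) - mean f(reference):", expected_delta)
```

Because completeness holds for each reference separately, the averaged scores sum to the difference between the model output on `x` and the mean output over the references.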

## Comparison with related methods

| Method | Year | Core mechanism | Baseline | Cost | Saturation |
|---|---|---|---|---|---|
| Saliency map | 2014 | Vanilla gradient at input | No | One backward pass | Fails |
| LRP | 2015 | Conservation rules per layer | Implicit | One backward pass | Partial |
| LIME | 2016 | Local surrogate model | Perturbations | Many forward passes | Indirect |
| DeepLIFT | 2017 | Multipliers vs reference | Yes, explicit | One backward pass | Robust |
| Integrated Gradients | 2017 | Path integral of gradients | Yes, explicit | Many forward and backward passes | Robust |
| SHAP / DeepSHAP | 2017 | Shapley values, DeepLIFT-based | Reference distribution | Many backward passes | Robust |
| Grad-CAM | 2017 | Weighted feature-map gradients | No | One backward pass | Spatial only |
| SmoothGrad | 2017 | Averaged gradients over noisy inputs | No | Many forward and backward passes | Indirect |

The closest relative is [Integrated Gradients](/wiki/integrated_gradients). Both satisfy completeness and require an explicit baseline. They differ in scope and cost: Integrated Gradients applies to any differentiable model and integrates gradients along a straight-line path from baseline to input, typically requiring fifty to two hundred forward and backward evaluations, while DeepLIFT needs a propagation rule for each layer type but replaces the path integral with a single chain-rule pass using secant-line slopes. The two agree exactly in linear models and closely on many nonlinear architectures, especially when DeepLIFT is averaged over a small reference distribution.
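The cost difference can be illustrated on a single sigmoid unit, a case simple enough that the two methods coincide analytically. In the sketch below, the weights `W`, the input, the step count, and all function names are assumptions. Integrated Gradients is approximated by a midpoint Riemann sum, and DeepLIFT uses one secant slope.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

W = [1.5, -2.0]   # hypothetical weights of a single sigmoid unit

def model(x):
    return sigmoid(sum(w * xi for w, xi in zip(W, x)))

def gradient(x):
    s = model(x)
    return [s * (1.0 - s) * w for w in W]

def integrated_gradients(x, x_ref, steps=200):
    """Midpoint Riemann-sum approximation of IG on the straight-line path."""
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [r + alpha * (xi - r) for xi, r in zip(x, x_ref)]
        for i, g in enumerate(gradient(point)):
            total[i] += g
    return [(xi - r) * t / steps for xi, r, t in zip(x, x_ref, total)]

def deeplift_rescale(x, x_ref):
    """One-pass DeepLIFT: linear rule, then a secant slope (assumes z != z_ref)."""
    z = sum(w * xi for w, xi in zip(W, x))
    z_ref = sum(w * r for w, r in zip(W, x_ref))
    slope = (sigmoid(z) - sigmoid(z_ref)) / (z - z_ref)
    return [w * slope * (xi - r) for w, xi, r in zip(W, x, x_ref)]

x, x_ref = [2.0, 0.5], [0.0, 0.0]
ig = integrated_gradients(x, x_ref)
dl = deeplift_rescale(x, x_ref)
print("Integrated Gradients:", ig)
print("DeepLIFT (rescale):  ", dl)
```

The approximation above spends 200 gradient evaluations to reach what the secant slope gives in one step. On deeper networks the two methods are no longer guaranteed to match exactly.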

The relationship to [SHAP](/wiki/shap) is tighter. Lundberg and Lee's 2017 paper introduced SHAP as a unifying view grounded in [Shapley values](/wiki/shapley_value) and showed that DeepLIFT's propagation rules can be reinterpreted as an efficient approximation of Shapley values when averaged over a reference distribution [2]. The resulting algorithm, `DeepSHAP`, is among the most widely used explainers for deep models. [Grad-CAM](/wiki/grad_cam) [15] and [SmoothGrad](/wiki/smoothgrad) [14] target different problems: Grad-CAM produces coarse spatial heatmaps from CNN feature maps, while SmoothGrad denoises vanilla gradients by averaging over Gaussian-perturbed inputs.

## Implementations

The original DeepLIFT implementation is the `kundajelab/deeplift` repository, written by Avanti Shrikumar and maintained by the Kundaje lab at Stanford [3]. Early versions were built on [Theano](/wiki/theano) with a [Keras](/wiki/keras) frontend; the codebase later moved to Keras-on-TensorFlow. It includes the linear, rescale, and RevealCancel rules along with utilities for reference distributions and integration with TF-MoDISco.

[Captum](/wiki/captum), Meta's open-source [PyTorch](/wiki/pytorch) interpretability library, ships `DeepLift` and `DeepLiftShap` classes following the same specification [12]. The SHAP library exposes DeepLIFT-derived methods through its `DeepExplainer` class [13], which uses backward hooks to override the gradients of nonlinear layers with DeepLIFT's secant-line slopes and averages attributions over a reference distribution to approximate Shapley values. This is how most practitioners interact with DeepLIFT today, often without realizing the underlying engine is Shrikumar's algorithm.

Because DeepLIFT modifies the backward pass for every nonlinearity, every new layer type needs an explicit propagation rule. Standard libraries cover convolutions, dense layers, ReLU, sigmoid, tanh, max-pooling, and batch normalization. Less common operations such as attention and layer normalization need careful adaptation or fall back to the local gradient.

## Applications

### Regulatory genomics

DeepLIFT's most consequential application is in [regulatory genomics](/wiki/regulatory_genomics). Convolutional networks trained on DNA sequences can predict whether a sequence will be bound by a [transcription factor](/wiki/transcription_factor), lie in accessible chromatin, or be transcriptionally active. Biologists want to know which subsequences drove a positive call, since those correspond to candidate binding motifs. DeepLIFT contribution scores per nucleotide, averaged over shuffled-sequence references, provide this information directly.

[TF-MoDISco](/wiki/tf_modisco), introduced by Shrikumar and colleagues in a technical note first posted in 2018 and revised through 2020, builds on DeepLIFT by clustering high-contribution windows ("seqlets") across many input sequences into recurring motif patterns [8]. It has been used to rediscover known binding sites and propose new ones from models trained on chromatin accessibility and ChIP-seq data. [BPNet](/wiki/bpnet), from Avsec and colleagues in 2021, is a base-pair-resolution prediction model that pairs convolutional architectures with DeepLIFT and TF-MoDISco interpretation [9], and has become a standard design for sequence-to-profile genomic models.

### Other domains

In image classification, DeepLIFT remains a standard point of comparison but has been largely displaced by Integrated Gradients, Grad-CAM, and Shapley variants. In NLP, attention rollout, Integrated Gradients, and attribution patching are more widely used for [transformer](/wiki/transformer) architectures. In healthcare and tabular prediction, DeepLIFT and DeepSHAP remain common choices for explaining feedforward and [recurrent](/wiki/recurrent_neural_network) models, especially when regulators expect a Shapley-style decomposition.

## Variants

- **DeepSHAP / DeepLiftShap.** The most widely used variant. Combines DeepLIFT's propagation with averaging over a reference distribution to approximate Shapley values [2].
- **Expected DeepLIFT.** DeepLIFT averaged over a stochastic baseline distribution, closely related to Erion et al.'s Expected Gradients [7].
- **TF-MoDISco.** A downstream tool that clusters DeepLIFT contribution tracks into motif patterns; it forms the standard interpretation stack for sequence models in regulatory genomics [8].
- **Recurrent variants.** DeepLIFT rules have been adapted for recurrent networks by unrolling the recurrence and treating each timestep's hidden state as a separate layer.
- **Transformer extensions.** Some authors have adapted DeepLIFT-style propagation to transformer blocks by treating attention as a soft routing operation. These adaptations are less standardized and have not displaced attention-based or path-integral methods.

## Limitations

**Reference sensitivity.** Different baselines produce different attributions, sometimes dramatically. There is no universal recipe for choosing a reference, and the choice encodes the analyst's prior about what "absence of signal" means.

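**Implementation overhead.** Because DeepLIFT modifies the backward pass for every nonlinearity, it requires per-layer propagation rules. Affine operations such as inference-time batch normalization and residual additions are covered by the linear rule, but architectures with custom activations, attention, layer normalization, or other multiplicative interactions need explicit rule choices.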

**Sanity check failures.** Adebayo et al. 2018 ran sanity checks comparing attribution maps before and after randomizing model weights [10]. Several attribution methods, most notably Guided Backpropagation and Guided Grad-CAM, produced visually similar maps for trained and randomized networks, and maps from methods that multiply by the input, such as Integrated Gradients, stayed visually dominated by input structure. DeepLIFT was not among the methods evaluated in that study, but its scores also scale with the input difference `Delta_x`, so similar caution applies under some baseline choices. The result has not invalidated DeepLIFT in well-defined domains such as genomics, where attributions are validated against known biological motifs, but it sharpened community caution about overinterpreting saliency outputs.

**Less popular than IG in modern interpretability.** Integrated Gradients is now the more common choice in vision and NLP work, partly because it applies to any differentiable model without per-layer rules. DeepLIFT remains dominant in genomics and in DeepSHAP-based production pipelines.

## Reception and historical context

The ICML 2017 DeepLIFT paper has accumulated several thousand citations and is a foundational reference for [explainable AI](/wiki/explainable_ai) in [deep learning](/wiki/deep_learning). Its influence runs along three lines: it set the template for difference-from-reference attribution that SHAP later generalized to a Shapley-value framework; it became the standard interpretation tool in regulatory genomics through TF-MoDISco and BPNet; and it sits alongside Integrated Gradients and LRP in nearly every survey of attribution methods. Outside genomics, the footprint shows up most strongly through SHAP's `DeepExplainer`.

The year 2017 was a watershed for attribution. Integrated Gradients appeared in March [6], the revised DeepLIFT paper in April [1] (building on a 2016 preprint, "Not Just a Black Box" [11]), and SHAP in May [2], retrospectively casting DeepLIFT, LIME, and LRP as members of a single family of additive feature attribution methods unified by Shapley values. The Kundaje lab was a fertile environment for DeepLIFT because it sits at the intersection of deep learning and computational biology, and its work on convolutional models for chromatin accessibility and transcription-factor binding created a constant need for interpretation tools that could surface biologically meaningful patterns.

## See also

- [Integrated Gradients](/wiki/integrated_gradients)
- [SHAP](/wiki/shap)
- [LRP](/wiki/lrp)
- [LIME](/wiki/lime)
- [Saliency map](/wiki/saliency_map)
- [Grad-CAM](/wiki/grad_cam)
- [SmoothGrad](/wiki/smoothgrad)
- [Feature attribution](/wiki/feature_attribution)
- [Interpretability](/wiki/interpretability)
- [Explainable AI](/wiki/explainable_ai)
- [TF-MoDISco](/wiki/tf_modisco)
- [BPNet](/wiki/bpnet)
- [Axiomatic attribution](/wiki/axiomatic_attribution)
- [Expected Gradients](/wiki/expected_gradients)
- [Shapley value](/wiki/shapley_value)

## References

1. Shrikumar, A., Greenside, P., & Kundaje, A. (2017). Learning Important Features Through Propagating Activation Differences. *ICML 2017*. arXiv:1704.02685. https://arxiv.org/abs/1704.02685
2. Lundberg, S. M., & Lee, S.-I. (2017). A Unified Approach to Interpreting Model Predictions. *NeurIPS 2017*. arXiv:1705.07874. https://arxiv.org/abs/1705.07874
3. Kundaje Lab. DeepLIFT reference implementation. GitHub. https://github.com/kundajelab/deeplift
4. Simonyan, K., Vedaldi, A., & Zisserman, A. (2014). Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps. *ICLR Workshop*. arXiv:1312.6034. https://arxiv.org/abs/1312.6034
5. Bach, S., Binder, A., Montavon, G., Klauschen, F., Müller, K.-R., & Samek, W. (2015). On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation. *PLOS ONE*, 10(7), e0130140. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0130140
6. Sundararajan, M., Taly, A., & Yan, Q. (2017). Axiomatic Attribution for Deep Networks. *ICML 2017*. arXiv:1703.01365. https://arxiv.org/abs/1703.01365
7. Erion, G., Janizek, J. D., Sturmfels, P., Lundberg, S. M., & Lee, S.-I. (2021). Improving performance of deep learning models with axiomatic attribution priors and expected gradients. *Nature Machine Intelligence*, 3, 620-631. https://arxiv.org/abs/1906.10670
8. Shrikumar, A., Tian, K., Avsec, Ž., Shcherbina, A., Banerjee, A., Sharmin, M., Nair, S., & Kundaje, A. (2020). Technical Note on TF-MoDISco. arXiv:1811.00416. https://arxiv.org/abs/1811.00416
9. Avsec, Ž., Weilert, M., Shrikumar, A., et al. (2021). Base-resolution models of transcription-factor binding reveal soft motif syntax. *Nature Genetics*, 53, 354-366. https://www.nature.com/articles/s41588-021-00782-6
10. Adebayo, J., Gilmer, J., Muelly, M., Goodfellow, I., Hardt, M., & Kim, B. (2018). Sanity Checks for Saliency Maps. *NeurIPS 2018*. arXiv:1810.03292. https://arxiv.org/abs/1810.03292
11. Shrikumar, A., Greenside, P., Shcherbina, A., & Kundaje, A. (2016). Not Just a Black Box: Learning Important Features Through Propagating Activation Differences. arXiv:1605.01713. https://arxiv.org/abs/1605.01713
12. Captum: Model Interpretability for PyTorch. Meta AI. https://captum.ai/
13. SHAP (SHapley Additive exPlanations) Library. GitHub. https://github.com/shap/shap
14. Smilkov, D., Thorat, N., Kim, B., Viégas, F., & Wattenberg, M. (2017). SmoothGrad: removing noise by adding noise. arXiv:1706.03825. https://arxiv.org/abs/1706.03825
15. Selvaraju, R. R., et al. (2017). Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization. *ICCV 2017*. arXiv:1610.02391. https://arxiv.org/abs/1610.02391