DeepLIFT (Deep Learning Important FeaTures) is a feature attribution method for deep neural networks introduced by Avanti Shrikumar, Peyton Greenside, and Anshul Kundaje at Stanford University in 2017 [1]. It assigns each input feature a contribution score by comparing every neuron's activation to a reference activation from a baseline input, then propagating these "differences from reference" backward through the network using modified chain rules. DeepLIFT addresses two failure modes of gradient-based saliency methods, namely gradient saturation and the discontinuities introduced by piecewise-linear activations, and it became a core influence on the SHAP library's DeepSHAP variant and on the broader interpretability literature [2].
DeepLIFT belongs to a 2017 wave of attribution methods that put deep network interpretability on more principled footing. Its single-backward-pass formulation makes it efficient compared to path-integral methods such as Integrated Gradients, and its reference baseline made it especially well suited to genomics and other domains where a "null" input has clear scientific meaning. The method was presented at ICML 2017 with the reference implementation hosted in the Kundaje lab GitHub repository [3].
Before DeepLIFT, the dominant tools for explaining predictions of deep convolutional neural networks were gradient-based saliency techniques. Simonyan, Vedaldi, and Zisserman introduced the saliency map in 2014 by visualizing the gradient of a class score with respect to input pixels [4]. The intuition is reasonable: if a small perturbation to a pixel changes the output a lot, that pixel is locally important. But vanilla gradients have two well-documented problems.
The first is gradient saturation. Activation functions including sigmoid, tanh, and ReLU on the negative side have flat regions where the local gradient is near zero. A firmly saturated neuron still carries information about the prediction, yet its gradient assigns no importance to the inputs driving it. The second is thresholding discontinuities. Piecewise-linear functions such as ReLU produce gradients that flip abruptly at zero, causing attribution scores to swing wildly between nearly identical inputs.
A different lineage, exemplified by Layer-wise Relevance Propagation (LRP) from Bach et al. 2015, treated attribution as a backward flow of conserved relevance through the network [5]. LRP's per-layer rules yielded more faithful explanations than raw gradients but lacked a clean link to a baseline-relative contribution. DeepLIFT combined the propagation viewpoint of LRP with an explicit reference activation, giving it a clean conservation property analogous to the completeness axiom Sundararajan, Taly, and Yan formalized for Integrated Gradients in the same year [6].
Let f be a neural network and let x be an input whose prediction we want to explain. DeepLIFT requires the analyst to choose a reference input x', also called the baseline, meant to encode the absence of meaningful signal: an all-zero image, the training set mean, an all-padding token sequence, or a shuffled DNA sequence.
For any neuron t, write t(x) for its activation on the input and t(x') for its activation on the reference. The activation difference is
Delta_t = t(x) - t(x').
DeepLIFT's central object is the contribution score C(Delta_x_i, Delta_t), representing how much the input feature change Delta_x_i = x_i - x'_i contributes to the change in target Delta_t. The defining property is summation-to-delta, also called completeness:
sum_i C(Delta_x_i, Delta_t) = Delta_t.
Contribution scores assigned to input features add up exactly to the change in output activation between input and reference. This is the same conservation idea LRP imposes by construction and that Integrated Gradients proves as an axiom.
Closely related is the multiplier
m(Delta_x, Delta_t) = C(Delta_x, Delta_t) / Delta_x,
the difference-from-reference analog of a partial derivative. Where the gradient measures partial t / partial x at a single point, the multiplier measures the average rate of change of t along the displacement from x' to x. Because it is defined over a finite difference, it does not vanish in saturated regions. DeepLIFT exploits a discrete chain rule
m(Delta_x, Delta_y) = sum_h m(Delta_x, Delta_h) * m(Delta_h, Delta_y)
to propagate multipliers backward through hidden units h. A single backward pass recovers contribution scores for every input feature, which makes DeepLIFT cheap relative to path-integral methods [1].
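A minimal worked example makes the bookkeeping concrete. The sketch below (illustrative weights and inputs, not any real model) chains the linear and rescale multipliers by hand for a one-hidden-unit network y = relu(w1*x1 + w2*x2 + b) and verifies summation-to-delta:

```python
# Hand-computed DeepLIFT pass for y = relu(w1*x1 + w2*x2 + b).
# All numbers are illustrative, not from a trained model.

def relu(z):
    return max(z, 0.0)

w1, w2, b = 2.0, -1.0, 0.5           # made-up parameters
x = (1.0, 1.0)                       # input to explain
x_ref = (0.0, 0.0)                   # reference ("absence of signal")

z     = w1*x[0]     + w2*x[1]     + b
z_ref = w1*x_ref[0] + w2*x_ref[1] + b
y, y_ref = relu(z), relu(z_ref)

# Rescale rule: the multiplier through the nonlinearity is the secant slope.
m_zy = (y - y_ref) / (z - z_ref)

# Linear rule: the multiplier through the linear layer is the weight.
# Chain rule for multipliers: m(Delta_x_i, Delta_y) = w_i * m(Delta_z, Delta_y).
contribs = [w1 * (x[0] - x_ref[0]) * m_zy,
            w2 * (x[1] - x_ref[1]) * m_zy]

# Summation-to-delta: contributions add up exactly to Delta_y.
print(contribs, sum(contribs), y - y_ref)
```

Note that the bias cancels in the difference from reference (b - b = 0), so the feature contributions alone account for the full Delta_y.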
DeepLIFT specifies three propagation rules that together cover the layer types found in standard convolutional and feedforward architectures.
For linear layers, including fully connected layers and convolutions, the contribution of input difference Delta_x_i to the pre-activation difference Delta_z = sum_i w_i * Delta_x_i is exactly w_i * Delta_x_i, and the multiplier is simply the weight w_i. Linearity guarantees both completeness and a unique decomposition, so the linear rule reduces DeepLIFT to ordinary backpropagation along weight matrices. Some variants distinguish positive and negative parts of Delta_x so activation suppression can be tracked alongside activation increase.
The rescale rule applies to elementwise nonlinear activations such as ReLU, sigmoid, and tanh (softmax, which is not elementwise, is usually handled by attributing to the pre-softmax logits). For a nonlinearity y = g(x), the multiplier is
m(Delta_x, Delta_y) = Delta_y / Delta_x,
the slope of the secant line through (x', g(x')) and (x, g(x)). This finite-difference slope is well defined whenever Delta_x is nonzero, and it is precisely the quantity vanilla gradients fail to capture in saturated regions. When Delta_x is zero, the rule falls back to the local derivative g'(x').
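A toy calculation shows why the secant slope matters. For a sigmoid pushed deep into saturation, the local gradient is vanishingly small while the rescale multiplier from a zero reference is not (numbers are illustrative, not library code):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

x, x_ref = 10.0, 0.0                 # saturated input vs zero reference

# Vanilla gradient at the input: s * (1 - s), nearly zero in saturation.
s = sigmoid(x)
local_grad = s * (1.0 - s)           # on the order of 4.5e-5

# Rescale multiplier: slope of the secant through (x', g(x')) and (x, g(x)).
m = (sigmoid(x) - sigmoid(x_ref)) / (x - x_ref)   # about 0.05

print(local_grad, m)
```

The finite-difference multiplier is roughly three orders of magnitude larger than the local gradient here, so the saturated unit still passes importance back to its inputs.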
The rescale rule treats positive and negative input contributions symmetrically, which can hide cancellations. Consider a ReLU whose input is the sum of a strong positive contribution +5 and a strong negative contribution -4. The net input difference is +1, and rescale attributes the contribution to that net flow alone. RevealCancel instead splits Delta_x into positive and negative parts and propagates each through the activation separately, averaging over the order in which the two parts are added, so apparent insensitivity to a feature can be revealed as a cancellation of two competing effects. This matters most for networks with large positive-negative interactions and for genomics models with long-range additive structure [1].
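The +5/-4 cancellation can be worked through directly. This sketch applies RevealCancel's order-averaging formula to a ReLU (toy numbers, not real model values):

```python
def relu(z):
    return max(z, 0.0)

z_ref = 0.0                   # reference pre-activation
dz_pos, dz_neg = 5.0, -4.0    # competing contributions; net Delta_z = +1

# RevealCancel: average each part's effect over both orders of addition.
dy_pos = 0.5 * ((relu(z_ref + dz_pos) - relu(z_ref)) +
                (relu(z_ref + dz_neg + dz_pos) - relu(z_ref + dz_neg)))
dy_neg = 0.5 * ((relu(z_ref + dz_neg) - relu(z_ref)) +
                (relu(z_ref + dz_pos + dz_neg) - relu(z_ref + dz_pos)))

# The split surfaces +3 and -2 instead of rescale's single net +1.
print(dy_pos, dy_neg, dy_pos + dy_neg)   # 3.0 -2.0 1.0
```

The two parts still sum to the net Delta_y of +1, so completeness is preserved, but the competing positive and negative effects are now visible.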
DeepLIFT was designed to satisfy a small set of properties the authors argue are minimal requirements for a faithful attribution method.
Completeness (summation-to-delta). Contribution scores sum exactly to the change in output activation between input and reference. This is the same property that Sundararajan, Taly, and Yan proved for Integrated Gradients [6] and that LRP enforces by construction.
Sensitivity. If the network's output differs between x and x' only because of a single feature, that feature should receive nonzero contribution. Vanilla gradients can fail this in saturated regions, while DeepLIFT's finite-difference formulation preserves it.
Implementation invariance. Two networks computing the same function should produce the same attributions. DeepLIFT does not in fact guarantee this property: because its multipliers depend on the internal structure of the network, functionally equivalent implementations can receive different attributions, a limitation Sundararajan, Taly, and Yan highlighted when motivating Integrated Gradients [6].
Efficiency. A single backward pass computes attributions, in contrast to path-integral methods that require dozens to hundreds of forward evaluations.
Reference dependence. DeepLIFT scores are explicitly relative to a baseline. This is a feature when the analyst has a clear notion of "absence," and a problem when no obvious baseline exists.
The reference is the most consequential choice in a DeepLIFT analysis. Because scores are defined as differences from baseline, the same model can produce very different attributions depending on what the analyst calls "absence" of signal.
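This sensitivity is easy to demonstrate on a toy model (illustrative weights and a made-up "training mean" baseline, not a trained network): the same input attributed against two different references yields visibly different scores.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def deeplift_rescale(x, x_ref, w=(2.0, -1.0), b=0.5):
    """Contributions for y = sigmoid(w.x + b) via the linear + rescale rules."""
    z     = w[0]*x[0]     + w[1]*x[1]     + b
    z_ref = w[0]*x_ref[0] + w[1]*x_ref[1] + b
    m = (sigmoid(z) - sigmoid(z_ref)) / (z - z_ref)  # secant multiplier
    return [w[i] * (x[i] - x_ref[i]) * m for i in range(2)]

x = (1.0, 1.0)
attr_zero = deeplift_rescale(x, (0.0, 0.0))   # all-zero baseline
attr_mean = deeplift_rescale(x, (0.5, 0.5))   # hypothetical training-mean baseline
print(attr_zero)
print(attr_mean)
```

Both attributions satisfy summation-to-delta relative to their own reference, yet the per-feature scores differ substantially, which is the sense in which the baseline encodes the analyst's prior.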
One principled remedy is to average scores over a distribution of references, as DeepSHAP does, with the reference distribution chosen so the averaged scores approximate Shapley values. In masked language models, [MASK] tokens often serve as natural baselines. In genomics, dinucleotide-shuffled sequences preserve local composition while destroying motif structure, providing a meaningful null. The Kundaje lab's tooling, including TF-MoDISco, averages attributions over many shuffled references to stabilize motif discovery [8].

The table below situates DeepLIFT among contemporary attribution methods.

| Method | Year | Core mechanism | Baseline | Cost | Saturation |
|---|---|---|---|---|---|
| Saliency map | 2014 | Vanilla gradient at input | No | One backward pass | Fails |
| LRP | 2015 | Conservation rules per layer | Implicit | One backward pass | Partial |
| LIME | 2016 | Local surrogate model | Perturbations | Many forward passes | Indirect |
| DeepLIFT | 2017 | Multipliers vs reference | Yes, explicit | One backward pass | Robust |
| Integrated Gradients | 2017 | Path integral of gradients | Yes, explicit | Many forward passes | Robust |
| SHAP / DeepSHAP | 2017 | Shapley values, DeepLIFT-based | Reference distribution | Many backward passes | Robust |
| Grad-CAM | 2017 | Weighted feature-map gradients | No | One backward pass | Spatial only |
| SmoothGrad | 2017 | Averaged gradients over noisy inputs | No | Many forward passes | Indirect |
The closest relative is Integrated Gradients. Both satisfy completeness, require an explicit baseline, and produce attributions for arbitrary differentiable models. The practical difference is computational: Integrated Gradients integrates along a straight-line path from baseline to input, typically requiring fifty to two hundred forward and backward evaluations, while DeepLIFT replaces the path integral with a single chain-rule pass using secant-line slopes. The two agree exactly in linear models and closely on many nonlinear architectures, especially when DeepLIFT is averaged over a small reference distribution.
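For the simple case of a linear layer followed by a single elementwise nonlinearity, the two coincide exactly: along the straight-line path, the integral of the gradient collapses to the secant slope. A quick numerical check (toy weights, Riemann-sum approximation of the path integral) makes the agreement concrete:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

w, b = (2.0, -1.0), 0.5              # illustrative parameters
x, x_ref = (1.0, 1.0), (0.0, 0.0)

# DeepLIFT rescale: a single secant-slope pass.
z     = w[0]*x[0]     + w[1]*x[1]     + b
z_ref = w[0]*x_ref[0] + w[1]*x_ref[1] + b
m = (sigmoid(z) - sigmoid(z_ref)) / (z - z_ref)
deeplift = [w[i] * (x[i] - x_ref[i]) * m for i in range(2)]

# Integrated Gradients: midpoint Riemann sum over the straight-line path.
steps = 1000
ig = [0.0, 0.0]
for k in range(steps):
    alpha = (k + 0.5) / steps
    z_a = z_ref + alpha * (z - z_ref)
    g = sigmoid(z_a) * (1.0 - sigmoid(z_a))    # d sigmoid / dz
    for i in range(2):
        ig[i] += g * w[i] * (x[i] - x_ref[i]) / steps

print(deeplift, ig)
```

With 1000 path steps, the two attribution vectors agree to well within 1e-5, while DeepLIFT needed only the single secant computation.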
The relationship to SHAP is tighter. Lundberg and Lee's 2017 paper introduced SHAP as a unifying view grounded in Shapley values and showed that DeepLIFT's propagation rules can be reinterpreted as an efficient approximation of Shapley values when averaged over a reference distribution [2]. The resulting algorithm, DeepSHAP, is one of the most widely used deep model explainers in production today. Grad-CAM and SmoothGrad target different problems: Grad-CAM produces coarse spatial heatmaps from CNN feature maps, while SmoothGrad denoises vanilla gradients by averaging over Gaussian-perturbed inputs.
The original DeepLIFT implementation is the kundajelab/deeplift repository, written by Avanti Shrikumar and maintained by the Kundaje lab at Stanford [3]. Early versions were built on Theano with a Keras frontend; the codebase later moved to Keras-on-TensorFlow. It includes the linear, rescale, and RevealCancel rules along with utilities for reference distributions and integration with TF-MoDISco.
Captum, Meta's open-source PyTorch interpretability library, ships DeepLift and DeepLiftShap classes following the same specification [12]. The SHAP library exposes DeepLIFT-derived methods through its DeepExplainer class [13], which uses backward hooks to override the gradients of nonlinear layers with DeepLIFT's secant-line slopes and averages attributions over a reference distribution to approximate Shapley values. This is how most practitioners interact with DeepLIFT today, often without realizing the underlying engine is Shrikumar's algorithm.
Because DeepLIFT modifies the backward pass for every nonlinearity, every new layer type needs an explicit propagation rule. Standard libraries cover convolutions, dense layers, ReLU, sigmoid, tanh, max-pooling, and batch normalization. Less common operations such as attention and layer normalization need careful adaptation or fall back to the local gradient.
DeepLIFT's most consequential application is in regulatory genomics. Convolutional networks trained on DNA sequences can predict whether a sequence will be bound by a transcription factor, accessible to chromatin remodeling, or transcriptionally active. Biologists want to know which subsequences drove a positive call, since those correspond to candidate binding motifs. DeepLIFT contribution scores per nucleotide, averaged over shuffled-sequence references, provide exactly this information.
TF-MoDISco, introduced by Shrikumar and colleagues from 2018 to 2020, builds on DeepLIFT by clustering high-contribution windows ("seqlets") across many input sequences into recurring motif patterns [8]. It has been used to rediscover known binding sites and propose new ones from models trained on chromatin accessibility and ChIP-seq data. BPNet, from Avsec and colleagues in 2021, is a base-pair-resolution prediction model that pairs convolutional architectures with DeepLIFT and TF-MoDISco interpretation [9], and has become a standard design for sequence-to-profile genomic models.
In image classification, DeepLIFT is a baseline but has been largely displaced by Integrated Gradients, Grad-CAM, and Shapley variants. In NLP, attention rollout, Integrated Gradients, and attribution-patching have won out for transformer architectures. In healthcare and tabular prediction, DeepLIFT and DeepSHAP remain common choices for explaining feedforward and recurrent models, especially when regulators expect a Shapley-style decomposition.
Reference sensitivity. Different baselines produce different attributions, sometimes dramatically. There is no universal recipe for choosing a reference, and the choice encodes the analyst's prior about what "absence of signal" means.
Implementation overhead. Because DeepLIFT modifies the backward pass for every nonlinearity, it requires per-layer propagation rules. Architectures with custom activations, attention, batch normalization, or residual connections need explicit rule choices.
Sanity check failures. Adebayo et al. 2018 ran sanity checks comparing attribution maps before and after randomizing model weights [10]. Several attribution methods, including some configurations of DeepLIFT and Integrated Gradients, produced visually similar maps for trained and randomly initialized networks under certain baseline choices. The result has not invalidated DeepLIFT in well-defined domains such as genomics, where attributions are validated against known biological motifs, but it sharpened community caution about overinterpreting saliency outputs.
Less popular than Integrated Gradients in modern interpretability. Integrated Gradients has overtaken DeepLIFT in many vision and NLP benchmarks, partly because it ships with most major frameworks as a one-line call. DeepLIFT remains dominant in genomics and in DeepSHAP-based production pipelines.
The ICML 2017 DeepLIFT paper has accumulated several thousand citations and is a foundational reference for explainable AI in deep learning. Its influence runs along three lines: it set the template for difference-from-reference attribution that SHAP later generalized to a Shapley-value framework; it became the standard interpretation tool in regulatory genomics through TF-MoDISco and BPNet; and it sits alongside Integrated Gradients and LRP in nearly every survey of attribution methods. Outside genomics, the footprint shows up most strongly through SHAP's DeepExplainer.
The year 2017 was a watershed for attribution. Integrated Gradients appeared in March [6], the revised DeepLIFT paper in April [1] (building on a 2016 preprint, "Not Just a Black Box" [11]), and SHAP later that year [2], retrospectively casting DeepLIFT and Integrated Gradients as approximations of Shapley values. The Kundaje lab was a fertile environment for DeepLIFT because it sits at the intersection of deep learning and computational biology, and its work on convolutional models for chromatin accessibility and transcription-factor binding created a constant need for interpretation tools that could surface biologically meaningful patterns.