DeepLIFT (Deep Learning Important FeaTures) is a feature attribution method for deep neural networks introduced by Avanti Shrikumar, Peyton Greenside, and Anshul Kundaje at Stanford University in 2017 [1]. It assigns each input feature a contribution score by comparing every neuron's activation to a reference activation from a baseline input, then propagating these "differences from reference" backward through the network using modified chain rules. DeepLIFT addresses two failure modes of gradient-based saliency methods, namely gradient saturation and the discontinuities introduced by piecewise-linear activations, and it became a core influence on the SHAP library's DeepSHAP variant and on the broader interpretability literature [2].
DeepLIFT belongs to a 2017 wave of attribution methods that put deep network interpretability on more principled footing. Its single-backward-pass formulation makes it efficient compared to path-integral methods such as Integrated Gradients, and its reference baseline made it especially well suited to genomics and other domains where a "null" input has clear scientific meaning. The method was presented at ICML 2017 with the reference implementation hosted in the Kundaje lab GitHub repository [3].
Before DeepLIFT, the dominant tools for explaining predictions of deep convolutional neural networks were gradient-based saliency techniques. Simonyan, Vedaldi, and Zisserman introduced the saliency map in 2014 by visualizing the gradient of a class score with respect to input pixels [4]. The intuition is reasonable: if a small perturbation to a pixel changes the output a lot, that pixel is locally important. But vanilla gradients have two well-documented problems.
The first is gradient saturation. Activation functions including sigmoid, tanh, and ReLU on the negative side have flat regions where the local gradient is near zero. A firmly saturated neuron still carries information about the prediction, yet its gradient assigns no importance to the inputs driving it. The second is thresholding discontinuities. Piecewise-linear functions such as ReLU produce gradients that flip abruptly at zero, causing attribution scores to swing wildly between nearly identical inputs.
A different lineage, exemplified by Layer-wise Relevance Propagation (LRP) from Bach et al. 2015, treated attribution as a backward flow of conserved relevance through the network [5]. LRP's per-layer rules yielded more faithful explanations than raw gradients but lacked a clean link to a baseline-relative contribution. DeepLIFT combined the propagation viewpoint of LRP with an explicit reference activation, giving it a clean conservation property analogous to the completeness axiom Sundararajan, Taly, and Yan formalized for Integrated Gradients in the same year [6].
Let f be a neural network and let x be an input whose prediction we want to explain. DeepLIFT requires the analyst to choose a reference input x', also called the baseline, meant to encode the absence of meaningful signal: an all-zero image, the training set mean, an all-padding token sequence, or a shuffled DNA sequence.
For any neuron t, write t(x) for its activation on the input and t(x') for its activation on the reference. The activation difference is
Delta_t = t(x) - t(x').
DeepLIFT's central object is the contribution score C(Delta_x_i, Delta_t), representing how much the input feature change Delta_x_i = x_i - x'_i contributes to the change in target Delta_t. The defining property is summation-to-delta, also called completeness:
sum_i C(Delta_x_i, Delta_t) = Delta_t.
Contribution scores assigned to input features add up exactly to the change in output activation between input and reference. This is the same conservation idea LRP imposes by construction and that Integrated Gradients proves as an axiom.
Closely related is the multiplier
m(Delta_x, Delta_t) = C(Delta_x, Delta_t) / Delta_x,
the difference-from-reference analog of a partial derivative. Where the gradient measures partial t / partial x at a single point, the multiplier measures the average rate of change of t along the displacement from x' to x. Because it is defined over a finite difference, it does not vanish in saturated regions. DeepLIFT exploits a discrete chain rule
m(Delta_x, Delta_y) = sum_h m(Delta_x, Delta_h) * m(Delta_h, Delta_y)
to propagate multipliers backward through hidden units h. A single backward pass recovers contribution scores for every input feature, which makes DeepLIFT cheap relative to path-integral methods [1].
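A minimal worked example makes the bookkeeping concrete. The sketch below (illustrative weights and inputs, not any real model) chains the linear and rescale multipliers by hand for a one-hidden-unit network y = relu(w1*x1 + w2*x2 + b) and verifies summation-to-delta:

```python
# Hand-computed DeepLIFT pass for y = relu(w1*x1 + w2*x2 + b).
# All numbers are illustrative, not from a trained model.

def relu(z):
    return max(z, 0.0)

w1, w2, b = 2.0, -1.0, 0.5           # made-up parameters
x = (1.0, 1.0)                       # input to explain
x_ref = (0.0, 0.0)                   # reference ("absence of signal")

z     = w1*x[0]     + w2*x[1]     + b
z_ref = w1*x_ref[0] + w2*x_ref[1] + b
y, y_ref = relu(z), relu(z_ref)

# Rescale rule: the multiplier through the nonlinearity is the secant slope.
m_zy = (y - y_ref) / (z - z_ref)

# Linear rule: the multiplier through the linear layer is the weight.
# Chain rule for multipliers: m(Delta_x_i, Delta_y) = w_i * m(Delta_z, Delta_y).
contribs = [w1 * (x[0] - x_ref[0]) * m_zy,
            w2 * (x[1] - x_ref[1]) * m_zy]

# Summation-to-delta: contributions add up exactly to Delta_y.
print(contribs, sum(contribs), y - y_ref)
```

Note that the bias cancels in the difference from reference (b - b = 0), so the feature contributions alone account for the full Delta_y.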
DeepLIFT specifies three propagation rules that together cover the layer types found in standard convolutional and feedforward architectures.
For linear layers, including fully connected layers and convolutions, the contribution of input difference Delta_x_i to the pre-activation difference Delta_z = sum_i w_i * Delta_x_i is exactly w_i * Delta_x_i, and the multiplier is simply the weight w_i. Linearity guarantees both completeness and a unique decomposition, so the linear rule reduces DeepLIFT to ordinary backpropagation along weight matrices. Some variants distinguish positive and negative parts of Delta_x so activation suppression can be tracked alongside activation increase.
The rescale rule applies to elementwise nonlinear activations such as ReLU, sigmoid, and tanh (softmax, which is not elementwise, is usually handled by attributing to the pre-softmax logits). For a nonlinearity y = g(x), the multiplier is
m(Delta_x, Delta_y) = Delta_y / Delta_x,
the slope of the secant line through (x', g(x')) and (x, g(x)). This finite-difference slope is well defined whenever Delta_x is nonzero, and it is precisely the quantity vanilla gradients fail to capture in saturated regions. When Delta_x is zero, the rule falls back to the local derivative g'(x').
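A toy calculation shows why the secant slope matters. For a sigmoid pushed deep into saturation, the local gradient is vanishingly small while the rescale multiplier from a zero reference is not (numbers are illustrative, not library code):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

x, x_ref = 10.0, 0.0                 # saturated input vs zero reference

# Vanilla gradient at the input: s * (1 - s), nearly zero in saturation.
s = sigmoid(x)
local_grad = s * (1.0 - s)           # on the order of 4.5e-5

# Rescale multiplier: slope of the secant through (x', g(x')) and (x, g(x)).
m = (sigmoid(x) - sigmoid(x_ref)) / (x - x_ref)   # about 0.05

print(local_grad, m)
```

The finite-difference multiplier is roughly three orders of magnitude larger than the local gradient here, so the saturated unit still passes importance back to its inputs.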
The rescale rule treats positive and negative input contributions symmetrically, which can hide cancellations. Consider a ReLU whose input is the sum of a strong positive contribution +5 and a strong negative contribution -4. The net input difference is +1, and rescale attributes the contribution to that net flow alone. RevealCancel instead splits Delta_x into positive and negative parts and propagates each through the activation separately, averaging over the order in which the two parts are added, so apparent insensitivity to a feature can be revealed as a cancellation of two competing effects. This matters most for networks with large positive-negative interactions and for genomics models with long-range additive structure [1].
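The +5/-4 cancellation can be worked through directly. This sketch applies RevealCancel's order-averaging formula to a ReLU (toy numbers, not real model values):

```python
def relu(z):
    return max(z, 0.0)

z_ref = 0.0                   # reference pre-activation
dz_pos, dz_neg = 5.0, -4.0    # competing contributions; net Delta_z = +1

# RevealCancel: average each part's effect over both orders of addition.
dy_pos = 0.5 * ((relu(z_ref + dz_pos) - relu(z_ref)) +
                (relu(z_ref + dz_neg + dz_pos) - relu(z_ref + dz_neg)))
dy_neg = 0.5 * ((relu(z_ref + dz_neg) - relu(z_ref)) +
                (relu(z_ref + dz_pos + dz_neg) - relu(z_ref + dz_pos)))

# The split surfaces +3 and -2 instead of rescale's single net +1.
print(dy_pos, dy_neg, dy_pos + dy_neg)   # 3.0 -2.0 1.0
```

The two parts still sum to the net Delta_y of +1, so completeness is preserved, but the competing positive and negative effects are now visible.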
DeepLIFT was designed to satisfy a small set of properties the authors argue are minimal requirements for a faithful attribution method.
Completeness (summation-to-delta). Contribution scores sum exactly to the change in output activation between input and reference. This is the same property that Sundararajan, Taly, and Yan proved for Integrated Gradients [6] and that LRP enforces by construction.
Sensitivity. If the network's output differs between x and x' only because of a single feature, that feature should receive nonzero contribution. Vanilla gradients can fail this in saturated regions, while DeepLIFT's finite-difference formulation preserves it.
Implementation invariance. Two networks computing the same function should produce the same attributions. DeepLIFT does not in fact guarantee this property: because its multipliers depend on the internal structure of the network, functionally equivalent implementations can receive different attributions, a limitation Sundararajan, Taly, and Yan highlighted when motivating Integrated Gradients [6].
Efficiency. A single backward pass computes attributions, in contrast to path-integral methods that require dozens to hundreds of forward evaluations.
Reference dependence. DeepLIFT scores are explicitly relative to a baseline. This is a feature when the analyst has a clear notion of "absence," and a problem when no obvious baseline exists.
The reference is the most consequential choice in a DeepLIFT analysis. Because scores are defined as differences from baseline, the same model can produce very different attributions depending on what the analyst calls "absence" of signal.
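This sensitivity is easy to demonstrate on a toy model (illustrative weights and a made-up "training mean" baseline, not a trained network): the same input attributed against two different references yields visibly different scores.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def deeplift_rescale(x, x_ref, w=(2.0, -1.0), b=0.5):
    """Contributions for y = sigmoid(w.x + b) via the linear + rescale rules."""
    z     = w[0]*x[0]     + w[1]*x[1]     + b
    z_ref = w[0]*x_ref[0] + w[1]*x_ref[1] + b
    m = (sigmoid(z) - sigmoid(z_ref)) / (z - z_ref)  # secant multiplier
    return [w[i] * (x[i] - x_ref[i]) * m for i in range(2)]

x = (1.0, 1.0)
attr_zero = deeplift_rescale(x, (0.0, 0.0))   # all-zero baseline
attr_mean = deeplift_rescale(x, (0.5, 0.5))   # hypothetical training-mean baseline
print(attr_zero)
print(attr_mean)
```

Both attributions satisfy summation-to-delta relative to their own reference, yet the per-feature scores differ substantially, which is the sense in which the baseline encodes the analyst's prior.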
One principled remedy is to average scores over a distribution of references, as DeepSHAP does, with the reference distribution chosen so the averaged scores approximate Shapley values. In masked language models, [MASK] tokens often serve as natural baselines. In genomics, dinucleotide-shuffled sequences preserve local composition while destroying motif structure, providing a meaningful null. The Kundaje lab's tooling, including TF-MoDISco, averages attributions over many shuffled references to stabilize motif discovery [8].

The table below situates DeepLIFT among contemporary attribution methods.

| Method | Year | Core mechanism | Baseline | Cost | Saturation |
|---|---|---|---|---|---|
| Saliency map | 2014 | Vanilla gradient at input | No | One backward pass | Fails |
| LRP | 2015 | Conservation rules per layer | Implicit | One backward pass | Partial |
| LIME | 2016 | Local surrogate model | Perturbations | Many forward passes | Indirect |
| DeepLIFT | 2017 | Multipliers vs reference | Yes, explicit | One backward pass | Robust |
| Integrated Gradients | 2017 | Path integral of gradients | Yes, explicit | Many forward passes | Robust |
| SHAP / DeepSHAP | 2017 | Shapley values, DeepLIFT-based | Reference distribution | Many backward passes | Robust |
| Grad-CAM | 2017 | Weighted feature-map gradients | No | One backward pass | Spatial only |
| SmoothGrad | 2017 | Averaged gradients over noisy inputs | No | Many forward passes | Indirect |
The closest relative is Integrated Gradients. Both satisfy completeness, require an explicit baseline, and produce attributions for arbitrary differentiable models. The practical difference is computational: Integrated Gradients integrates along a straight-line path from baseline to input, typically requiring fifty to two hundred forward and backward evaluations, while DeepLIFT replaces the path integral with a single chain-rule pass using secant-line slopes. The two agree exactly in linear models and closely on many nonlinear architectures, especially when DeepLIFT is averaged over a small reference distribution.
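For the simple case of a linear layer followed by a single elementwise nonlinearity, the two coincide exactly: along the straight-line path, the integral of the gradient collapses to the secant slope. A quick numerical check (toy weights, Riemann-sum approximation of the path integral) makes the agreement concrete:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

w, b = (2.0, -1.0), 0.5              # illustrative parameters
x, x_ref = (1.0, 1.0), (0.0, 0.0)

# DeepLIFT rescale: a single secant-slope pass.
z     = w[0]*x[0]     + w[1]*x[1]     + b
z_ref = w[0]*x_ref[0] + w[1]*x_ref[1] + b
m = (sigmoid(z) - sigmoid(z_ref)) / (z - z_ref)
deeplift = [w[i] * (x[i] - x_ref[i]) * m for i in range(2)]

# Integrated Gradients: midpoint Riemann sum over the straight-line path.
steps = 1000
ig = [0.0, 0.0]
for k in range(steps):
    alpha = (k + 0.5) / steps
    z_a = z_ref + alpha * (z - z_ref)
    g = sigmoid(z_a) * (1.0 - sigmoid(z_a))    # d sigmoid / dz
    for i in range(2):
        ig[i] += g * w[i] * (x[i] - x_ref[i]) / steps

print(deeplift, ig)
```

With 1000 path steps, the two attribution vectors agree to well within 1e-5, while DeepLIFT needed only the single secant computation.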
The relationship to SHAP is tighter. Lundberg and Lee's 2017 paper introduced SHAP as a unifying view grounded in Shapley values and showed that DeepLIFT's propagation rules can be reinterpreted as an efficient approximation of Shapley values when averaged over a reference distribution [2]. The resulting algorithm, DeepSHAP, is one of the most widely used deep model explainers in production today. Grad-CAM and SmoothGrad target different problems: Grad-CAM produces coarse spatial heatmaps from CNN feature maps, while SmoothGrad denoises vanilla gradients by averaging over Gaussian-perturbed inputs.
The original DeepLIFT implementation is the kundajelab/deeplift repository, written by Avanti Shrikumar and maintained by the Kundaje lab at Stanford [3]. Early versions were built on Theano with a Keras frontend; the codebase later moved to Keras-on-TensorFlow. It includes the linear, rescale, and RevealCancel rules along with utilities for reference distributions and integration with TF-MoDISco.
Captum, Meta's open-source PyTorch interpretability library, ships DeepLift and DeepLiftShap classes following the same specification [12]. The SHAP library exposes DeepLIFT-derived methods through its DeepExplainer class [13], which uses backward hooks to override the gradients of nonlinear layers with DeepLIFT's secant-line slopes and averages attributions over a reference distribution to approximate Shapley values. This is how most practitioners interact with DeepLIFT today, often without realizing the underlying engine is Shrikumar's algorithm.
Because DeepLIFT modifies the backward pass for every nonlinearity, every new layer type needs an explicit propagation rule. Standard libraries cover convolutions, dense layers, ReLU, sigmoid, tanh, max-pooling, and batch normalization. Less common operations such as attention and layer normalization need careful adaptation or fall back to the local gradient.
DeepLIFT's most consequential application is in regulatory genomics. Convolutional networks trained on DNA sequences can predict whether a sequence will be bound by a transcription factor, accessible to chromatin remodeling, or transcriptionally active. Biologists want to know which subsequences drove a positive call, since those correspond to candidate binding motifs. DeepLIFT contribution scores per nucleotide, averaged over shuffled-sequence references, provide exactly this information.
TF-MoDISco, introduced by Shrikumar and colleagues from 2018 to 2020, builds on DeepLIFT by clustering high-contribution windows ("seqlets") across many input sequences into recurring motif patterns [8]. It has been used to rediscover known binding sites and propose new ones from models trained on chromatin accessibility and ChIP-seq data. BPNet, from Avsec and colleagues in 2021, is a base-pair-resolution prediction model that pairs convolutional architectures with DeepLIFT and TF-MoDISco interpretation [9], and has become a standard design for sequence-to-profile genomic models.
In image classification, DeepLIFT is a baseline but has been largely displaced by Integrated Gradients, Grad-CAM, and Shapley variants. In NLP, attention rollout, Integrated Gradients, and attribution-patching have won out for transformer architectures. In healthcare and tabular prediction, DeepLIFT and DeepSHAP remain common choices for explaining feedforward and recurrent models, especially when regulators expect a Shapley-style decomposition.
Reference sensitivity. Different baselines produce different attributions, sometimes dramatically. There is no universal recipe for choosing a reference, and the choice encodes the analyst's prior about what "absence of signal" means.
Implementation overhead. Because DeepLIFT modifies the backward pass for every nonlinearity, it requires per-layer propagation rules. Architectures with custom activations, attention, batch normalization, or residual connections need explicit rule choices.
Sanity check failures. Adebayo et al. 2018 ran sanity checks comparing attribution maps before and after randomizing model weights [10]. Several attribution methods, including some configurations of DeepLIFT and Integrated Gradients, produced visually similar maps for trained and randomly initialized networks under certain baseline choices. The result has not invalidated DeepLIFT in well-defined domains such as genomics, where attributions are validated against known biological motifs, but it sharpened community caution about overinterpreting saliency outputs.
Less popular than Integrated Gradients in modern interpretability. Integrated Gradients has overtaken DeepLIFT in many vision and NLP benchmarks, partly because it ships with most major frameworks as a one-line call. DeepLIFT remains dominant in genomics and in DeepSHAP-based production pipelines.
The ICML 2017 DeepLIFT paper has accumulated several thousand citations and is a foundational reference for explainable AI in deep learning. Its influence runs along three lines: it set the template for difference-from-reference attribution that SHAP later generalized to a Shapley-value framework; it became the standard interpretation tool in regulatory genomics through TF-MoDISco and BPNet; and it sits alongside Integrated Gradients and LRP in nearly every survey of attribution methods. Outside genomics, the footprint shows up most strongly through SHAP's DeepExplainer.
The year 2017 was a watershed for attribution. Integrated Gradients appeared in March [6], the revised DeepLIFT paper in April [1] (building on a 2016 preprint, "Not Just a Black Box" [11]), and SHAP later that year [2], retrospectively casting DeepLIFT and Integrated Gradients as approximations of Shapley values. The Kundaje lab was a fertile environment for DeepLIFT because it sits at the intersection of deep learning and computational biology, and its work on convolutional models for chromatin accessibility and transcription-factor binding created a constant need for interpretation tools that could surface biologically meaningful patterns.