See also: explainable AI, feature importance, LIME, permutation feature importance
SHAP (SHapley Additive exPlanations) is a unified framework for interpreting individual predictions of machine learning models. Developed by Scott Lundberg and Su-In Lee at the University of Washington and first published at NeurIPS 2017, SHAP assigns each input feature a numerical value representing its contribution to a specific prediction. These values are computed using Shapley values, a concept from cooperative game theory introduced by Lloyd Shapley in 1953. SHAP provides a unified approach to interpreting model predictions by connecting several existing explanation methods (including LIME, DeepLIFT, layerwise relevance propagation, and classic Shapley regression values) under a single theoretical framework.
The central question SHAP answers is: for a given prediction, how much did each feature push the prediction away from the average (baseline) prediction? By grounding feature attributions in Shapley values, SHAP is the only additive feature attribution method that simultaneously satisfies three desirable properties: local accuracy, missingness, and consistency. This theoretical guarantee, combined with fast model-specific algorithms (most importantly TreeSHAP for tree ensembles), has made SHAP the de facto standard tool for tabular machine learning interpretability and one of the most widely cited methods in explainable AI.
The open-source Python library shap, which has accumulated more than 25,000 stars on GitHub, provides implementations of several SHAP estimation algorithms (KernelSHAP, TreeSHAP, DeepSHAP, LinearSHAP, PermutationSHAP, and PartitionSHAP) along with a rich set of visualization tools. As of 2026 the package supports Python 3.11 and later and is maintained by a community of contributors after originally being authored by Lundberg.
Imagine you and your friends bake a cake together. One friend brought the flour, another brought eggs, and another brought the sugar. Now you want to figure out how much each friend's ingredient helped make the cake taste good. You could try baking the cake without flour to see how much worse it gets, then try without eggs, and so on. But that is not quite fair, because some ingredients work better together. So instead, you try every possible combination of ingredients and figure out, on average, how much each one helped. That average contribution is what SHAP calculates for each feature in a machine learning prediction.
Lloyd Shapley introduced the Shapley value in 1953, building on his Princeton University doctoral dissertation, "Additive and Non-additive Set Functions." The value itself was published as "A Value for n-Person Games" in Contributions to the Theory of Games, Volume II. Shapley's goal was to answer a fundamental question in cooperative game theory: when a group of players work together and generate some total payoff, how should that payoff be fairly divided?
Shapley proved that there is exactly one division scheme satisfying a small set of fairness axioms. For this and related contributions (most notably the Gale-Shapley deferred acceptance algorithm for stable matching), Shapley shared the 2012 Nobel Memorial Prize in Economic Sciences with Alvin Roth.
The idea of using Shapley values to attribute model outputs to inputs predates SHAP. Strumbelj and Kononenko (2010, 2014) used sampling approximations to explain individual predictions. Lipovetsky and Conklin (2001) used Shapley values to decompose R-squared in linear regression. Owen (2014) connected Shapley values to the Sobol' indices from global sensitivity analysis. None of these earlier methods unified the various existing attribution techniques or provided fast algorithms for modern tree ensembles and deep networks.
The SHAP framework was introduced by Scott Lundberg and Su-In Lee in a 2017 NeurIPS paper titled "A Unified Approach to Interpreting Model Predictions" (arXiv:1705.07874). The paper introduced three key ideas: a class of additive feature attribution methods that includes LIME, DeepLIFT, classic Shapley regression values, and others; a uniqueness theorem proving that within this class only Shapley values satisfy the three properties of local accuracy, missingness, and consistency; and KernelSHAP, a model-agnostic algorithm for estimating Shapley values via weighted linear regression.
A second key paper, "Consistent Individualized Feature Attribution for Tree Ensembles" by Lundberg, Erion, and Lee (arXiv:1802.03888, 2018), introduced TreeSHAP, an exact polynomial-time algorithm for tree ensembles. The 2020 follow-up "From Local Explanations to Global Understanding with Explainable AI for Trees," published in Nature Machine Intelligence with co-authors including Hugh Chen, Alex DeGrave, Jordan Prutkin, Bala Nair, Ronit Katz, Jonathan Himmelfarb, and Nisha Bansal, extended TreeSHAP with interaction values and global aggregation methods and applied SHAP to clinical risk prediction in anesthesia hypoxemia.
The shap Python package was released in 2017 alongside the NeurIPS paper and grew rapidly. By 2025 the original NeurIPS paper had accumulated more than 30,000 citations on Google Scholar, ranking among the most cited machine learning papers of the late 2010s. Lundberg moved from Microsoft Research to industry roles while continuing to maintain the library. Active maintenance later transitioned to the broader open-source community as Lundberg shifted focus.
In a cooperative game with a set N of n players and a value function v that maps each coalition (subset) S of players to a real-valued payoff v(S), the Shapley value for player i is:
phi_i(v) = sum over all S subset of N \ {i}: [ |S|! * (n - |S| - 1)! / n! ] * [ v(S union {i}) - v(S) ]
The term v(S union {i}) - v(S) is the marginal contribution of player i to coalition S. The weighting factor |S|! * (n - |S| - 1)! / n! corresponds to the probability that coalition S forms before player i arrives in a uniformly random permutation of all players. In other words, the Shapley value is the average marginal contribution of a player across all possible orderings in which the coalition could have been assembled.
A more compact way of writing the same quantity uses permutations directly. Let pi be a uniformly random permutation of the players, and let Pre(pi, i) denote the set of players appearing before i in pi. Then phi_i = E[v(Pre(pi, i) union {i}) - v(Pre(pi, i))], where the expectation is over uniformly random permutations.
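To make the formula concrete, the following is a minimal sketch that computes exact Shapley values for a tiny three-player game by direct enumeration of coalitions, echoing the cake example above. The payoff numbers are invented purely for illustration.

```python
# Minimal sketch: exact Shapley values for a 3-player game by enumerating all
# coalitions, mirroring the weighted-sum formula above. Payoffs are invented.
from itertools import combinations
from math import factorial

players = ["flour", "eggs", "sugar"]

payoffs = {
    frozenset(): 0, frozenset({"flour"}): 2, frozenset({"eggs"}): 1,
    frozenset({"sugar"}): 1, frozenset({"flour", "eggs"}): 5,
    frozenset({"flour", "sugar"}): 4, frozenset({"eggs", "sugar"}): 2,
    frozenset(players): 8,
}

def v(coalition):
    return payoffs[frozenset(coalition)]

def shapley(i):
    n = len(players)
    others = [p for p in players if p != i]
    total = 0.0
    for size in range(n):                      # |S| ranges over 0 .. n-1
        for S in combinations(others, size):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            total += weight * (v(set(S) | {i}) - v(S))   # marginal contribution
    return total

phi = {p: shapley(p) for p in players}
print(phi)
print(sum(phi.values()), v(players) - v(()))   # efficiency: sums to v(N) - v(empty)
```

Whatever payoffs are plugged in, the computed values satisfy the efficiency axiom by construction: they sum to the value of the grand coalition minus the value of the empty coalition.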
Shapley proved that the Shapley value is the unique solution satisfying four axioms:
| Axiom | Description |
|---|---|
| Efficiency | The contributions of all players sum to the total value of the grand coalition: sum of phi_i = v(N) - v(empty set) |
| Symmetry | If two players contribute equally to every coalition (v(S union {i}) = v(S union {j}) for all S not containing i or j), they receive equal Shapley values |
| Dummy (Null player) | A player who adds zero marginal contribution to every coalition (v(S union {i}) = v(S) for all S) receives a Shapley value of zero |
| Additivity (Linearity) | For two games v and w defined on the same player set, the Shapley value of the combined game v + w equals the sum of the individual Shapley values: phi_i(v + w) = phi_i(v) + phi_i(w) |
The uniqueness result is striking. Once you accept these four axioms as desirable, the explicit weighted-sum formula is the only possible answer.
In the context of machine learning, the players are input features and the game is a single prediction. The value function v(S) is typically defined as the expected model output when only the features in coalition S are known and the remaining features are marginalized out. The Shapley value for each feature then quantifies its average marginal contribution to the prediction, providing a principled way to distribute credit for the prediction among the input features.
This maps cleanly onto common questions a practitioner asks about a single prediction. "How much did the patient's age push their predicted readmission risk up or down?" becomes "What is the Shapley value of the age feature for this prediction?"
Lundberg and Lee (2017) defined a class of explanation models called additive feature attribution methods. An explanation model g for an original model f is a linear function of binary variables:
g(z') = phi_0 + sum from j=1 to M of phi_j * z_j'
where z' is a vector in {0, 1}^M (a coalition vector indicating which features are present or absent), M is the number of features, and phi_j is the attribution for feature j. The term phi_0 is the base value, typically equal to the expected model output E[f(X)].
When all features are present (all z_j' = 1), the explanation model produces:
g(x) = phi_0 + sum from j=1 to M of phi_j
which, under the local accuracy property, equals the model's actual prediction f(x).
This class of explanation methods is broader than it first appears. Lundberg and Lee showed that LIME, DeepLIFT, layerwise relevance propagation (LRP), and classic Shapley regression values are all special cases of additive feature attribution methods. The differences lie in how each method computes the phi values.
Lundberg and Lee proved that within the class of additive feature attribution methods, there is a unique set of values phi_0, phi_1, ..., phi_M satisfying three properties simultaneously.
1. Local accuracy. The sum of feature attributions plus the base value equals the model prediction for the instance being explained:
f(x) = phi_0 + sum from j=1 to M of phi_j
This corresponds to the efficiency axiom from Shapley value theory. It ensures that the explanation fully accounts for the model's output rather than leaving an unexplained residual.
2. Missingness. If a feature is not present in the simplified input (z_j' = 0), its attribution must be zero:
z_j' = 0 implies phi_j = 0
This is a consistency constraint ensuring that features not available for a prediction are not assigned any credit.
3. Consistency. If a model changes so that a feature's marginal contribution increases or stays the same for every possible coalition, then that feature's SHAP value must not decrease. Formally, for two models f and f', if for all coalitions S not containing feature j:
f'(S union {j}) - f'(S) >= f(S union {j}) - f(S)
then phi_j(f') >= phi_j(f).
Consistency ensures that a feature that becomes more influential in a new model is never assigned lower importance. Lundberg and Lee showed that several popular explanation methods (including standard Shapley regression values, LIME, and DeepLIFT) are additive feature attribution methods but do not all satisfy these three properties simultaneously. The Shapley value is the only solution that does.
The SHAP values for a model f and instance x are defined as the Shapley values of a conditional expectation game. The value function for a coalition S of features is:
v(S) = E[f(X) | X_S = x_S]
where X_S denotes the subset of features in S, fixed to their values in x, and the expectation is taken over the remaining features according to their conditional distribution. In practice many implementations use the marginal distribution instead of the conditional, treating absent features as drawn independently from a background dataset. The choice between marginal and conditional value functions is one of the most consequential modeling decisions in SHAP, with significant downstream effects on which features receive nonzero attribution. See the discussion of conditional vs. marginal SHAP later in this article.
The SHAP framework includes several estimation algorithms tailored to different model classes. Each makes different trade-offs between speed, generality, and exactness.
| Algorithm | Target model | Approach | Exact or approximate |
|---|---|---|---|
| KernelSHAP | Any (model-agnostic) | Weighted linear regression with SHAP kernel | Approximate |
| TreeSHAP | Tree ensembles (random forest, XGBoost, LightGBM, CatBoost) | Recursive tree traversal | Exact |
| DeepSHAP | Neural networks | Modified DeepLIFT backpropagation | Approximate |
| GradientSHAP | Differentiable models | Expected gradients integration | Approximate |
| LinearSHAP | Linear models | Closed-form analytical | Exact |
| PermutationSHAP | Any (model-agnostic) | Antithetic permutation sampling | Approximate (efficiency exact) |
| PartitionSHAP | Any (model-agnostic) | Hierarchical Owen values over feature partition | Approximate |
| Sampling SHAP | Any (model-agnostic) | Monte Carlo over permutations | Approximate |
KernelSHAP is the original model-agnostic algorithm proposed by Lundberg and Lee for estimating SHAP values. It works with any model that can produce predictions but is computationally more expensive than model-specific alternatives.
The algorithm operates in five steps:
Sample coalitions. Generate K random coalition vectors z_k' from {0, 1}^M, where each vector indicates which features are included (1) or excluded (0).
Map to feature space. For each coalition z_k', construct a corresponding input in the original feature space using a mapping function h_x. Features marked as present (1) take their values from the instance x; features marked as absent (0) are replaced with values from randomly sampled data points (approximating the marginal distribution).
Get predictions. Evaluate the model on each mapped input to obtain f(h_x(z_k')).
Weight coalitions. Assign each coalition a weight using the SHAP kernel:
pi_x(z') = (M - 1) / [ C(M, |z'|) * |z'| * (M - |z'|) ]
where C(M, |z'|) is the binomial coefficient and |z'| is the number of features present. This kernel assigns the highest weights to coalitions that are very small or very large, because those provide the most information about individual feature contributions. Coalitions of size 0 and size M are explicitly enforced through constraints rather than sampled.
Fit the weighted regression. Estimate the phi values by solving the weighted least-squares problem

minimize over phi: sum over z' in Z: [ f(h_x(z')) - g(z') ]^2 * pi_x(z')
The resulting regression coefficients are the estimated SHAP values.
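The following is a minimal sketch of the procedure, enumerating all non-trivial coalitions rather than sampling them (feasible only for small M) and drawing absent features from a background dataset (the marginal value function). The names predict, x, and background are placeholders; the library's KernelSHAP adds coalition sampling, regularization, and batching on top of this core idea.

```python
# Minimal KernelSHAP sketch following the five steps above. Assumes x is a 1-D
# numpy array and background is a 2-D numpy array of shape (B, M).
import itertools
import numpy as np
from math import comb

def kernel_shap(predict, x, background):
    M = len(x)
    base = predict(background).mean()              # phi_0 = E[f(X)]
    fx = predict(x.reshape(1, -1))[0]

    coalitions, weights, preds = [], [], []
    # Step 1: enumerate coalitions of size 1..M-1 (sizes 0 and M are handled
    # by the constraints below, matching the infinite kernel weights there).
    for size in range(1, M):
        for subset in itertools.combinations(range(M), size):
            cols = list(subset)
            z = np.zeros(M)
            z[cols] = 1.0
            coalitions.append(z)
            # Step 4: SHAP kernel weight.
            weights.append((M - 1) / (comb(M, size) * size * (M - size)))
            # Steps 2-3: absent features come from the background (marginal).
            masked = background.copy()
            masked[:, cols] = x[cols]
            preds.append(predict(masked).mean())

    Z = np.array(coalitions)
    w = np.array(weights)
    y = np.array(preds) - base

    # Step 5: weighted least squares with the efficiency constraint
    # sum(phi) = f(x) - E[f(X)], imposed by eliminating the last coefficient.
    ZT = Z[:, :-1] - Z[:, [-1]]
    yT = y - Z[:, -1] * (fx - base)
    W = np.diag(w)
    phi_rest = np.linalg.solve(ZT.T @ W @ ZT, ZT.T @ W @ yT)
    phi_last = (fx - base) - phi_rest.sum()
    return base, np.append(phi_rest, phi_last)
```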
Exact computation of Shapley values requires evaluating 2^M coalitions, which is exponential in the number of features. KernelSHAP approximates the values by sampling a subset of coalitions and fitting the weighted regression, making it tractable for models with many features. However, it remains slow for large-scale applications because each coalition requires a model evaluation and many samples are needed for stable estimates. Variance in the estimates means that running KernelSHAP twice on the same instance can produce slightly different SHAP values.
Lundberg and Lee showed that LIME (Local Interpretable Model-agnostic Explanations) by Ribeiro, Singh, and Guestrin (2016) is itself an additive feature attribution method. LIME fits a local linear model around the instance being explained using perturbed samples, but it uses a different kernel function and does not guarantee the Shapley properties. KernelSHAP can be understood as a specific configuration of LIME that uses the SHAP kernel and a particular regularization setup, which yields Shapley values as the unique optimal solution. In effect, SHAP is what you get when LIME is forced to satisfy the Shapley axioms.
TreeSHAP is a fast, exact algorithm for computing SHAP values for tree-based models, including decision trees, random forests, gradient boosting machines (XGBoost, LightGBM, CatBoost), and other tree ensemble methods. It was introduced by Lundberg, Erion, and Lee in 2018 and refined in the 2020 Nature Machine Intelligence paper.
TreeSHAP exploits the structure of decision trees to compute SHAP values in polynomial time rather than exponential time. Instead of enumerating all 2^M feature subsets, the algorithm recursively traverses the tree and tracks which feature coalitions lead to which leaf predictions. At each internal node, the algorithm maintains a running account of how many coalitions include or exclude the splitting feature, allowing it to accumulate Shapley values as it walks through the tree.
For an ensemble of trees, the SHAP values are computed independently for each tree and then combined (averaged for random forests, summed for gradient boosting).
TreeSHAP reduces the complexity from O(T * L * 2^M) for exact KernelSHAP applied to a tree ensemble to O(T * L * D^2), where T is the number of trees, L is the maximum number of leaves, and D is the maximum tree depth. Since D is typically much smaller than M (the number of features), this represents a dramatic speedup. For a 1000-tree gradient boosting model with 100 features and depth 6, TreeSHAP can typically compute exact SHAP values for tens of thousands of instances in seconds, while exact KernelSHAP on the same model would be infeasible.
TreeSHAP has two variants that differ in how they handle absent features:
| Variant | Value function | Feature independence assumption | Axiom compliance |
|---|---|---|---|
| Interventional | E[f(X) \| do(X_S = x_S)] using a background dataset | Treats absent features as independent draws from the background | Satisfies all Shapley axioms including dummy |
| Path-dependent (conditional) | E[f(X) \| X_S = x_S] estimated from tree structure | Uses path frequencies in the tree as a proxy for the conditional distribution | May violate the dummy axiom for correlated features |
The interventional variant computes standard Shapley values by treating absent features as independent of present features. It requires a background dataset and is slower but axiomatically clean. The path-dependent variant is faster (no background dataset required) and follows the tree's branching structure to estimate conditional expectations, but can assign nonzero SHAP values to features the model does not use when those features are correlated with features the model does use.
The SHAP library exposes both via the feature_perturbation argument of TreeExplainer, with values "interventional" and "tree_path_dependent". The choice has been a source of confusion and debate, particularly after Janzing, Minorics, and Bloebaum (2020) argued that interventional Shapley values are the only causally correct version (see Critiques below).
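A hedged usage sketch of the two variants with an XGBoost model follows; X_train, y_train, and X_test are placeholder pandas objects, not part of the library.

```python
# Hedged sketch: both TreeExplainer variants on an XGBoost model. X_train,
# y_train, and X_test are placeholders and not defined here.
import shap
import xgboost

model = xgboost.XGBRegressor(n_estimators=200, max_depth=4).fit(X_train, y_train)

# Interventional: requires a background dataset; satisfies the dummy axiom.
expl_int = shap.TreeExplainer(
    model,
    data=X_train.sample(200, random_state=0),
    feature_perturbation="interventional",
)

# Path-dependent: no background needed; uses the tree's own cover statistics.
expl_path = shap.TreeExplainer(model, feature_perturbation="tree_path_dependent")

shap_values = expl_int.shap_values(X_test)   # shape (n_instances, n_features)
```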
TreeSHAP also supports the computation of SHAP interaction values, which decompose each prediction into main effects and pairwise interaction effects. The interaction value for features i and j is:
phi_ij = sum over S not containing i or j: [ |S|! * (M - |S| - 2)! / (2 * (M - 1)!) ] * delta_ij(S)
where delta_ij(S) = f(S union {i, j}) - f(S union {i}) - f(S union {j}) + f(S). The main effects phi_ii and the interaction effects phi_ij sum to the overall SHAP value for feature i: phi_i = phi_ii + sum over j != i of phi_ij. Interaction values let analysts ask not only "how much did age contribute" but also "how much of age's contribution depended on income."
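Continuing the sketch above, interaction values can be requested from the same TreeExplainer; the row sums of each instance's interaction matrix should recover the ordinary SHAP values, as the identity above states.

```python
# SHAP interaction values from the path-dependent explainer defined above.
import numpy as np

inter = expl_path.shap_interaction_values(X_test)   # shape (n, M, M)
phi = expl_path.shap_values(X_test)                  # shape (n, M)
print(np.abs(inter.sum(axis=2) - phi).max())         # should be near zero
```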
DeepSHAP is an algorithm for estimating SHAP values for deep learning models, including neural networks built with TensorFlow, Keras, and PyTorch. It builds on a connection between SHAP and DeepLIFT, an earlier attribution method by Shrikumar, Greenside, and Kundaje (2017).
DeepLIFT assigns attribution scores by comparing each neuron's activation to a reference activation and propagating contribution scores backward through the network using custom rules. Lundberg and Lee showed that the per-neuron attribution rules in DeepLIFT can be chosen to approximate Shapley values.
DeepSHAP extends DeepLIFT in two ways. First, instead of a single reference input, DeepSHAP uses a distribution of background samples, averaging the DeepLIFT attributions over multiple reference points. This better approximates the expectation over absent features. Second, DeepSHAP uses Shapley equations to linearize nonlinear operations such as max, softmax, and element-wise products, improving the quality of the approximation for complex architectures.
The modified backpropagation rules define multipliers at each layer that relate input differences to output differences, analogous to how gradients flow backward through the network. These multipliers follow a chain rule through the layers, enabling efficient attribution computation in a single backward pass.
DeepSHAP computes approximate, not exact, SHAP values. The approximation quality depends on the network architecture and the choice of background samples. For architectures with complex interactions between layers, the layer-wise decomposition may not perfectly correspond to true Shapley values. KernelSHAP can be used as a model-agnostic alternative for deep learning models when higher accuracy is needed, though at much greater computational cost.
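A hedged usage sketch of DeepSHAP through shap.DeepExplainer for a Keras or PyTorch model: model, X_train, and X_test are placeholders, and a background of roughly 100 samples is a common choice.

```python
# Hedged sketch: DeepSHAP via shap.DeepExplainer. model, X_train, and X_test
# are placeholders; ~100 background samples is a typical choice.
import numpy as np
import shap

background = X_train[np.random.choice(len(X_train), 100, replace=False)]
explainer = shap.DeepExplainer(model, background)
shap_values = explainer.shap_values(X_test[:10])   # attributions per model output
```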
GradientSHAP estimates SHAP values by integrating gradients along a path from a baseline input to the actual input, similar in spirit to Integrated Gradients (Sundararajan, Taly, and Yan, 2017). Where Integrated Gradients uses a single baseline (often the zero vector), GradientSHAP averages over many baselines drawn from a background distribution, which converts the gradient-based attribution into an estimator of expected SHAP values.
GradientSHAP is faster than KernelSHAP for differentiable models and avoids the path-dependent quirks of DeepSHAP. It is sometimes called "expected gradients" and is implemented in the shap.GradientExplainer class.
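shap.GradientExplainer exposes a similar interface; the sketch below reuses the placeholder names from the DeepSHAP sketch above and averages gradient-based attributions over baselines drawn from the background.

```python
# Hedged sketch: GradientSHAP (expected gradients) via shap.GradientExplainer.
import shap

explainer = shap.GradientExplainer(model, background)
shap_values = explainer.shap_values(X_test[:10], nsamples=200)  # more samples, less variance
```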
For a linear model f(x) = beta_0 + sum of beta_j * x_j, the SHAP value for feature j is phi_j = beta_j * (x_j - E[X_j]). No iterative estimation is needed. LinearSHAP is the trivial case of SHAP that recovers a familiar quantity from regression analysis: the contribution of each feature to a prediction relative to its mean. When features are correlated, shap.LinearExplainer offers a conditional variant via feature_perturbation="correlation_dependent" that accounts for the covariance structure.
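A self-contained sketch of the closed form on synthetic data follows; the final check verifies local accuracy (base value plus attributions equals the prediction).

```python
# LinearSHAP in closed form: phi_j = beta_j * (x_j - E[X_j]).
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.1, size=500)

model = LinearRegression().fit(X, y)
x = X[0]
phi = model.coef_ * (x - X.mean(axis=0))      # per-feature SHAP values
base = model.predict(X).mean()                # phi_0 = E[f(X)]
assert np.isclose(base + phi.sum(), model.predict(x.reshape(1, -1))[0])
```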
PermutationSHAP generates random feature orderings, builds coalitions progressively, and estimates marginal contributions by comparing predictions with and without each feature. The method uses antithetic sampling, processing each permutation both forward and backward, to reduce variance. It has the advantage that the SHAP values always sum exactly to f(x) - E[f(X)] regardless of the number of samples, satisfying the efficiency property exactly even with finite samples.
PermutationSHAP is the default model-agnostic explainer in newer versions of the shap library, replacing KernelSHAP for many use cases. It tends to be more sample-efficient and avoids some of the numerical conditioning issues that can arise when fitting the weighted regression in KernelSHAP.
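A minimal from-scratch sketch of the idea, with placeholder names (predict, x, background); the library's PermutationExplainer adds maskers, batching, and convergence diagnostics. Because the marginal contributions along each permutation telescope, the returned values sum exactly to f(x) - E[f(X)].

```python
# Sketch of permutation-based SHAP estimation with antithetic sampling.
# Assumes x is a 1-D numpy array and background is a 2-D array of shape (B, M).
import numpy as np

def permutation_shap(predict, x, background, n_permutations=10, seed=None):
    rng = np.random.default_rng(seed)
    M = len(x)
    phi = np.zeros(M)
    for _ in range(n_permutations):
        order = rng.permutation(M)
        for perm in (order, order[::-1]):      # antithetic: forward and reverse
            masked = background.copy()         # start with no features known
            prev = predict(masked).mean()
            for j in perm:                     # reveal features one at a time
                masked[:, j] = x[j]
                curr = predict(masked).mean()
                phi[j] += curr - prev          # marginal contribution of j
                prev = curr
    return phi / (2 * n_permutations)
```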
PartitionSHAP computes Owen values over a hierarchical partition of features. The partition can be supplied by the user (for example, grouping pixels of an image into regions) or learned from feature correlations. PartitionSHAP is particularly useful for high-dimensional inputs like images and text, where attributing importance to individual pixels or tokens is less interpretable than attributing it to regions or phrases.
Jilei Yang of LinkedIn proposed Fast TreeSHAP in a 2021 paper (arXiv:2109.09847), accelerating TreeSHAP by reducing redundant computations. Fast TreeSHAP v1 achieves a 1.5x speedup at the same memory cost; Fast TreeSHAP v2 achieves a 2.5x speedup at slightly higher memory cost. Both enable parallel multi-core computing and share an API with TreeSHAP in shap. The implementation is open-sourced as the linkedin/FastTreeSHAP repository.
SHAP values are local in nature: each set of phi values explains a single prediction. Aggregating SHAP values across many predictions yields global summaries. Three common aggregations are the mean absolute SHAP value per feature (a single global importance score), the full distribution of per-instance SHAP values for each feature (visualized as a beeswarm summary plot), and dependence curves that plot a feature's SHAP values against its raw values.
Global summaries derived from SHAP have a useful property absent from many traditional importance measures: they are guaranteed to be consistent. If model A relies on a feature more than model B for every coalition, the mean absolute SHAP for that feature in A is at least as large as in B.
The SHAP library provides a suite of visualization tools that have become standard in machine learning practice. These visualizations operate at both the individual prediction level (local explanations) and the dataset level (global explanations).
The waterfall plot explains a single prediction by showing how each feature's SHAP value pushes the prediction from the base value (expected model output) to the final prediction. It starts at the bottom with phi_0 = E[f(X)] and shows each feature's contribution as a colored bar: red bars push the prediction higher, blue bars push it lower. Features are ordered by the magnitude of their SHAP values. The waterfall structure visually reinforces the additive nature of SHAP, since the positive and negative contributions stack up to produce the final prediction at the top.
The force plot presents the same information as the waterfall plot in a more compact, horizontal format. Positive SHAP values (features pushing the prediction up) appear on one side and negative SHAP values (features pushing it down) appear on the other, as if competing against each other. Force plots can also be stacked vertically for multiple instances, creating an interactive visualization that reveals patterns across a dataset. When instances are ordered by prediction value or similarity, the stacked force plot shows how feature effects shift across the data distribution.
The summary plot provides a global view of feature importance and effects. For each feature, it plots every instance's SHAP value as a point along the horizontal axis, with features ordered vertically by mean absolute SHAP value. Points are colored by the feature's actual value (typically red for high values, blue for low). The resulting beeswarm pattern reveals which features matter most (wider horizontal spread), the direction of effect (whether high feature values increase or decrease predictions), and the distribution of effects across the dataset.
The beeswarm plot is among the most cited visualizations in applied machine learning papers. It compresses a large amount of information into a compact figure and quickly answers the questions a reader is most likely to ask.
The bar plot is a simpler global importance summary. It displays the mean absolute SHAP value for each feature, averaged across all instances. This single number per feature measures the average magnitude of each feature's contribution, regardless of direction. It serves as a SHAP-based analog of traditional feature importance rankings but is grounded in a theoretically sound attribution framework.
The SHAP dependence plot shows the relationship between a feature's value (x-axis) and its SHAP value (y-axis) across all instances. Each dot represents one data point. The plot reveals nonlinear relationships, thresholds, and saturation effects that the model has learned. It can be enhanced by coloring points according to a second feature's value, which highlights interaction effects. For example, if the effect of age on a prediction depends on income, this interaction will appear as distinct color patterns in the dependence plot.
Dependence plots are particularly valuable for detecting model artifacts. A dependence plot that shows wild scatter rather than a coherent trend often indicates feature interactions, data quality issues, or model overfitting on small subsets of the data.
The decision plot shows how a model arrives at a prediction by tracing the cumulative sum of SHAP values from the base value to the final prediction. Features are listed vertically (typically ordered by importance), and the cumulative contribution is traced as a line from bottom to top. When multiple instances are plotted together, crossing or diverging lines reveal where different predictions start to differ, making it useful for comparing groups of predictions or identifying outliers.
The heatmap plot displays SHAP values for many instances at once, with rows for features and columns for instances. When instances are clustered by prediction or SHAP-vector similarity, the heatmap reveals subgroups the model treats differently, which is useful for fairness audits and detecting subpopulations where the model diverges from the global pattern.
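A hedged sketch of the corresponding plotting calls using the Explanation-object API: explainer and X_test are placeholders, and the feature names "age" and "income" are illustrative.

```python
# Sketch of the plotting calls described above; names are placeholders.
import shap

explanation = explainer(X_test)                     # a shap.Explanation object

shap.plots.waterfall(explanation[0])                # one prediction, waterfall
shap.plots.force(explanation[0])                    # same prediction, force plot
shap.plots.beeswarm(explanation)                    # global summary (beeswarm)
shap.plots.bar(explanation)                         # mean |SHAP| per feature
shap.plots.scatter(explanation[:, "age"],           # dependence plot for "age",
                   color=explanation[:, "income"])  #   colored by "income"
shap.plots.heatmap(explanation)                     # instances-by-features heatmap
```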
SHAP and LIME are two of the most widely used model-agnostic explanation methods. While both explain individual predictions, they differ in theoretical foundation, stability, and scope.
| Aspect | SHAP | LIME |
|---|---|---|
| Theoretical basis | Shapley values from cooperative game theory | Local surrogate model (weighted linear regression) |
| Axioms satisfied | Local accuracy, missingness, consistency | No formal guarantees |
| Stability | Deterministic for exact methods (TreeSHAP, LinearSHAP); KernelSHAP has sampling variance that shrinks as more coalitions are evaluated | Can produce different explanations on repeated runs due to random perturbations |
| Global interpretability | Yes; SHAP values aggregate naturally to global summaries | Primarily local; global aggregation is ad hoc |
| Computational cost | Expensive for KernelSHAP; fast for TreeSHAP and LinearSHAP | Generally faster for a single explanation |
| Model-specific speedups | TreeSHAP, DeepSHAP, LinearSHAP | None; always model-agnostic |
| Feature interactions | Supported (SHAP interaction values) | Not natively supported |
| Visualization ecosystem | Extensive built-in visualizations | Limited; typically custom plots |
| Ease of understanding | Requires understanding of Shapley values | Conceptually simpler (local linear approximation) |
In practice, SHAP tends to be preferred when consistency, stability, and theoretical rigor are priorities, especially for tree-based models where TreeSHAP makes computation efficient. LIME may be preferred for quick, one-off explanations where speed matters more than formal guarantees, or when the audience benefits from LIME's more intuitive local-linear framing.
Permutation feature importance is a global, model-agnostic measure of feature importance. It works by shuffling the values of one feature and measuring how much the model's loss increases. Important features cause large drops in performance when shuffled.
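For contrast with SHAP, here is a brief sketch of permutation importance via scikit-learn's permutation_importance; model, X_valid, and y_valid are placeholders.

```python
# Global permutation importance: shuffle one feature at a time and measure how
# much the score degrades. Names are placeholders.
from sklearn.inspection import permutation_importance

result = permutation_importance(model, X_valid, y_valid,
                                n_repeats=10, random_state=0)
print(result.importances_mean)   # one global score per feature
```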
| Aspect | SHAP | Permutation importance |
|---|---|---|
| Granularity | Per-prediction (local) and aggregable (global) | Global only |
| Output | Signed contribution for each feature for each prediction | Single nonnegative score per feature |
| Cost | Fast for tree models (TreeSHAP); expensive for KernelSHAP | One model evaluation per (feature, permutation, instance) tuple |
| Sensitivity to correlation | Depends on marginal vs. conditional choice | Inflates importance for correlated features (each can be shuffled separately) |
| Use case | Per-prediction debugging, regulatory explanation, feature interaction analysis | Quick global ranking, feature engineering decisions |
Permutation importance is simpler and faster to compute when only a global ranking is needed. SHAP is more nuanced and supports per-prediction explanations, which is essential in regulated domains where individual decisions must be justified.
Integrated Gradients (Sundararajan, Taly, and Yan, 2017) is a gradient-based attribution method for differentiable models. It satisfies two axioms (Sensitivity and Implementation Invariance) and is widely used for interpreting deep neural networks, including image classifiers and increasingly large language models.
| Aspect | SHAP | Integrated Gradients |
|---|---|---|
| Foundation | Cooperative game theory (Shapley values) | Path integral of gradients from baseline to input |
| Axioms | Local accuracy, missingness, consistency | Sensitivity, implementation invariance |
| Model class | Any (with model-specific variants) | Differentiable only |
| Baseline | Background distribution | Single baseline (e.g., zero, blurred image) |
| Cost | TreeSHAP fast for trees; KernelSHAP slow for general models | Forward and backward passes along a discretized integration path (typically 20 to 100 steps) |
| Image and language interpretability | Less common; PartitionSHAP for tokens | Standard tool for vision and LLM attribution |
For differentiable models, GradientSHAP / Expected Gradients can be viewed as a Shapley-respecting extension of Integrated Gradients that averages over a distribution of baselines. Sundararajan and Najmi (2020) gave a careful axiomatic comparison and introduced "Baseline Shapley" as a third alternative.
| Algorithm | Time | Space | Notes |
|---|---|---|---|
| Exact Shapley | O(2^M * T_f) | O(M) | Exponential in M; infeasible beyond about 20 features |
| KernelSHAP | O(K * T_f) | O(K * M) | K sampled coalitions; T_f one prediction; K must be large for stability |
| TreeSHAP | O(T * L * D^2) | O(T * L * D) | T trees, L leaves, D depth |
| Fast TreeSHAP v1 | TreeSHAP / 1.5 | same as TreeSHAP | Drops redundant subtree visits |
| Fast TreeSHAP v2 | TreeSHAP / 2.5 | higher | Caches subtree summaries |
| DeepSHAP | O(B * T_bp) | O(model size) | B background samples; T_bp one backward pass |
| GradientSHAP | O(B * S * T_bp) | O(B + model size) | S integration steps |
| LinearSHAP | O(M) | O(M) | Analytical |
| PermutationSHAP | O(P * M * T_f) | O(M) | P permutations (default around 10) |
For tree models, TreeSHAP makes SHAP practical at the scale of millions of instances and hundreds of features. For deep learning, DeepSHAP and GradientSHAP balance accuracy and speed. For truly model-agnostic explanations, KernelSHAP or PermutationSHAP are available but slow for large feature sets or expensive models. For very large models like LLMs, even one forward pass is costly, motivating token-level alternatives such as PartitionSHAP or mechanistic interpretability methods.
SHAP has been adopted across numerous fields where understanding model predictions is important for trust, regulation, or scientific insight.
SHAP is widely used to explain predictions from clinical risk models, diagnostic classifiers, and drug response predictors. Lundberg et al. (2018, 2020) applied TreeSHAP to a gradient-boosted tree model for anesthesia hypoxemia risk, demonstrating how SHAP values could help clinicians understand which patient characteristics (BMI, age, procedure type, time-varying vital signs) contributed to elevated risk during a surgical procedure. The 2020 Nature Machine Intelligence paper showed that SHAP-based monitoring could detect rising hypoxemia risk earlier than the model's raw probability output, because the SHAP decomposition highlights when previously stable features start contributing.
The method has also been applied to predicting hospital readmission, sepsis onset, ICU mortality, disease diagnosis from imaging, drug response prediction, and treatment effect heterogeneity. In each case the per-prediction explanation is what makes SHAP attractive: clinicians want to know not only the risk score but why a particular patient's score is what it is.
In financial services, regulatory requirements often mandate that automated decisions be explainable. The EU General Data Protection Regulation (GDPR) provides a much-debated "right to explanation" via Article 22, and the U.S. Equal Credit Opportunity Act (ECOA) requires lenders to provide adverse-action notices specifying why a credit application was denied. SHAP provides feature-level explanations for credit scoring models, fraud detection systems, and algorithmic trading strategies. SHAP values help identify which financial indicators (income, debt-to-income ratio, payment history, recent credit inquiries) drive individual credit decisions.
Researchers have built interpretable credit scorecards directly from SHAP values, mapping continuous SHAP contributions to integer point scores reminiscent of traditional credit scoring tables. A 2024 Risks journal study examined SHAP stability in credit risk management and found that SHAP can become significantly less consistent as class imbalance increases, raising concerns about its reliability for severely imbalanced fraud datasets unless carefully validated.
SHAP is applied in fraud detection pipelines to explain why a particular transaction was flagged as suspicious. Summary plots and force plots help investigators understand which transaction features (amount, time, location, merchant category, device fingerprint, behavioral signals) contributed most to the fraud score, allowing analysts to prioritize investigations and refine detection rules. A 2025 review in Artificial Intelligence Review on model-agnostic explainable AI methods in finance identified SHAP and LIME as the dominant tools, with SHAP preferred when consistency across explanations matters.
Insurers use SHAP to explain individual premium calculations and claims decisions. The combination of TreeSHAP with gradient-boosted models is particularly common because gradient boosting dominates insurance modeling and TreeSHAP provides exact attributions in seconds.
Microsoft has integrated SHAP into its responsible-AI tooling and ran SHAP-based interpretability research within Microsoft Research while Lundberg was on staff. Banks, health insurers, manufacturing firms, logistics companies, and cloud providers have published SHAP case studies. LinkedIn maintains the Fast TreeSHAP open-source project for tree-model deployments at scale.
In NLP, SHAP can explain text classification, sentiment analysis, and named entity recognition predictions by attributing importance to individual words or tokens. PartitionSHAP and KernelSHAP are typically used because text inputs do not naturally fit the tree or deep learning-specific methods. The SHAP library includes specialized support for text data through its masker and tokenizer classes.
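A hedged sketch of token-level SHAP for a Hugging Face sentiment pipeline, following the pattern in the shap documentation; the example sentence is illustrative and the output label name ("POSITIVE") depends on the underlying model.

```python
# Sketch: token-level SHAP for a transformers sentiment pipeline.
import shap
import transformers

classifier = transformers.pipeline("sentiment-analysis", return_all_scores=True)
explainer = shap.Explainer(classifier)
explanation = explainer(["The film was slow at first but ultimately rewarding."])
shap.plots.text(explanation[0, :, "POSITIVE"])   # highlight per-token attributions
```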
SHAP has been used to interpret models predicting air quality, wildfire risk, crop yield, and climate variables. In these domains, understanding feature contributions helps researchers validate that models are learning physically meaningful relationships rather than spurious correlations. SHAP values are also used in causal-inference adjacent workflows where the analyst wants to rank predictors before designing follow-up experiments.
SHAP is less commonly used for LLM interpretability than gradient-based methods like Integrated Gradients or mechanistic interpretability tools. The cost of a single LLM forward pass and the high token dimensionality make even PermutationSHAP impractical for explaining a generated response token by token. A 2024 line of research (e.g., Pelaez et al., arXiv:2409.00079) instead uses LLMs to translate raw SHAP outputs into natural-language explanations. SHAP remains the tool of choice when the underlying model is a tabular tree ensemble rather than the LLM itself.
SHAP has been the subject of significant critical literature. The key debates are theoretical, not implementation-level: they concern what Shapley values mean in the context of a learned model and whether they answer the question users actually care about.
The most important debate concerns the value function v(S). Two natural choices give different SHAP values, and the choice affects whether the dummy axiom holds.
Conditional Shapley uses v(S) = E[f(X) | X_S = x_S], the expected prediction conditioned on the observed values of the present features. This respects the data distribution: the model is only ever evaluated on plausible inputs. But it can violate the dummy axiom. A feature that the model genuinely does not use can still receive a nonzero SHAP value if it is correlated with a feature the model does use.
Interventional Shapley uses v(S) = E[f(X) | do(X_S = x_S)], the expected prediction when we intervene to set the present features and let the absent features vary independently from a background dataset. This satisfies the dummy axiom but evaluates the model on inputs that may be implausible or impossible (for example, pairing a very tall height with an implausibly low body weight).
Janzing, Minorics, and Bloebaum (AISTATS 2020, "Feature relevance quantification in explainable AI: A causal problem") argued forcefully that the interventional version is the only correct one, framing the choice in causal terms via Pearl's do-calculus. They argued that the SHAP package's defaults at the time, which used path-dependent (conditional-style) attribution for TreeSHAP, were causally incorrect. The SHAP library has since added clearer support for the interventional variant via the feature_perturbation="interventional" option.
Heskes, Sijben, Bucur, and Claassen (NeurIPS 2020, "Causal Shapley Values") proposed a third option: causal Shapley values that respect a known causal graph and separate direct from indirect effects. These require structural assumptions beyond what SHAP normally needs but offer more meaningful attributions when a causal model is available.
Sundararajan and Najmi (ICML 2020, "The Many Shapley Values for Model Explanation") catalogued at least three distinct ways to operationalize Shapley values for model explanation: Conditional Expectation Shapley (CES), Interventional Shapley, and Baseline Shapley (BShap), each with different axiomatic properties. They argued that the uniqueness theorem from Shapley (1953) does not apply to model explanation because the choice of value function is a modeling decision, not given by the problem. They advocated for Baseline Shapley as a principled choice with its own uniqueness result and contrasted it with Integrated Gradients.
This paper undercuts a common framing of SHAP as "the unique correct attribution method." The uniqueness is conditional on a particular value function; different value functions give different Shapley values, and the choice involves trade-offs.
Kumar, Venkatasubramanian, Scheidegger, and Friedler (ICML 2020, "Problems with Shapley-value-based explanations as feature importance measures") argued that mathematical problems arise when Shapley values are used as feature importance measures and that fixes to those problems introduce additional complexity, including the need for causal reasoning. Drawing on additional literature, they argued that Shapley values do not provide explanations that suit human-centered goals of explainability. The paper is one of the most cited theoretical critiques of SHAP.
Slack, Hilgard, Jia, Singh, and Lakkaraju (AIES 2020, "Fooling LIME and SHAP") demonstrated a scaffolding technique that hides the biases of a classifier from post hoc explanation methods. Because both LIME and KernelSHAP rely on perturbed inputs that fall outside the natural data distribution, an adversary can construct a model that behaves one way on real inputs and another way on the perturbed inputs the explainer evaluates. The scaffolded model can be arbitrarily biased (the authors demonstrate a racist classifier built on the COMPAS recidivism dataset) yet the SHAP and LIME explanations look innocuous. The attack does not work against TreeSHAP (which uses the model's own tree structure) but it does work against any model-agnostic perturbation-based method.
This result has practical implications for regulatory use of SHAP: an adversary who knows that SHAP will be used for audit can engineer the model to evade it. Defenses include using interventional variants where possible, validating explanations on held-out manipulated inputs, and combining post hoc explanations with intrinsic interpretability.
SHAP values explain how features contribute to a model's prediction, not how features relate to the real-world outcome. A feature can receive a high SHAP value because the model relies on it, even if the feature is a spurious correlate of the true causal mechanism. SHAP does not distinguish between causal and correlational relationships in the underlying data-generating process. This is a feature of the framing (SHAP is about the model, not the world) but is often misunderstood by stakeholders who interpret SHAP rankings as if they were causal effect sizes.
When features are highly multicollinear, SHAP may assign a large attribution to one of the correlated features and near-zero attribution to the others, even if all are genuinely relevant. This can mislead users into thinking certain features are unimportant when they are actually informative but redundant with another feature. The behavior is a property of the Shapley value (credit must be split among interchangeable players) and not a bug, but practitioners often find it counterintuitive.
The base value phi_0 (typically E[f(X)]) represents the prediction when no features are known. The choice of background dataset used to estimate this expectation can influence the resulting SHAP values. Different background datasets (the full training set, a subsample, or a specific reference group) can produce different explanations. Best practice is to choose the background to match the question being asked. To explain a prediction "relative to typical patients," use the training set; to explain it "relative to healthy patients," use a healthy subset.
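A short illustration of how the background sets the reference point: the same tree model explained against two different backgrounds produces different base values, and hence different attributions. model, X_train, and healthy_mask are placeholders.

```python
# Sketch: the background dataset determines phi_0 and the reference frame.
import shap

expl_all = shap.TreeExplainer(model, data=X_train,
                              feature_perturbation="interventional")
expl_healthy = shap.TreeExplainer(model, data=X_train[healthy_mask],
                                  feature_perturbation="interventional")

print(expl_all.expected_value, expl_healthy.expected_value)   # different phi_0
```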
When the number of features is very large (thousands or more), individual SHAP values become difficult to interpret and visualize. PartitionSHAP and Owen-value-based hierarchical approaches address this by attributing importance to groups of features defined by a partition tree, but they introduce additional modeling choices about how to construct the partition.
A 2024 study in the Risks journal on credit-card default models found that SHAP-based feature rankings became significantly less stable as class imbalance increased, with random initialization alone producing variation in rankings for moderately important features. The takeaway is that SHAP rankings should be validated with bootstrap or cross-validation in domains with severe imbalance.
The primary implementation of SHAP is the open-source Python package shap, originally authored by Scott Lundberg and now community-maintained. It is available via pip (pip install shap) and conda (conda install -c conda-forge shap). As of 2026 the package supports Python 3.11 and later. Recent releases include 0.49 and 0.50 in late 2025 and 0.51 in early 2026. The package's GitHub repository has more than 25,000 stars, and it remains one of the most active interpretability projects in open source.
SHAP support is also integrated directly into several popular machine learning libraries.
| Library | SHAP support |
|---|---|
| XGBoost | Built-in TreeSHAP via model.predict(data, pred_contribs=True) |
| LightGBM | Built-in TreeSHAP support via model.predict(data, pred_contrib=True) |
| CatBoost | Built-in SHAP value computation via get_feature_importance(type="ShapValues") |
| scikit-learn | Compatible via shap.TreeExplainer for tree models, shap.LinearExplainer for linear models, shap.KernelExplainer for arbitrary estimators |
| H2O | SHAP support for tree-based H2O models via the Python and R APIs |
| PiML | Includes SHAP as an explanation module for interpretable ML pipelines |
| FastTreeSHAP (LinkedIn) | Drop-in replacement for TreeExplainer with 1.5x to 2.5x speedups |
| Microsoft InterpretML | Bundles SHAP alongside EBM and other interpretability tools |
| Captum (PyTorch) | Implements GradientSHAP and DeepLIFT-SHAP for PyTorch models |
In R, the shapviz and fastshap packages provide SHAP value computation and visualization, and the xgboost R package includes native SHAP support. Julia has the ShapML.jl package for Monte Carlo Shapley estimation.
A few practical guidelines have emerged from years of SHAP usage in industry:

- Use TreeSHAP whenever the model is a tree ensemble; there is rarely a reason to use KernelSHAP on a tree model.
- Prefer the interventional variant of TreeSHAP if the dummy axiom matters or if the explanations will be presented as causal claims. The path-dependent variant is faster but can attribute importance to unused features that are correlated with used ones.
- Match the background dataset to the question being asked, since the background defines what "typical" means in phi_0.
- Be skeptical of multicollinear features: when two features carry the same information, SHAP can split the credit arbitrarily between them.
- Validate stability by running SHAP on bootstrapped or cross-validated samples, especially when class imbalance is severe.
- Avoid equating SHAP rank with causal importance, since SHAP attributes credit within the model and the model may rely on features that are not causally meaningful.
- For deep models, prefer GradientSHAP or DeepSHAP over KernelSHAP.
- For very large models such as LLMs and large transformers, consider mechanistic interpretability or attention-based methods rather than SHAP, because the per-instance cost is usually prohibitive.
| Year | Paper | Contribution |
|---|---|---|
| 1953 | Shapley, "A Value for n-Person Games" | Introduced Shapley values |
| 2010 | Strumbelj and Kononenko | Earlier sampling-based Shapley explanations |
| 2014 | Owen, "Sobol' Indices and Shapley Value" | Connected Shapley values to sensitivity analysis |
| 2016 | Ribeiro, Singh, and Guestrin, "Why Should I Trust You?" | Introduced LIME, which SHAP later unified |
| 2017 | Sundararajan, Taly, and Yan | Introduced Integrated Gradients |
| 2017 | Shrikumar, Greenside, and Kundaje | Introduced DeepLIFT, foundation for DeepSHAP |
| 2017 | Lundberg and Lee, "A Unified Approach to Interpreting Model Predictions" | Introduced SHAP, KernelSHAP, uniqueness theorem |
| 2018 | Lundberg, Erion, and Lee | Introduced TreeSHAP |
| 2020 | Lundberg et al., Nature Machine Intelligence | Extended TreeSHAP with interaction values and global views |
| 2020 | Janzing, Minorics, and Bloebaum | Argued interventional Shapley is causally correct |
| 2020 | Sundararajan and Najmi | Catalogued multiple Shapley operationalizations; Baseline Shapley |
| 2020 | Kumar et al. | Critique of SHAP as a feature importance measure |
| 2020 | Heskes, Sijben, Bucur, Claassen | Introduced causal Shapley values |
| 2020 | Slack et al., "Fooling LIME and SHAP" | Adversarial vulnerabilities of explanation methods |
| 2021 | Yang, "Fast TreeSHAP" | Accelerated TreeSHAP, LinkedIn open source |
| 2021 | Covert, Lundberg, and Lee | Unified framework for removal-based explanations |
| 2024 | Pelaez et al. | Used LLMs to translate SHAP outputs into natural language |