LIME (Local Interpretable Model-Agnostic Explanations) is a technique for explaining the predictions of any machine learning classifier or regressor by approximating the model locally with an interpretable surrogate model. Introduced by Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin in their 2016 paper "Why Should I Trust You? Explaining the Predictions of Any Classifier," LIME has become one of the most widely used methods in explainable AI. The method works by perturbing the input data around a specific prediction, observing how the model's output changes, and fitting a simple, human-readable model (typically a linear regression) to those perturbations. Because LIME treats the original model as a black box and only requires access to its input-output behavior, it is entirely model-agnostic and can be applied to neural networks, random forests, support vector machines, or any other predictive model.
Imagine you have a magic box that can look at a photo and tell you whether it shows a cat or a dog. You want to know why the box said "cat." So you start covering up different parts of the photo with a blank piece of paper. When you cover the ears and the box suddenly says "dog," you learn that the ears were important for the "cat" answer. That is basically what LIME does. It hides or changes small parts of the input, watches how the answer changes, and then tells you which parts mattered most.
As machine learning models have grown more complex, from deep neural networks with millions of parameters to large ensembles of decision trees, they have become increasingly opaque. A model might achieve 99% accuracy on a test set, yet a practitioner has no way to verify whether the model is making decisions for the right reasons. This lack of transparency creates several practical problems: practitioners cannot debug models whose reasoning is hidden, users cannot decide whether to trust an individual prediction, and auditors cannot check models for fairness or regulatory compliance.
Before LIME, most interpretability techniques were either model-specific (e.g., examining the weights of a logistic regression) or provided only global explanations (e.g., feature importance rankings averaged across the entire dataset). LIME filled a gap by providing local, instance-level explanations that work with any model.
The acronym LIME encodes three design goals:
| Property | Meaning |
|---|---|
| Local | Explanations describe the model's behavior in the neighborhood of a single prediction, not globally. A model may behave very differently in different regions of the input space, so local fidelity is more achievable than global fidelity. |
| Interpretable | The explanation itself must be understandable to a human. LIME uses inherently interpretable representations such as binary indicators for the presence or absence of words, superpixels in images, or discretized feature bins in tabular data. |
| Model-agnostic | LIME does not require access to the model's internals (gradients, weights, architecture). It only needs the ability to query the model with arbitrary inputs and observe the outputs. This makes it applicable to any classifier or regressor. |
The LIME algorithm can be broken down into five core steps:

1. Select the instance whose prediction is to be explained.
2. Generate perturbed samples in the neighborhood of that instance and query the black-box model for a prediction on each.
3. Weight each perturbed sample by its proximity to the original instance.
4. Fit an interpretable surrogate model (typically a weighted sparse linear model) to the perturbed samples and the black-box predictions.
5. Report the surrogate model's learned weights as the explanation.

The interpretable representation, perturbation strategy, and distance metric depend on the data type:
| Data type | Interpretable representation | Perturbation method | Distance metric |
|---|---|---|---|
| Tabular | Discretized feature bins (e.g., quartiles for continuous features, original values for categorical features) | Draw samples from a normal distribution fitted to the training data's per-feature mean and standard deviation | Euclidean distance on the scaled feature vector |
| Text | Binary vector indicating presence or absence of each word | Randomly remove words from the original text | Cosine distance (or 1 minus the fraction of words retained) |
| Image | Binary vector indicating whether each superpixel is active or inactive | Segment the image into superpixels (contiguous regions of similar color), then randomly turn superpixels off by replacing their pixels with a neutral color (e.g., gray) | Cosine distance on the binary superpixel vector |
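The five steps above can be sketched end-to-end for the text case. This is a minimal illustration, not the `lime` package API: it assumes a scalar `predict_proba` function returning the probability of the class of interest, and all helper names are invented for the example.

```python
import numpy as np
from sklearn.linear_model import Ridge

def explain_text(text, predict_proba, num_samples=1000, sigma=0.25, rng=None):
    """Minimal LIME-style sketch for text: perturb by dropping words,
    weight samples by proximity, fit a weighted linear surrogate."""
    rng = np.random.default_rng(rng)
    words = text.split()
    d = len(words)
    # Interpretable representation: binary vector, 1 = word kept, 0 = removed.
    Z = rng.integers(0, 2, size=(num_samples, d))
    Z[0] = 1  # keep the original instance as the first sample
    # Map each binary vector back to a perturbed sentence and query the model.
    perturbed = [" ".join(w for w, keep in zip(words, z) if keep) for z in Z]
    probs = np.array([predict_proba(t) for t in perturbed])
    # Proximity weighting: exponential kernel on the fraction of words dropped.
    dist = 1.0 - Z.sum(axis=1) / d
    weights = np.exp(-(dist ** 2) / sigma ** 2)
    # Fit the weighted linear surrogate on the binary representation.
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(Z, probs, sample_weight=weights)
    # Words ranked by absolute surrogate weight form the explanation.
    order = np.argsort(-np.abs(surrogate.coef_))
    return [(words[i], surrogate.coef_[i]) for i in order]
```

With a toy model that outputs 1.0 whenever the word "excellent" is present, the surrogate assigns by far the largest weight to that word, which is exactly the behavior the real algorithm exploits.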
The LIME objective function balances two goals: the surrogate model should be faithful to the black-box model in the local neighborhood, and the surrogate model should be simple enough for a human to interpret.
Formally, the explanation for an instance x is defined as:
explanation(x) = argmin_{g in G} L(f, g, pi_x) + Omega(g)
Where:

- f is the black-box model being explained,
- g is a candidate surrogate drawn from a class G of interpretable models (such as linear models),
- pi_x is a proximity measure that weights perturbed samples by their closeness to x,
- L(f, g, pi_x) measures how unfaithful g is to f in the neighborhood defined by pi_x, and
- Omega(g) penalizes the complexity of g (for a linear model, the number of non-zero weights).
For the common case of sparse linear explanations, the loss function is the locally weighted squared error:
L(f, g, pi_x) = sum_{z, z' in Z} pi_x(z) * (f(z) - g(z'))^2
Here z denotes a perturbed sample in the original feature space, z' is its interpretable representation (binary vector), and the sum runs over all perturbed samples in the generated dataset Z.
The proximity (or weighting) function uses an exponential kernel:
pi_x(z) = exp(-D(x, z)^2 / sigma^2)
where D(x, z) is a distance function (such as Euclidean or cosine distance) between the original instance and the perturbation, and sigma is a kernel width parameter. The default kernel width in the Python implementation is:
sigma = 0.75 * sqrt(number_of_features)
This kernel ensures that perturbations close to the original instance are weighted heavily, while distant perturbations have little influence on the fitted surrogate.
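The kernel and its default width can be written down directly (the function name here is illustrative):

```python
import numpy as np

def exponential_kernel(distances, num_features):
    """Exponential proximity kernel with the default width
    sigma = 0.75 * sqrt(number_of_features)."""
    sigma = 0.75 * np.sqrt(num_features)
    return np.exp(-(distances ** 2) / sigma ** 2)
```

A distance of zero maps to a weight of exactly 1, and weights decay rapidly once the distance exceeds sigma.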
To enforce interpretability, LIME restricts the surrogate model to use at most K features. The implementation uses a variant of LASSO regression with a decreasing regularization parameter. As the regularization strength decreases, features receive non-zero weights one by one until exactly K features have been selected. Alternative feature selection methods include forward selection and highest-weight selection.
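A sketch of this selection mode, using scikit-learn's LASSO path: walk the regularization path from strong to weak and stop once K features are active. The sample-weight handling (scaling rows by the square root of the weight) is the standard weighted least squares trick; this is an illustration, not the `lime` package's internal code.

```python
import numpy as np
from sklearn.linear_model import lars_path

def select_k_features(X, y, weights, k):
    """Pick K features by walking the LASSO regularization path from
    strong to weak regularization until K features have nonzero weights."""
    # Fold sample weights into the squared loss by scaling rows by sqrt(w).
    sw = np.sqrt(weights)[:, None]
    Xw = X * sw
    yw = y * sw.ravel()
    alphas, _, coefs = lars_path(Xw, yw, method="lasso")
    # coefs has shape (n_features, n_alphas); columns run strong -> weak.
    for col in range(coefs.shape[1]):
        nonzero = np.flatnonzero(coefs[:, col])
        if len(nonzero) >= k:
            return nonzero[:k]
    # Fewer than K features ever became active.
    return np.flatnonzero(coefs[:, -1])
```

On data where the target depends only on two of the features, those two are the first to enter the path and are the ones returned.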
While individual LIME explanations are useful for understanding single predictions, practitioners often want to understand the model's overall behavior. SP-LIME addresses this by selecting a small, representative set of instances whose explanations collectively cover the model's decision-making patterns.
A naive approach would be to generate LIME explanations for every instance in the dataset and show them all to a human reviewer, but this is impractical for large datasets. Random sampling might miss important patterns or include redundant explanations. SP-LIME solves this by framing instance selection as a submodular optimization problem.
SP-LIME builds an explanation matrix W of size n x d', where n is the number of instances and d' is the number of interpretable features. Each entry W_ij holds the absolute weight that feature j received in the LIME explanation for instance i:
W_ij = |w_{g_i,j}|
A global importance vector I is then computed, where each entry I_j captures how representative feature j is across all explanations:
I_j = sqrt(sum_{i=1}^{n} W_ij)
The coverage function c measures how much of the global feature importance is covered by a set V of selected instances:
c(V, W, I) = sum_{j=1}^{d'} 1_{[exists i in V : W_ij > 0]} * I_j
The SP-LIME optimization objective is:
Pick(W, I) = argmax_{V, |V| <= B} c(V, W, I)
where B is the budget (maximum number of explanations to show).
Maximizing the coverage function is NP-hard, but because c is a submodular function (adding an element to a larger set yields diminishing returns compared to adding it to a smaller set), a greedy algorithm provides a (1 - 1/e) approximation guarantee, which is approximately 63% of the optimal solution. The greedy algorithm works as follows: start with an empty set V; at each step, add the instance that yields the largest marginal increase in coverage c; stop when |V| reaches the budget B.
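The greedy pick can be sketched in a few lines, assuming the explanation matrix W has already been built from per-instance LIME explanations (the function name is illustrative):

```python
import numpy as np

def sp_lime_pick(W, budget):
    """Greedy submodular pick: choose up to `budget` instances whose
    explanations jointly cover the most global feature importance.
    W[i, j] = |weight of feature j in the explanation of instance i|."""
    importance = np.sqrt(W.sum(axis=0))          # I_j = sqrt(sum_i W_ij)
    covered = np.zeros(W.shape[1], dtype=bool)   # features covered so far
    chosen = []
    for _ in range(budget):
        # Marginal gain of each candidate: importance of newly covered features.
        gains = ((W > 0) & ~covered) @ importance
        gains[chosen] = -np.inf                  # never pick an instance twice
        best = int(np.argmax(gains))
        if gains[best] <= 0:
            break                                # nothing left to cover
        chosen.append(best)
        covered |= W[best] > 0
    return chosen
```

For example, with three instances whose explanations use features {0, 1}, {0}, and {2, 3} respectively and a budget of 2, the greedy pick selects the first and third instances: the second adds no coverage beyond what the first already provides.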
Experiments in the original paper showed that SP-LIME significantly outperformed random selection, especially when the budget B was small. Trust assessments based on SP-LIME-selected explanations were better indicators of the model's true generalization performance.
LIME's original paper demonstrated its value on the 20 Newsgroups text classification task. A random forest classifier distinguishing "Christianity" from "Atheism" posts achieved 92.4% accuracy, but LIME explanations revealed that the classifier relied heavily on email headers (such as "Posting-Host" and "NNTP-Posting-Host") rather than the actual content of the messages. Without LIME, this data leakage would have gone undetected, and the model would have failed in production where such headers are absent.
In natural language processing more broadly, LIME explanations highlight which words or phrases contributed most to a classification decision, helping analysts verify that a sentiment analysis model or spam filter is responding to meaningful textual cues.
For image classification models such as convolutional neural networks, LIME works by segmenting the image into superpixels and identifying which regions are most influential. In the original paper, LIME explained predictions from Google's Inception neural network, revealing, for example, why the model confused an acoustic guitar with an electric guitar: both instruments share a similar fretboard region, and the model focused on that area rather than the body shape.
In medical imaging, LIME has been applied to explain AI-based diagnoses of chest X-rays, retinal scans, and MRI images, helping clinicians verify that a model focuses on clinically relevant regions rather than artifacts.
For structured data (the most common format in industry), LIME explains predictions for credit scoring, fraud detection, customer churn prediction, and insurance underwriting. By showing which features (income, transaction history, account age) contributed to a specific prediction, LIME enables analysts to check for fairness issues and ensure that models comply with regulatory requirements.
LIME has been used extensively in healthcare AI, including applications in Alzheimer's disease detection from MRI scans, cancer diagnosis from histopathology images, and clinical risk prediction from electronic health records. Explanations help clinicians validate whether a model's reasoning aligns with established medical knowledge.
In financial services, LIME explains individual credit decisions, helping lenders provide customers with specific reasons for loan approvals or rejections. It is also used in anti-money laundering systems to explain why a particular transaction was flagged as suspicious.
SHAP (SHapley Additive exPlanations), introduced by Scott Lundberg and Su-In Lee in 2017, is another widely used method for model interpretability. Both LIME and SHAP provide local, model-agnostic explanations, but they differ in their theoretical foundations and practical characteristics.
| Aspect | LIME | SHAP |
|---|---|---|
| Theoretical basis | Locally weighted surrogate model optimization | Shapley values from cooperative game theory |
| Explanation scope | Local only | Both local and global |
| Consistency | Not guaranteed; different runs can produce different explanations due to random perturbation sampling | Satisfies consistency, local accuracy, and missingness axioms from Shapley value theory |
| Speed | Generally faster, especially for high-dimensional data | Can be slow (exponential in the number of features for exact computation), though optimized variants (TreeSHAP, DeepSHAP) exist |
| Feature interactions | Does not capture nonlinear feature interactions (uses a linear surrogate) | Captures interactions through Shapley interaction values |
| Additivity | Feature contributions do not necessarily sum to the prediction | Feature contributions always sum exactly to the difference between the prediction and the expected value |
| Hyperparameter sensitivity | Highly sensitive to kernel width, number of perturbation samples, and number of features K | Fewer hyperparameters; results are more deterministic |
| Model-specific optimizations | None (always model-agnostic) | TreeSHAP (for tree-based models), DeepSHAP (for deep learning), LinearSHAP (for linear models) |
In practice, SHAP is often preferred when theoretical rigor and reproducibility are priorities, while LIME is favored for its simplicity and speed, particularly in exploratory analysis or real-time explanation systems.
Despite its popularity, LIME has several well-documented limitations:
Because LIME relies on random perturbation sampling, running the algorithm multiple times on the same instance can produce different explanations. Research has shown that explanations of two very close points can vary greatly. This lack of determinism undermines trust in the explanations themselves.
The choice of kernel width (sigma) determines how "local" the explanation is, and there is no principled method for selecting this parameter. Different kernel widths can lead to completely different, even contradictory, explanations for the same instance. Christoph Molnar, author of the book Interpretable Machine Learning, calls this "the biggest problem with LIME."
The default perturbation strategy for tabular data samples each feature independently from a normal distribution fitted to the training data. This ignores correlations between features, generating synthetic data points that may be unrealistic or impossible in practice. For example, a perturbation might produce an instance where a person's height is 2 meters but their weight is 30 kilograms.
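A small sketch makes the problem concrete: sampling each feature independently from its marginal distribution destroys the correlation present in the training data, so most perturbations land in regions no real instance occupies. The toy height/weight data below is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy training data where height (cm) and weight (kg) are strongly correlated.
height = rng.normal(170, 10, size=5000)
weight = 0.9 * height - 85 + rng.normal(0, 3, size=5000)

# Default-style tabular perturbation: each feature sampled independently
# from a normal fitted to its own mean and standard deviation.
perturbed_height = rng.normal(height.mean(), height.std(), size=5000)
perturbed_weight = rng.normal(weight.mean(), weight.std(), size=5000)

orig_corr = np.corrcoef(height, weight)[0, 1]
pert_corr = np.corrcoef(perturbed_height, perturbed_weight)[0, 1]
print(round(orig_corr, 2), round(pert_corr, 2))
```

The original data shows a strong positive correlation, while the perturbed samples show essentially none, which is how implausible height/weight combinations arise.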
LIME fits a linear surrogate model, which means it cannot capture nonlinear decision boundaries in the local neighborhood. If the model's behavior around the instance of interest is highly nonlinear, the linear approximation may be misleading.
Research has shown that adversaries can design models that produce biased predictions while generating innocuous LIME explanations. This vulnerability means that LIME explanations alone cannot serve as a guarantee of model fairness or correctness.
Beyond the kernel width, LIME requires the user to specify the number of features K in the explanation, the number of perturbation samples, and the distance metric. There is limited guidance on how to set these parameters, and different settings can yield substantially different explanations.
For high-dimensional data, generating and evaluating a sufficient number of perturbations can become computationally expensive. Each perturbation requires a forward pass through the black-box model, which may itself be costly.
The limitations of LIME have motivated several follow-up methods:
| Method | Year | Key improvement over LIME |
|---|---|---|
| SHAP | 2017 | Provides theoretically grounded Shapley values with consistency guarantees |
| Anchors | 2018 | Proposed by the same authors as LIME (Ribeiro, Singh, Guestrin); replaces linear surrogates with IF-THEN rules that clearly state their coverage boundaries |
| DLIME | 2019 | Uses hierarchical clustering and K-nearest neighbors instead of random perturbation sampling, improving stability |
| BayLIME | 2021 | Incorporates Bayesian reasoning into LIME for uncertainty quantification |
| US-LIME | 2024 | Improves fidelity on tabular data through uncertainty sampling instead of random perturbation |
| BMB-LIME | 2024 | Addresses local nonlinearity and uncertainty by extending the surrogate modeling approach |
Anchors, in particular, was designed as a direct successor. Like LIME, it uses a perturbation-based strategy but produces rule-based explanations (e.g., "IF word 'excellent' is present AND word 'terrible' is absent, THEN positive sentiment"). These rules explicitly state the conditions under which they hold, making them more robust than LIME's linear approximations.
LIME is available in multiple programming languages and frameworks:
| Package | Language | Description |
|---|---|---|
| lime | Python | The original implementation by Marco Tulio Ribeiro. Supports LimeTabularExplainer, LimeTextExplainer, and LimeImageExplainer. Install with pip install lime. |
| lime | R | R port maintained by Thomas Lin Pedersen. Available on CRAN. |
| iml | R | An R package for interpretable machine learning that includes LIME alongside other methods. |
| DALEX | R / Python | A framework for model explanation and exploration that implements LIME among other XAI methods. |
| InterpretML | Python | Microsoft's interpretability toolkit that includes LIME along with SHAP, EBMs, and other methods. |
| eli5 | Python | A library for debugging and explaining ML classifiers that supports LIME-style explanations. |
A minimal example of using LIME with scikit-learn:
```python
import lime
import lime.lime_tabular
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris

# Train a model
iris = load_iris()
rf = RandomForestClassifier(n_estimators=100)
rf.fit(iris.data, iris.target)

# Create a LIME explainer
explainer = lime.lime_tabular.LimeTabularExplainer(
    training_data=iris.data,
    feature_names=iris.feature_names,
    class_names=iris.target_names,
    mode='classification'
)

# Explain a single prediction
exp = explainer.explain_instance(
    data_row=iris.data[0],
    predict_fn=rf.predict_proba,
    num_features=4
)

# View the explanation
exp.show_in_notebook()
```
The original LIME paper, "Why Should I Trust You? Explaining the Predictions of Any Classifier," was presented at the ACM Conference on Knowledge Discovery and Data Mining (KDD) in 2016 and has since accumulated over 15,000 citations on Google Scholar as of 2025, making it one of the most cited papers in the explainable AI literature. LIME helped establish the field of post-hoc model interpretability as a mainstream area of machine learning research and has influenced regulatory thinking about algorithmic transparency.
The Python lime package on GitHub (github.com/marcotcr/lime) has over 11,000 stars and remains actively maintained, with support for text, image, and tabular data explanations.