Introduction to Model Interpretability
As machine learning models become more complex and are deployed in critical applications like healthcare, finance, and criminal justice, understanding why a model makes specific predictions becomes as important as the predictions themselves. This is where Explainable AI (XAI) comes in.
SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) are two of the most widely used frameworks for explaining black-box machine learning models. They help answer questions such as:
- Why did the model predict that this patient is at high risk?
- Which features contributed most to this loan being denied?
- How reliable are the model's decisions?
- Are there biases in the model's predictions?
Why Model Interpretability Matters
- Trust and adoption: Stakeholders need to trust AI decisions before adopting them
- Debugging: Understand why models fail and improve them
- Regulatory compliance: Regulations such as the EU's GDPR call for explanations of automated decisions
- Fairness and bias detection: Identify and mitigate discriminatory patterns
- Domain knowledge validation: Verify that the model learns sensible patterns
- Feature engineering: Discover which features matter most
- Model comparison: Compare different models beyond just accuracy metrics
SHAP: SHapley Additive exPlanations
SHAP is based on Shapley values from cooperative game theory. It assigns each feature an importance value for a particular prediction, showing how much each feature contributes to pushing the prediction away from the base value (average prediction).
Key Concepts
- Shapley values: Fair allocation of contribution from game theory
- Base value: Average prediction across all training data
- SHAP value: How much a feature changes the prediction from the base value
- Model-agnostic: Works with any machine learning model
- Additive: SHAP values sum to the difference between the base value and the model's prediction
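To make the game-theory idea concrete, here is a small self-contained sketch that computes Shapley values by brute force for a hypothetical three-feature model. The feature names and coalition values below are invented for illustration; SHAP itself uses much more efficient algorithms rather than enumerating coalitions.
from itertools import combinations
from math import factorial
features = ['income', 'credit_score', 'age']  # hypothetical feature names
# Hypothetical "value" of each coalition: the model's prediction when only these
# features are known. All numbers are made up for illustration.
coalition_value = {
    frozenset(): 0.50,                                   # base value
    frozenset({'income'}): 0.60,
    frozenset({'credit_score'}): 0.70,
    frozenset({'age'}): 0.52,
    frozenset({'income', 'credit_score'}): 0.85,
    frozenset({'income', 'age'}): 0.62,
    frozenset({'credit_score', 'age'}): 0.74,
    frozenset({'income', 'credit_score', 'age'}): 0.90,  # full-model prediction
}
n = len(features)
for f in features:
    others = [x for x in features if x != f]
    phi = 0.0
    # Average the feature's marginal contribution over every coalition that excludes it
    for size in range(n):
        for subset in combinations(others, size):
            S = frozenset(subset)
            weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
            phi += weight * (coalition_value[S | {f}] - coalition_value[S])
    print(f"Shapley value for {f}: {phi:+.3f}")
# The three Shapley values sum to 0.90 - 0.50 = 0.40, the gap between the full
# prediction and the base value -- exactly the additive property listed above.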
Installation and Setup
# Install SHAP
pip install shap
# Also install required dependencies
pip install matplotlib numpy pandas scikit-learn
Basic SHAP Example
import shap
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_breast_cancer
# Load data
data = load_breast_cancer()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = data.target
# Split and train model
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
# Create SHAP explainer
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)
# Explain a single prediction
print("Base value (average prediction):", explainer.expected_value[1])
print("Prediction for first test sample:", model.predict_proba(X_test.iloc[[0]])[0])
print("\nTop 5 features for first prediction:")
feature_importance = pd.DataFrame({
'feature': X_test.columns,
'shap_value': shap_values[1][0]
}).sort_values('shap_value', key=abs, ascending=False).head()
print(feature_importance)
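As a quick sanity check on the additive property, you can confirm that the base value plus the SHAP values reproduces the model's predicted probabilities. The indexing below follows the list-per-class output used in the example above; recent SHAP releases may instead return a single array with an extra class dimension.
# Additivity check: base value + sum of SHAP values ≈ predicted probability of class 1
reconstructed = explainer.expected_value[1] + shap_values[1].sum(axis=1)
predicted = model.predict_proba(X_test)[:, 1]
print("Max reconstruction error:", np.abs(reconstructed - predicted).max())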
SHAP Visualizations
import shap
import matplotlib.pyplot as plt
# 1. Force Plot - Explain single prediction
# Shows how each feature pushes prediction from base value
shap.force_plot(
    explainer.expected_value[1],
    shap_values[1][0],
    X_test.iloc[0],
    matplotlib=True,
    show=False  # keep the figure open so it can be saved
)
plt.savefig('shap_force_plot.png', bbox_inches='tight', dpi=150)
plt.close()
# 2. Waterfall Plot - Alternative single prediction view
shap.waterfall_plot(
    shap.Explanation(
        values=shap_values[1][0],
        base_values=explainer.expected_value[1],
        data=X_test.iloc[0],
        feature_names=X_test.columns.tolist()
    )
)
# 3. Summary Plot - Feature importance across all predictions
shap.summary_plot(shap_values[1], X_test, plot_type="bar", show=False)
plt.title("Global Feature Importance")
plt.tight_layout()
plt.savefig('shap_summary_bar.png', dpi=150)
plt.close()
# 4. Beeswarm Plot - Shows feature values and impact
shap.summary_plot(shap_values[1], X_test, show=False)
plt.title("SHAP Summary Plot")
plt.savefig('shap_beeswarm.png', bbox_inches='tight', dpi=150)
plt.close()
# 5. Dependence Plot - How feature value affects prediction
shap.dependence_plot(
    "mean radius",  # Feature to analyze
    shap_values[1],
    X_test,
    interaction_index="mean texture",  # Color by interaction
    show=False
)
plt.savefig('shap_dependence.png', bbox_inches='tight', dpi=150)
plt.close()
LIME: Local Interpretable Model-agnostic Explanations
LIME explains individual predictions by fitting a simple, interpretable model (like linear regression) locally around the prediction. It perturbs the input data and observes how predictions change, then learns a simple model to approximate the complex model's behavior in that local region.
How LIME Works
- Select an instance to explain
- Generate perturbed samples around this instance
- Get predictions for these perturbed samples from the black-box model
- Fit a simple, interpretable model (like linear regression) on this local data
- Use the simple model's coefficients to explain the prediction (sketched below)
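The steps above can be sketched in a few lines with NumPy and scikit-learn. This is a simplified illustration rather than the real LIME implementation (which adds discretization, smarter sampling, kernel tuning, and feature selection), and the function name and parameters here are hypothetical:
import numpy as np
from sklearn.linear_model import Ridge
def lime_sketch(predict_proba, instance, X_train, num_samples=5000, kernel_width=0.75):
    """Simplified local surrogate explanation for a single instance (illustration only)."""
    rng = np.random.default_rng(0)
    instance = np.asarray(instance, dtype=float)
    X_train = np.asarray(X_train, dtype=float)
    scale = X_train.std(axis=0)
    scale = np.where(scale == 0, 1.0, scale)  # guard against constant features
    # Steps 1-2: generate perturbed samples around the instance
    noise = rng.normal(0.0, 1.0, size=(num_samples, instance.shape[0]))
    perturbed = instance + noise * scale
    # Step 3: query the black-box model on the perturbed samples
    preds = predict_proba(perturbed)[:, 1]  # probability of the positive class
    # Weight samples by proximity to the original instance (exponential kernel)
    distances = np.linalg.norm((perturbed - instance) / scale, axis=1)
    weights = np.exp(-(distances ** 2) / (kernel_width * np.sqrt(instance.shape[0])) ** 2)
    # Step 4: fit a weighted, interpretable surrogate model on the local data
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(perturbed, preds, sample_weight=weights)
    # Step 5: the surrogate's coefficients are the local explanation
    return surrogate.coef_
# Hypothetical usage with a trained classifier `model` and feature matrix `X`:
# local_importance = lime_sketch(model.predict_proba, X.iloc[0].values, X.values)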
Installation
# Install LIME
pip install lime
LIME for Tabular Data
import lime
import lime.lime_tabular
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_breast_cancer
# Load and prepare data
data = load_breast_cancer()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = data.target
# Train model
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X, y)
# Create LIME explainer
explainer = lime.lime_tabular.LimeTabularExplainer(
training_data=np.array(X),
feature_names=X.columns,
class_names=['malignant', 'benign'],
mode='classification'
)
# Explain a prediction
idx = 0
explanation = explainer.explain_instance(
data_row=X.iloc[idx].values,
predict_fn=model.predict_proba,
num_features=10
)
# Display explanation
print("Prediction:", model.predict_proba(X.iloc[[idx]])[0])
print("\nFeature contributions:")
print(explanation.as_list())
# Visualize explanation
explanation.show_in_notebook()
# Or save to file
explanation.save_to_file('lime_explanation.html')
LIME for Text Classification
import lime
from lime.lime_text import LimeTextExplainer
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
# Sample text data
texts = [
"This movie is fantastic! Great acting and plot.",
"Terrible film. Waste of time and money.",
"Amazing cinematography and storytelling.",
"Boring and predictable. Very disappointing."
]
labels = [1, 0, 1, 0] # 1 = positive, 0 = negative
# Create and train pipeline
pipeline = make_pipeline(
TfidfVectorizer(),
LogisticRegression()
)
pipeline.fit(texts, labels)
# Create LIME explainer for text
explainer = LimeTextExplainer(class_names=['negative', 'positive'])
# Explain a prediction
text = "This film is absolutely wonderful and entertaining!"
explanation = explainer.explain_instance(
text,
pipeline.predict_proba,
num_features=6
)
# Show which words contributed to the prediction
print("Prediction probabilities:", pipeline.predict_proba([text])[0])
print("\nWord contributions:")
for word, weight in explanation.as_list():
print(f" {word}: {weight:.4f}")
# Visualize
explanation.show_in_notebook()
explanation.save_to_file('lime_text_explanation.html')
Comparing SHAP and LIME
SHAP Advantages
- Theoretical foundation: Based on solid game theory principles
- Consistency: Explanations are deterministic for exact explainers (such as TreeExplainer) and satisfy theoretical consistency guarantees
- Global view: Easy to aggregate explanations across dataset
- Fast for tree models: TreeExplainer is very efficient
- Additive property: SHAP values sum to the prediction difference
LIME Advantages
- Intuitive: Easier to understand conceptually
- Flexible: Works well with text and image data
- Fast explanations: Quick for individual predictions
- Model-agnostic: Works with any black-box model
- Local focus: Excellent for explaining specific instances
Side-by-Side Comparison
import shap
import lime.lime_tabular
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.datasets import load_breast_cancer
import matplotlib.pyplot as plt
# Load data and train model
data = load_breast_cancer()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = data.target
model = GradientBoostingClassifier(random_state=42)
model.fit(X, y)
# Instance to explain
instance_idx = 0
instance = X.iloc[instance_idx]
# SHAP Explanation
shap_explainer = shap.TreeExplainer(model)
shap_values = shap_explainer.shap_values(X.iloc[[instance_idx]])
shap_features = pd.DataFrame({
'feature': X.columns,
'shap_value': shap_values[0]
}).sort_values('shap_value', key=abs, ascending=False).head(10)
# LIME Explanation
lime_explainer = lime.lime_tabular.LimeTabularExplainer(
training_data=np.array(X),
feature_names=X.columns.tolist(),
class_names=['malignant', 'benign'],
mode='classification'
)
lime_exp = lime_explainer.explain_instance(
instance.values,
model.predict_proba,
num_features=10
)
lime_features = pd.DataFrame(lime_exp.as_list(), columns=['feature', 'lime_value'])
# Compare
print("Model prediction:", model.predict_proba(X.iloc[[instance_idx]])[0])
print("\nTop features by SHAP:")
print(shap_features)
print("\nTop features by LIME:")
print(lime_features)
# Both methods often agree on the most important features
# but may differ in exact importance values
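A rough way to quantify that agreement is to check how many of SHAP's top features also appear in LIME's rules. The snippet below reuses shap_features and lime_features from the comparison above; because LIME returns rule strings such as "mean radius > 13.50", it matches by substring, which is only an approximation.
# Count SHAP top features that also appear somewhere in LIME's top rules
shap_top = set(shap_features['feature'])
lime_rules = lime_features['feature'].tolist()
overlap = [f for f in shap_top if any(f in rule for rule in lime_rules)]
print(f"Features in both top-10 lists: {len(overlap)} of {len(shap_top)}")
print(sorted(overlap))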
Practical Applications
1. Credit Risk Assessment
import shap
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
# Simulate credit data
np.random.seed(42)
n_samples = 1000
credit_data = pd.DataFrame({
'income': np.random.normal(50000, 20000, n_samples),
'age': np.random.randint(18, 70, n_samples),
'credit_score': np.random.randint(300, 850, n_samples),
'debt_to_income': np.random.uniform(0, 1, n_samples),
'employment_years': np.random.randint(0, 40, n_samples),
'num_credit_cards': np.random.randint(0, 10, n_samples)
})
# Create target (approved/denied)
credit_data['approved'] = (
(credit_data['credit_score'] > 650) &
(credit_data['debt_to_income'] < 0.5) &
(credit_data['income'] > 30000)
).astype(int)
# Train model
X = credit_data.drop('approved', axis=1)
y = credit_data['approved']
model = GradientBoostingClassifier(random_state=42)
model.fit(X, y)
# Explain a denied application
denied_idx = y[y == 0].index[0]
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X.iloc[[denied_idx]])
print("Application Status: DENIED")
print(f"Probability of approval: {model.predict_proba(X.iloc[[denied_idx]])[0][1]:.2%}")
print("\nFactors contributing to denial:")
explanation_df = pd.DataFrame({
'Feature': X.columns,
'Value': X.iloc[denied_idx].values,
'Impact': shap_values[0]
}).sort_values('Impact', ascending=True)
for _, row in explanation_df.head(5).iterrows():
direction = "↓ decreases" if row['Impact'] < 0 else "↑ increases"
print(f" {row['Feature']}: {row['Value']:.2f} {direction} approval chance")
2. Medical Diagnosis Explanation
import shap
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_breast_cancer
# Load medical data
data = load_breast_cancer()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = data.target
# Train model
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X, y)
# Explain diagnosis for a patient
patient_idx = 0
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X.iloc[[patient_idx]])
prediction = model.predict(X.iloc[[patient_idx]])[0]
confidence = model.predict_proba(X.iloc[[patient_idx]])[0][prediction]
print(f"Diagnosis: {'Benign' if prediction == 1 else 'Malignant'}")
print(f"Confidence: {confidence:.1%}")
print("\nKey diagnostic factors:")
# Get top contributing features
feature_impact = pd.DataFrame({
'Feature': X.columns,
'Value': X.iloc[patient_idx].values,
'SHAP': shap_values[prediction][0]
}).sort_values('SHAP', key=abs, ascending=False).head(5)
for _, row in feature_impact.iterrows():
direction = "supports" if row['SHAP'] > 0 else "contradicts"
print(f" {row['Feature']}: {row['Value']:.2f} {direction} diagnosis")
# Generate a detailed report
shap.waterfall_plot(
    shap.Explanation(
        values=shap_values[prediction][0],
        base_values=explainer.expected_value[prediction],
        data=X.iloc[patient_idx],
        feature_names=X.columns.tolist()
    ),
    show=False  # keep the figure open so it can be titled and saved
)
plt.title("Diagnostic Feature Contribution")
plt.savefig('medical_diagnosis_explanation.png', bbox_inches='tight', dpi=150)
3. Bias Detection
import shap
import pandas as pd
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
# Simulate hiring data with potential bias
np.random.seed(42)
n = 1000
hiring_data = pd.DataFrame({
'experience_years': np.random.randint(0, 20, n),
'education_level': np.random.randint(1, 5, n),
'technical_score': np.random.randint(50, 100, n),
'age': np.random.randint(22, 65, n),
'gender': np.random.choice([0, 1], n) # 0=female, 1=male
})
# Create biased target (unfairly favoring males)
hiring_data['hired'] = (
(hiring_data['technical_score'] > 70) &
(hiring_data['experience_years'] > 2) &
((hiring_data['gender'] == 1) | (np.random.random(n) > 0.3)) # Bias
).astype(int)
# Train model
X = hiring_data.drop('hired', axis=1)
y = hiring_data['hired']
model = GradientBoostingClassifier(random_state=42)
model.fit(X, y)
# Analyze feature importance
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
# Check if gender has inappropriate influence
print("Feature Importance (Mean Absolute SHAP):")
feature_importance = pd.DataFrame({
'feature': X.columns,
'importance': np.abs(shap_values).mean(axis=0)
}).sort_values('importance', ascending=False)
print(feature_importance)
# Investigate gender impact
gender_impact = np.abs(shap_values[:, X.columns.get_loc('gender')]).mean()
print(f"\nGender SHAP importance: {gender_impact:.4f}")
if gender_impact > 0.1:
print("⚠️ WARNING: Gender appears to have significant influence on hiring decisions!")
print(" This may indicate bias in the model.")
# Compare predictions for identical candidates with different genders
sample_candidate = X.iloc[[0]].copy()  # one-row DataFrame to preserve feature names
sample_candidate['gender'] = 0
pred_female = model.predict_proba(sample_candidate)[0][1]
sample_candidate['gender'] = 1
pred_male = model.predict_proba(sample_candidate)[0][1]
print(f"\nSame candidate, different gender:")
print(f" Female: {pred_female:.2%} hiring probability")
print(f" Male: {pred_male:.2%} hiring probability")
print(f" Difference: {abs(pred_male - pred_female):.2%}")
Advanced Techniques
SHAP Interaction Values
import shap
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestClassifier
# Train model (reuses X_train, y_train, X_test from the basic SHAP example)
model = RandomForestClassifier(random_state=42)
model.fit(X_train, y_train)
# Calculate interaction values
explainer = shap.TreeExplainer(model)
shap_interaction_values = explainer.shap_interaction_values(X_test)
# Visualize interaction between two features
shap.dependence_plot(
    ("mean radius", "mean texture"),
    shap_interaction_values[1],
    X_test,
    display_features=X_test,
    show=False  # keep the figure open so it can be titled and saved
)
plt.title("Feature Interaction: Mean Radius × Mean Texture")
plt.savefig('shap_interaction.png', bbox_inches='tight', dpi=150)
Model Comparison with SHAP
import shap
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
# Train multiple models (reuses X_train, y_train, X_test from the basic SHAP example)
models = {
    'Random Forest': RandomForestClassifier(random_state=42),
    'Gradient Boosting': GradientBoostingClassifier(random_state=42),
    'Logistic Regression': LogisticRegression(max_iter=5000, random_state=42)
}
for name, model in models.items():
model.fit(X_train, y_train)
# Compare feature importance across models
fig, axes = plt.subplots(1, 3, figsize=(18, 5))
for idx, (name, model) in enumerate(models.items()):
    if name == 'Logistic Regression':
        # Use KernelExplainer for non-tree models (explain a sample for speed)
        explainer = shap.KernelExplainer(
            model.predict_proba,
            shap.sample(X_train, 100)
        )
        X_plot = X_test.iloc[:100]
        shap_values = explainer.shap_values(X_plot)
    else:
        explainer = shap.TreeExplainer(model)
        X_plot = X_test
        shap_values = explainer.shap_values(X_plot)
    # Some explainers return one array per class, others a single array
    values = shap_values[1] if isinstance(shap_values, list) else shap_values
    plt.sca(axes[idx])
    shap.summary_plot(values, X_plot, plot_type="bar", show=False)
    plt.title(f"{name}\nFeature Importance")
plt.tight_layout()
plt.savefig('model_comparison.png', dpi=150)
Best Practices
- Choose the right tool: Use SHAP for global interpretability, LIME for quick local explanations
- Validate explanations: Check if explanations align with domain knowledge
- Explain to stakeholders: Tailor visualizations to your audience's technical level
- Use TreeExplainer for trees: Much faster than KernelExplainer for tree-based models
- Sample for large datasets: Use representative samples for faster computation
- Document assumptions: Be clear about what your explanations do and don't show
- Test stability: Check if explanations are consistent across similar instances (see the sketch after this list)
- Combine methods: Use multiple interpretation techniques for comprehensive understanding
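As a concrete example of the stability check above, you can run LIME several times on the same instance and compare the weights across runs. This is a sketch that assumes the explainer, model, and X from the "LIME for Tabular Data" example are still in scope; because LIME's perturbation sampling is random, the weights will vary somewhat between runs.
import numpy as np
# Explain the same instance several times and collect the rule weights
runs = []
for _ in range(5):
    exp = explainer.explain_instance(
        X.iloc[0].values,
        model.predict_proba,
        num_features=10
    )
    runs.append(dict(exp.as_list()))
# Rules that appear in every run, and how much their weights fluctuate
common = set.intersection(*(set(r) for r in runs))
print(f"Rules present in all {len(runs)} runs: {len(common)} of 10")
for rule in sorted(common):
    weights = [r[rule] for r in runs]
    print(f"  {rule}: mean={np.mean(weights):+.4f}, std={np.std(weights):.4f}")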
Common Pitfalls
- Over-interpreting local explanations: LIME explains one instance, not the whole model
- Ignoring feature correlation: Correlated features can have unreliable importance values
- Not checking stability: Some explanations can be unstable for similar inputs
- Confusing correlation with causation: Feature importance ≠ causal relationships
- Using wrong explainer: KernelExplainer is slow; use TreeExplainer for tree models
- Neglecting computational cost: SHAP can be expensive for large datasets
- Trusting explanations blindly: Explanations can be misleading; validate them
Master Explainable AI and Model Interpretation
Our Data Science program covers model interpretation in-depth, from fundamental techniques to advanced explainability methods. Learn to build trustworthy, interpretable AI systems with expert guidance and hands-on projects.
Explore Data Science Program