XAI: Understanding AI Decisions
Explainable AI (XAI) answers the critical question: "Why did the model make this decision?" With the EU AI Act taking effect (2025) and growing compliance requirements, explainability is shifting from nice-to-have to mandatory for high-risk AI systems (healthcare, finance, recruitment, justice).
Complex models (deep learning, ensembles) are black boxes: high performance, near-zero transparency. XAI provides tools to interpret these models without sacrificing accuracy.
Why XAI is critical:
- Regulation: the EU AI Act mandates explainability for high-risk systems
- Trust: users accept AI decisions better when they understand them
- Debugging: identify model biases and errors
- Fairness: detect discrimination (gender, ethnicity, age)
- Safety: verify model reasoning in critical domains (healthcare, automotive)
XAI techniques:
- SHAP (SHapley Additive exPlanations): Feature contributions
- LIME (Local Interpretable Model-agnostic Explanations): Local explanations
- Saliency Maps: Visualizing pixel importance (images)
- Attention Weights: Transformers (which tokens the model attends to)
Regulatory requirements
The EU AI Act (in effect from 2025) requires that "high-risk" AI systems (healthcare, finance, recruitment, justice, transport) provide understandable explanations of their decisions. Non-compliance carries fines of up to €35M or 7% of global turnover. 72% of EU companies are investing in XAI for compliance (IDC 2025).
Main XAI techniques
SHAP: Feature contributions
# SHAP: Explique prédictions modèle ML
import shap
import xgboost as xgb
import pandas as pd
# 1. Train model (XGBoost - black box)
X_train, y_train = load_data("credit_scoring.csv")  # load_data: placeholder for your own loader
# Features: income, age, credit_history, debt_ratio, etc.
model = xgb.XGBClassifier().fit(X_train, y_train)
# 2. SHAP Explainer
explainer = shap.TreeExplainer(model)
shap_values = explainer(X_train)
# 3. Explain single prediction
customer_id = 42
shap.plots.waterfall(shap_values[customer_id])
# Output visualization:
# Base value: 0.32 (average approval rate)
# + credit_history (good): +0.18
# + income (high): +0.12
# - debt_ratio (high): -0.08
# - age (young): -0.03
# = Final: 0.51 (APPROVED)
# 4. Global feature importance
shap.plots.beeswarm(shap_values)
# Shows: credit_history most important feature overall
# 5. Force plot (visual explanation)
shap.plots.force(shap_values[customer_id])
# Interactive HTML: Red = push toward rejection, Blue = approval
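To turn the beeswarm ranking into numbers (for an audit report, say), the standard summary is the mean absolute SHAP value per feature. A minimal sketch, assuming X_train is a pandas DataFrame:
# Global importance = mean |SHAP value| per feature
import numpy as np
importance = pd.Series(
    np.abs(shap_values.values).mean(axis=0),
    index=X_train.columns,
).sort_values(ascending=False)
print(importance)  # credit_history should rank first, matching the beeswarm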
Use case: Credit scoring
CUSTOMER APPLICATION:
Name: John Doe
Income: $75,000
Age: 28
Credit history: 3 years (good payment)
Debt ratio: 42% (high)
MODEL PREDICTION: APPROVED (51% confidence)
SHAP EXPLANATION:
Base rate: 32% approval (population average)
Positive factors (+36 points):
├── Credit history good: +18 pts ★ (biggest factor)
├── Income >$70k: +12 pts
└── Employment stable: +6 pts
Negative factors (-17 pts):
├── Debt ratio 42%: -8 pts (concerning)
├── Age 28: -3 pts (limited history)
└── No homeowner: -6 pts
NET: +19 pts → 51% APPROVED
BUSINESS VALUE:
✓ Customer understands why approved (transparency)
✓ Bank can justify decision (audit, regulation)
✓ Identify improvement areas (reduce debt → better terms)
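The decomposition above works because SHAP values are additive: the base value plus the per-feature contributions equals the model output for that row. A quick sanity check (a sketch, assuming X_train is a pandas DataFrame; note TreeExplainer explains XGBoost in log-odds space by default, so compare against the margin output, not the probability):
# Additivity check: base value + sum(contributions) == model margin output
sv = shap_values[customer_id]
reconstructed = sv.base_values + sv.values.sum()
margin = model.predict(X_train.iloc[[customer_id]], output_margin=True)[0]
print(reconstructed, margin)  # the two numbers should match (up to float error)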
LIME: Local explanations
# LIME: Explain any black-box classifier
from lime import lime_text
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import RandomForestClassifier
# 1. Train sentiment classifier (black box)
texts_train, labels_train = load_sentiment_data()  # placeholder for your own corpus loader
# Labels: 0 = negative, 1 = positive
vectorizer = TfidfVectorizer()
classifier = RandomForestClassifier(n_estimators=500)
pipeline = make_pipeline(vectorizer, classifier)
pipeline.fit(texts_train, labels_train)
# 2. LIME Explainer
explainer = lime_text.LimeTextExplainer(class_names=['negative', 'positive'])
# 3. Explain prediction
review = "This movie was terrible! Waste of time. Poor acting."
prediction = pipeline.predict_proba([review])[0]
# Output: [0.89, 0.11] → 89% negative
# Generate explanation
exp = explainer.explain_instance(
review,
pipeline.predict_proba,
num_features=10
)
# Visualization
exp.show_in_notebook()
# Output:
# Predicted: NEGATIVE (89%)
#
# Words contributing to NEGATIVE:
# - "terrible": -0.34 ★★★
# - "waste": -0.21 ★★
# - "poor": -0.18 ★★
#
# Words contributing to POSITIVE:
# - "movie": +0.04 (neutral word, slight positive)
# → Clear: "terrible", "waste", "poor" drove negative prediction
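Beyond the notebook widget, the same weights can be read programmatically (for audit logs, for instance) via exp.as_list(), which returns (word, weight) pairs for the explained class (class 1 = positive by default):
# Extract (word, weight) pairs for logging / auditing
for word, weight in exp.as_list():
    direction = "POSITIVE" if weight > 0 else "NEGATIVE"
    print(f"{word:>12s} {weight:+.2f} -> pushes {direction}")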
Attention Weights (Transformers)
# Visualize Transformer attention (BERT, GPT)
from transformers import BertTokenizer, BertModel
import torch
# 1. Load BERT
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased', output_attentions=True)
# 2. Input text
text = "The movie was great but the ending was disappointing."
inputs = tokenizer(text, return_tensors="pt")
# 3. Forward pass (get attentions)
outputs = model(**inputs)
attentions = outputs.attentions # 12 layers, 12 heads each
# 4. Visualize attention (last layer = index 11, head 3)
tokens = tokenizer.convert_ids_to_tokens(inputs['input_ids'][0])
attention_weights = attentions[11][0, 3].detach().numpy()  # layer index 11, head 3
# Heatmap:
# [CLS] the movie was great but the ending was disappointing [SEP]
# [CLS] 0.02 0.01 0.03 0.01 0.04 0.07 0.02 0.05 0.03 0.12 0.60
# the 0.01 0.45 0.18 0.02 0.03 0.01 0.22 0.03 0.02 0.02 0.01
# movie 0.02 0.12 0.51 0.08 0.11 0.02 0.04 0.05 0.02 0.02 0.01
# ...
# disappointing 0.03 0.02 0.08 0.02 0.18 0.12 0.02 0.11 0.05 0.32 0.05
# ↑ ↑ ↑
# "great" "ending" ITSELF
# INSIGHT: "disappointing" attends strongly to "great" (contrast)
# and "ending" (what was disappointing)
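A heatmap like the one sketched above can be reproduced directly with matplotlib, continuing from the attention_weights and tokens computed earlier:
# Plot the (seq_len x seq_len) attention matrix as a heatmap
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(8, 8))
ax.imshow(attention_weights, cmap="viridis")
ax.set_xticks(range(len(tokens)))
ax.set_xticklabels(tokens, rotation=90)
ax.set_yticks(range(len(tokens)))
ax.set_yticklabels(tokens)
ax.set_xlabel("Attended-to token")
ax.set_ylabel("Attending token")
plt.tight_layout()
plt.show()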
Production applications
1. AI recruitment (EU AI Act high-risk)
SCENARIO: CV Screening AI
SYSTEM: Resume parser + ML classifier
Predicts: Invite interview (Yes/No)
CANDIDATE: Jane Smith
Resume: 5 years experience, CS degree, Python expert
PREDICTION: REJECTED (38% match)
WITHOUT XAI:
Candidate: "Why was I rejected?"
Company: "Our AI determined you're not a good fit."
→ UNACCEPTABLE (EU AI Act violation)
WITH XAI (SHAP):
EXPLANATION PROVIDED:
"Your application scored 38% (threshold 60%)
Positive factors (+28 points):
├── Python expertise: +12 pts
├── CS degree: +10 pts
├── Relevant experience: +6 pts
Negative factors (-32 points):
├── No leadership experience: -18 pts ★
├── Career gap (2021-2022): -8 pts
├── No cloud certifications: -6 pts
Recommendation: Gain leadership experience (manage projects)
Consider AWS/Azure certifications"
→ TRANSPARENT, ACTIONABLE, COMPLIANT ✓
COMPANY BENEFIT:
✓ Legal compliance (EU AI Act)
✓ Reduced bias litigation risk
✓ Better candidate experience
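A report like the one above can be generated mechanically from SHAP output. A hedged sketch, assuming shap_row is the shap.Explanation for one candidate and that the model's scores are already expressed in percentage points (both assumptions, not part of any specific hiring pipeline):
# Illustrative sketch: render a SHAP explanation as a candidate-facing report
def candidate_report(shap_row, feature_names, threshold=60.0):
    # Pair each feature with its contribution, largest positive first
    contribs = sorted(zip(feature_names, shap_row.values),
                      key=lambda kv: kv[1], reverse=True)
    score = shap_row.base_values + shap_row.values.sum()
    lines = [f"Your application scored {score:.0f}% (threshold {threshold:.0f}%)",
             "Positive factors:"]
    lines += [f"  {name}: +{v:.0f} pts" for name, v in contribs if v > 0]
    lines.append("Negative factors:")
    lines += [f"  {name}: {v:.0f} pts" for name, v in contribs if v < 0]
    return "\n".join(lines)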
2. Healthcare - AI diagnostics
MEDICAL AI: Cancer detection (radiology)
INPUT: Chest X-ray
OUTPUT: 87% probability lung cancer
CLINICIAN QUESTION: "Why 87%? Which areas suspicious?"
XAI SOLUTION: Grad-CAM (Saliency map)
VISUALIZATION:
[X-ray image with heatmap overlay]
Red zones (high attention):
├── Upper right lobe: Nodule 2.3cm (primary concern)
├── Mediastinum: Lymph node enlargement (metastasis?)
└── Pleura: Subtle thickening (invasion?)
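Grad-CAM itself is only a few lines of PyTorch: take the gradient of the target class score with respect to the last convolutional feature maps, pool it into per-channel weights, and form a ReLU-weighted sum. A minimal sketch using a torchvision ResNet-18 as a stand-in for the radiology model (the actual medical model, preprocessing, and classes are assumptions here):
# Minimal Grad-CAM sketch (stand-in model: torchvision ResNet-18)
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

model = resnet18(weights="IMAGENET1K_V1").eval()
target_layer = model.layer4[-1]  # last conv block

activations, gradients = {}, {}
target_layer.register_forward_hook(lambda m, i, o: activations.update(a=o))
target_layer.register_full_backward_hook(lambda m, gi, go: gradients.update(g=go[0]))

x = torch.randn(1, 3, 224, 224)  # placeholder for a preprocessed X-ray tensor
logits = model(x)
logits[0, logits.argmax()].backward()  # gradient of the predicted class score

# Channel weights = global-average-pooled gradients; CAM = ReLU(weighted sum)
w = gradients["g"].mean(dim=(2, 3), keepdim=True)
cam = F.relu((w * activations["a"]).sum(dim=1, keepdim=True)).detach()
cam = F.interpolate(cam, size=x.shape[2:], mode="bilinear", align_corners=False)
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # normalize to [0, 1]
# cam[0, 0] is the heatmap to overlay on the input image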