[Paper] Summary of Interpretable Machine Learning – A Brief History, State-of-the-Art and Challenges

Paper: Interpretable Machine Learning – A Brief History, State-of-the-Art and Challenges by Christoph Molnar, Giuseppe Casalicchio, Bernd Bischl.

Interesting for the list of papers referenced.

The paper's classification of methods feels somewhat odd to me, and I partly disagree with it (after reading Ian Covert’s paper ‘Explaining by Removing: A Unified Framework for Model Explanation’), but that doesn’t matter much: the field is not yet unified, although terminology is starting to converge.

Interpretable Machine Learning (IML) methods:

analyze model components,
study sensitivity to input perturbations,
analyze local or global surrogate approximations of the ML model.

Remaining challenges for IML:

dealing with dependent features,
causal interpretation,
uncertainty estimation,
missing rigorous definition of interpretability.

1. Introduction

Interpretable machine learning (IML) methods can be used to discover knowledge, to debug or justify the model and its predictions, and to control and improve the model.

2. Brief History of IML

model-agnostic explanation methods
model-specific explanation methods (for deep neural networks or tree ensembles)

3. Today

permutation feature importance
Shapley values
counterfactual explanations
partial dependence plots
saliency maps
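
To make one of these concrete, here is a minimal sketch of exact Shapley values for a single prediction. It is not taken from the paper: `toy_model`, `background`, and `shapley_values` are hypothetical names, and replacing "absent" features with values from background rows is just one common choice of value function. The exact computation enumerates every coalition, so it is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, background):
    """Exact Shapley values for one instance x (hypothetical helper).

    Features outside a coalition are filled in from each background row,
    and the resulting predictions are averaged.
    """
    n = len(x)

    def value(coalition):
        # Average prediction with coalition features fixed to x's values.
        total = 0.0
        for row in background:
            mixed = [x[j] if j in coalition else row[j] for j in range(n)]
            total += predict(mixed)
        return total / len(background)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Toy model (an assumption for illustration): additive in feature 0,
# with an interaction between features 1 and 2.
def toy_model(row):
    return 2.0 * row[0] + row[1] * row[2]


background = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [2.0, 0.0, 1.0], [1.0, 2.0, 0.0]]
x = [3.0, 1.0, 2.0]

phi = shapley_values(toy_model, x, background)
baseline = sum(toy_model(row) for row in background) / len(background)
print(phi, sum(phi), toy_model(x) - baseline)
```

The attributions sum to the difference between the prediction for `x` and the average prediction over the background rows, which is the efficiency property of Shapley values.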

Open source implementations of various IML methods:

iml (R)
DALEX (R)
Alibi (Python)
InterpretML (Python)

4. IML Methods

analyzing components of interpretable models (e.g. linear regressions and trees)
analyzing components of more complex models (e.g. visualizing feature maps of a CNN)
explaining individual predictions (e.g. Shapley values and counterfactual explanations)
explaining global model behavior
- feature importance (ranks features based on how relevant they were for the prediction)
- feature effect (expresses how a change in a feature changes the predicted outcome, e.g. partial dependence plots, individual conditional expectation curves)
using surrogate models (analyzing the components of the interpretable surrogate model, e.g. LIME)
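
The feature effect idea above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the paper's code: `toy_model`, `data`, and `grid` are made up, and `ice_curves` and `partial_dependence` are hypothetical helper names. An individual conditional expectation (ICE) curve sweeps one feature over a grid for a single row; the partial dependence curve is the average of the ICE curves.

```python
def ice_curves(predict, data, feature, grid):
    """One curve per row: predictions as `feature` sweeps over `grid`."""
    curves = []
    for row in data:
        curve = []
        for v in grid:
            modified = list(row)      # copy so the original row is untouched
            modified[feature] = v     # overwrite only the feature of interest
            curve.append(predict(modified))
        curves.append(curve)
    return curves


def partial_dependence(predict, data, feature, grid):
    """Average the ICE curves point by point."""
    curves = ice_curves(predict, data, feature, grid)
    return [sum(c[k] for c in curves) / len(curves) for k in range(len(grid))]


# Toy model (an assumption for illustration) with an interaction term.
def toy_model(row):
    return row[0] ** 2 + row[0] * row[1]


data = [[0.0, -1.0], [1.0, 0.0], [2.0, 1.0]]
grid = [0.0, 1.0, 2.0]

ice = ice_curves(toy_model, data, 0, grid)
pdp = partial_dependence(toy_model, data, 0, grid)
print(ice)
print(pdp)
```

In this toy case the ICE curves differ from row to row because of the interaction, while the averaged curve hides that heterogeneity. Note also that overwriting one feature while keeping the others fixed can create unrealistic data points when features are dependent, which is one of the challenges listed in this summary.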

5. Challenges

statistical uncertainty and inference: methods such as feature importance or Shapley values provide explanations without quantifying the uncertainty of the explanation; most IML methods for feature importance are not adapted for multiple testing
causal interpretation: there is some early research combining causality with permutation feature importance and with Shapley values
feature dependence
the lack of a rigorous definition of interpretability (though various quantifiable aspects of interpretability are emerging, e.g. sparsity, interaction strength, fidelity, sensitivity to perturbations, simulatability; once again, nothing is unified yet) – the authors suggest drawing inspiration from the field of human-computer interaction
easy-to-understand explanations: for now, you need to be a specialist to understand and interpret explanations approximately correctly; they are not yet meant to be understood by an end user.
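
On the first challenge (uncertainty), one simple and partial step is to repeat the permutation when computing permutation feature importance and report the spread alongside the mean. The sketch below is an assumption-laden illustration rather than a method from the paper: `toy_model`, the synthetic data, and the helper names are hypothetical, and the standard deviation only reflects randomness from the permutations, not from refitting the model or resampling the data.

```python
import random
from statistics import mean, stdev


def mse(predict, X, y):
    """Mean squared error of `predict` on rows X with targets y."""
    return mean((predict(row) - target) ** 2 for row, target in zip(X, y))


def permutation_importance(predict, X, y, feature, n_repeats=30, seed=0):
    """Mean and standard deviation of the MSE increase after shuffling
    one feature column, over `n_repeats` independent shuffles."""
    rng = random.Random(seed)
    base = mse(predict, X, y)
    increases = []
    for _ in range(n_repeats):
        column = [row[feature] for row in X]
        rng.shuffle(column)
        X_perm = [row[:feature] + [v] + row[feature + 1:]
                  for row, v in zip(X, column)]
        increases.append(mse(predict, X_perm, y) - base)
    return mean(increases), stdev(increases)


# Toy model (an assumption for illustration): uses feature 0, ignores feature 1.
def toy_model(row):
    return 3.0 * row[0]


data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(200)]
y = [3.0 * row[0] + data_rng.gauss(0, 0.1) for row in X]

imp0, sd0 = permutation_importance(toy_model, X, y, feature=0)
imp1, sd1 = permutation_importance(toy_model, X, y, feature=1)
print(f"feature 0: {imp0:.3f} +/- {sd0:.3f}")
print(f"feature 1: {imp1:.3f} +/- {sd1:.3f}")
```

Because the toy model ignores feature 1, shuffling it leaves every prediction unchanged and its importance is zero, while feature 0 gets a clearly positive importance with a nonzero spread.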

Cite:

Molnar, Christoph, Giuseppe Casalicchio, and Bernd Bischl. “Interpretable Machine Learning–A Brief History, State-of-the-Art and Challenges.” arXiv preprint arXiv:2010.09337 (2020).

Books to read to get up to speed with the field:

Interpretable Machine Learning (by Christoph Molnar, first author of the paper)
Elements of Information Theory (Wiley Series in Telecommunications and Signal Processing)
Convex Optimization
A Course on Cooperative Game Theory
Causality