Counterfactual Explanations for Machine Learning: A Review


Sahil Verma, John P. Dickerson, Keegan Hines

CoRR abs/2010.10596
2020

Machine learning plays a role in many deployed decision systems, often in ways that are difficult or impossible for human stakeholders to understand. Explaining, in a human-understandable way, the relationship between the input and output of machine learning models is essential to the development of trustworthy machine-learning-based systems. A burgeoning body of research seeks to define the goals and methods of explainability in machine learning. In this paper, we review and categorize research on counterfactual explanations, a specific class of explanation that describes what could have happened had the input to a model been changed in a particular way. Modern approaches to counterfactual explainability in machine learning draw connections to established legal doctrine in many countries, making them appealing for fielded systems in high-impact areas such as finance and healthcare. Thus, we design a rubric of desirable properties of counterfactual explanation algorithms and comprehensively evaluate all currently proposed algorithms against that rubric. Our rubric allows easy comparison and comprehension of the advantages and disadvantages of different approaches and serves as an introduction to major research themes in this field. We also identify gaps and discuss promising research directions in the space of counterfactual explainability.


Publication

authors: Sahil Verma, John P. Dickerson, Keegan Hines
status: published
sort: other
publication date: 2020
journal: CoRR
volume: abs/2010.10596
