Andrea Agiollo, Luciano C. Siebert, Pradeep K. Murukannaiah,
Andrea Omicini
Davide Calvaresi, Amro Najjar, Andrea Omicini, Reyhan Aydoǧan, Rachele Carli, Giovanni Ciatto, Yazan Mualla, Kary Främling (eds.)
Explainable and Transparent AI and Multi-Agent Systems, chapter 6, pages 97–115
Lecture Notes in Computer Science (Lecture Notes in Artificial Intelligence) 14127
Springer
September 2023
Although popular and effective, large language models (LLMs) are characterised by a performance vs. transparency trade-off that hinders their applicability to sensitive scenarios. To address this issue, several approaches have recently been proposed by the XAI research community, mostly focusing on local post-hoc explanations. However, to the best of our knowledge, a thorough comparison among the available explainability techniques is currently missing, mainly due to the lack of a general metric to measure their benefits. We compare state-of-the-art local post-hoc explanation mechanisms, based on a measure of correlation, for models trained on moral value classification tasks. By relying on a novel framework for comparing global impact scores, our experiments show that most local post-hoc explainers are only loosely correlated, and highlight huge discrepancies in their results—their "quarrel" about explanations. Finally, we compare the impact score distributions obtained from each local post-hoc explainer with human-made dictionaries, alarmingly showing that there is no correlation between the explanation outputs and the concepts that humans consider salient.
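The comparison outlined in the abstract can be illustrated with a minimal sketch: each explainer's local (per-document) token attributions are aggregated into global impact scores, and the agreement between two explainers is measured by rank correlation over the shared vocabulary. The explainer names, the mean-absolute-score aggregation, and the choice of Spearman's correlation below are illustrative assumptions, not the exact setup used in the chapter.

    # Illustrative sketch (assumed aggregation and correlation choices, not the chapter's exact setup).
    # Each explainer is represented as a list of dicts, one per document, mapping token -> local impact score.
    from collections import defaultdict
    from scipy.stats import spearmanr

    def global_impact(per_doc_attributions):
        """Aggregate local (per-document) token attributions into a global score per token (mean absolute impact)."""
        sums, counts = defaultdict(float), defaultdict(int)
        for doc in per_doc_attributions:
            for token, score in doc.items():
                sums[token] += abs(score)
                counts[token] += 1
        return {t: sums[t] / counts[t] for t in sums}

    def explainer_correlation(attr_a, attr_b):
        """Spearman rank correlation between two explainers' global impact scores on their shared vocabulary."""
        global_a, global_b = global_impact(attr_a), global_impact(attr_b)
        shared = sorted(set(global_a) & set(global_b))
        rho, _ = spearmanr([global_a[t] for t in shared],
                           [global_b[t] for t in shared])
        return rho

    # Toy usage with made-up attributions from two hypothetical explainers over two documents.
    lime_like = [{"care": 0.8, "harm": -0.6, "the": 0.05}, {"care": 0.7, "fair": 0.4, "the": 0.02}]
    shap_like = [{"care": 0.5, "harm": -0.1, "the": 0.30}, {"care": 0.2, "fair": 0.1, "the": 0.25}]
    print(explainer_correlation(lime_like, shap_like))

A low correlation value from such a comparison would indicate that the two explainers rank the same tokens very differently, which is the kind of discrepancy the chapter refers to as the explainers' "quarrel".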
keywords
Natural Language Processing • Moral Values Classification • eXplainable Artificial Intelligence • Local Post-hoc Explanations
journal or series
Lecture Notes in Computer Science (LNCS)
funding project
EXPECTATION — Personalized Explainable Artificial Intelligence for decentralized agents with heterogeneous knowledge (01/04/2021–31/03/2024)