The Quarrel of Local Post-hoc Explainers for Moral Values Classification in Natural Language Processing

Andrea Agiollo, Luciano C. Siebert, Pradeep K. Murukannaiah, Andrea Omicini
Davide Calvaresi, Amro Najjar, Andrea Omicini, Reyhan Aydoǧan, Rachele Carli, Giovanni Ciatto, Yazan Mualla, Kary Främling (eds.)
Explainable and Transparent AI and Multi-Agent Systems, chapter 6, pp. 97–115
Lecture Notes in Computer Science (Lecture Notes in Artificial Intelligence) 14127
Springer
September 2023

Although popular and effective, large language models (LLMs) are characterised by a performance vs. transparency trade-off that hinders their applicability to sensitive scenarios. To address this issue, several approaches have recently been proposed by the XAI research community, mostly focusing on local post-hoc explanations. However, to the best of our knowledge, a thorough comparison among the available explainability techniques is currently missing, mainly due to the lack of a general metric to measure their benefits. We compare state-of-the-art local post-hoc explanation mechanisms for models trained on moral value classification tasks, relying on a measure of correlation. Using a novel framework for comparing global impact scores, our experiments show that most local post-hoc explainers are only loosely correlated, and highlight huge discrepancies in their results: their "quarrel" about explanations. Finally, we compare the impact score distributions obtained from each local post-hoc explainer with human-made dictionaries, showing, alarmingly, that there is no correlation between explanation outputs and the concepts humans consider salient.
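The comparison sketched in the abstract, aggregating per-sentence (local) impact scores into one global score per word and then correlating two explainers' global scores, can be illustrated as follows. This is a minimal sketch, not the paper's actual framework: the explainer names, the toy data, the mean-absolute-attribution aggregation, and the use of Pearson correlation are all illustrative assumptions.

```python
from math import sqrt

def pearson(xs, ys):
    # Pearson correlation between two equal-length score vectors.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def global_scores(local_explanations, vocabulary):
    # Aggregate per-sentence (token, score) pairs into one global
    # impact score per vocabulary word; here, the mean absolute
    # attribution (an assumed aggregation, for illustration only).
    totals = {w: [] for w in vocabulary}
    for explanation in local_explanations:
        for token, score in explanation:
            if token in totals:
                totals[token].append(abs(score))
    return [sum(totals[w]) / len(totals[w]) if totals[w] else 0.0
            for w in vocabulary]

# Hypothetical outputs of two local post-hoc explainers on two sentences.
vocab = ["care", "harm", "fair", "cheat"]
explainer_a = [[("care", 0.8), ("harm", -0.6)], [("fair", 0.4), ("cheat", -0.3)]]
explainer_b = [[("care", 0.1), ("harm", 0.7)], [("fair", -0.5), ("cheat", 0.2)]]

# A low value would signal the "quarrel": the explainers disagree globally.
agreement = pearson(global_scores(explainer_a, vocab),
                    global_scores(explainer_b, vocab))
```

The same recipe applies to comparing an explainer's global scores against a human-made dictionary of salient moral concepts, by treating the dictionary as a third score vector over the shared vocabulary.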

keywords: Natural Language Processing • Moral Values Classification • eXplainable Artificial Intelligence • Local Post-hoc Explanations
reference presentation
The Quarrel of Local Post-hoc Explainers for Moral Values Classification in Natural Language Processing (EXTRAAMAS 2023@AAMAS 2023, 29/05/2023) — Andrea Agiollo (Andrea Agiollo, Luciano C. Siebert, Pradeep K. Murukannaiah, Andrea Omicini)
originating event
EXTRAAMAS 2023@AAMAS 2023
journal or series
Lecture Notes in Computer Science (LNCS)
container publication
Explainable and Transparent AI and Multi-Agent Systems (edited volume, 2023) — Davide Calvaresi, Amro Najjar, Andrea Omicini, Reyhan Aydoǧan, Rachele Carli, Giovanni Ciatto, Yazan Mualla, Kary Främling
funding project
EXPECTATION — Personalized Explainable Artificial Intelligence for decentralized agents with heterogeneous knowledge (01/04/2021–31/03/2024)
serves as
reference publication for presentation
The Quarrel of Local Post-hoc Explainers for Moral Values Classification in Natural Language Processing (EXTRAAMAS 2023@AAMAS 2023, 29/05/2023) — Andrea Agiollo (Andrea Agiollo, Luciano C. Siebert, Pradeep K. Murukannaiah, Andrea Omicini)