Chang Sun
• Xiaofeng Zhang
Abstract
In explainable artificial intelligence (XAI), counterfactual explanations
(CEs) serve as an efficient post-hoc method for providing actionable insights:
by identifying the minimal changes to input features that would alter
a model's prediction, they illuminate its decision boundaries and
enhance transparency. Large language models (LLMs) have emerged as a
powerful tool for CE generation owing to their strong reasoning capabilities
and common-sense knowledge. The standard CE generation process requires an
Oracle to evaluate the CEs produced by LLMs; however, the Oracle itself can
be inaccurate and therefore introduces uncertainty. In this work, we thus aim
to design a set of metrics for evaluating CEs generated by different LLMs,
accounting for the fact that the Oracle is not reliable all the time.
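As a brief, hedged formalization of the notion described above (the notation $f$, $x$, $z$, and $d$ is assumed here for illustration and is not taken from this paper), a counterfactual explanation $x'$ for an input $x$ under a classifier $f$ can be sketched as the closest point that flips the prediction:
\[
x' \in \arg\min_{z}\, d(x, z) \quad \text{subject to} \quad f(z) \neq f(x),
\]
where $d$ is a distance function measuring the size of the feature changes.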
Outcomes