Better Metrics for Evaluating Explainable Artificial Intelligence (Blue Sky Ideas Track)
This paper presents objective metrics for quantifying explainable artificial intelligence (XAI). Through an overview of current trends, we show that many explanations are generated post hoc and independently of the agent's actual logical process, which in turn yields explanations of limited value because they lack transparency and fidelity. While user studies are an established basis for evaluating XAI, studies that do not consider objective metrics may have limited meaning and may suffer from confirmation bias, particularly if they unnecessarily use low-fidelity explanations. To avoid this issue, this paper suggests a paradigm shift in evaluating XAI that focuses on metrics that quantify the explanation itself and its appropriateness given the XAI goal. We suggest four such metrics: the performance difference, D, between the explanation's logic and the agent's actual performance; the number of rules, R, output by the explanation; the number of features, F, used to generate that explanation; and the stability, S, of the explanation. We believe that user studies that focus on these metrics in their evaluations are inherently more valid and should be integrated into future XAI research.
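To make the four metrics concrete, the following is a minimal sketch, not the authors' reference implementation. It assumes a rule-based surrogate (a shallow decision tree) stands in for the explanation, a black-box classifier stands in for the agent, and simple operationalizations of D, R, F, and S (disagreement rate, leaf count, used-feature count, and agreement across bootstrap refits); the abstract does not fix these definitions, so all of them are illustrative assumptions.

```python
# Illustrative sketch only: computing D, R, F, S for a rule-based surrogate explanation.
# The "agent" is a black-box classifier; the "explanation" is a shallow decision tree
# fit to mimic the agent's predictions. All thresholds and measures are assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
agent = RandomForestClassifier(random_state=0).fit(X, y)          # the agent
agent_preds = agent.predict(X)

explanation = DecisionTreeClassifier(max_depth=3, random_state=0)  # the explanation
explanation.fit(X, agent_preds)

# D: performance difference between the explanation's logic and the agent
# (here: fraction of inputs where the surrogate disagrees with the agent)
D = float(np.mean(agent_preds != explanation.predict(X)))

# R: number of rules output by the explanation (one rule per leaf of the tree)
R = int(explanation.get_n_leaves())

# F: number of features the explanation actually uses
F = int(np.sum(explanation.feature_importances_ > 0))

# S: stability of the explanation
# (here: mean agreement with surrogates refit on bootstrap resamples)
rng = np.random.default_rng(0)
agreements = []
for _ in range(10):
    idx = rng.choice(len(X), size=len(X), replace=True)
    alt = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X[idx], agent_preds[idx])
    agreements.append(np.mean(alt.predict(X) == explanation.predict(X)))
S = float(np.mean(agreements))

print(f"D={D:.3f}, R={R}, F={F}, S={S:.3f}")
```

Under this reading, lower D, R, and F and higher S indicate an explanation that is more faithful, more compact, and more robust; other operationalizations (e.g., perturbation-based stability) would fit the same four-metric framing.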