Data-centric Reliability Evaluation of Individual Predictions
Related papers
Trustworthy AI: From Principles to Practices
ArXiv, 2021
Fast-developing artificial intelligence (AI) technology has enabled various applied systems deployed in the real world, impacting people’s everyday lives. However, many current AI systems have been found vulnerable to imperceptible attacks, biased against underrepresented groups, lacking in user privacy protection, etc., which not only degrades user experience but also erodes society’s trust in all AI systems. In this review, we strive to provide AI practitioners with a comprehensive guide towards building trustworthy AI systems. We first introduce the theoretical framework of important aspects of AI trustworthiness, including robustness, generalization, explainability, transparency, reproducibility, fairness, privacy preservation, alignment with human values, and accountability. We then survey leading approaches in these aspects in the industry. To unify the current fragmented approaches towards trustworthy AI, we propose a systematic approach that considers the entire lifecycle of AI systems...
Technologies for Trustworthy Machine Learning: A Survey in a Socio-Technical Context
2021
Concerns about the societal impact of AI-based services and systems have encouraged governments and other organisations around the world to propose AI policy frameworks to address fairness, accountability, transparency and related topics. To achieve the objectives of these frameworks, the data and software engineers who build machine-learning systems require knowledge about a variety of relevant supporting tools and techniques. In this paper we provide an overview of technologies that support building trustworthy machine learning systems, i.e., systems whose properties justify that people place trust in them. We argue that four categories of system properties are instrumental in achieving the policy objectives, namely fairness, explainability, auditability and safety & security (FEAS). We discuss how these properties need to be considered across all stages of the machine learning life cycle, from data collection through run-time model inference. As a consequence, we survey in this pa...
Untrusted Predictions Improve Trustable Query Policies
2020
We study how to utilize (possibly machine-learned) predictions in a model for optimization under uncertainty that allows an algorithm to query unknown data. The goal is to minimize the number of queries needed to solve the problem. Considering fundamental problems such as finding the minima of intersecting sets of elements or sorting them, as well as the minimum spanning tree problem, we discuss different measures for the prediction accuracy and design algorithms with performance guarantees that improve with the accuracy of predictions and that are robust with respect to very poor prediction quality. We also provide new structural insights for the minimum spanning tree problem that might be useful in the context of explorable uncertainty regardless of predictions. Our results prove that untrusted predictions can circumvent known lower bounds in the model of explorable uncertainty. We complement our results by experiments that empirically confirm the performance of our algorithms.
As if sand were stone. New concepts and metrics to probe the ground on which to build trustable AI
BMC Medical Informatics and Decision Making
Background: We focus on the importance of interpreting the quality of the labeling used as the input of predictive models to understand the reliability of their output in support of human decision-making, especially in critical domains, such as medicine. Methods: Accordingly, we propose a framework distinguishing the reference labeling (or Gold Standard) from the set of annotations from which it is usually derived (the Diamond Standard). We define a set of quality dimensions and related metrics: representativeness (are the available data representative of its reference population?); reliability (do the raters agree with each other in their ratings?); and accuracy (are the raters’ annotations a true representation?). The metrics for these dimensions are, respectively, the degree of correspondence, Ψ, the degree of weighted concordance, ϱ, and the degree of fineness, Φ. We apply and evaluate these metrics in a diagnostic user study involving 13 radiologists. Results: We evaluate Ψ against hypothesis-t...
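The paper's own metrics (Ψ, ϱ, Φ) are defined in the full text and are not reproduced here. As a rough, self-contained illustration of the reliability dimension alone (do raters agree with each other?), the sketch below computes a plain Cohen's kappa between two annotators; it is a standard chance-corrected agreement measure, not the paper's weighted concordance ϱ, and all names are illustrative.

```python
# Illustrative only: standard Cohen's kappa between two raters, not the
# paper's degree of weighted concordance (rho), which is not reproduced here.
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two annotators on the same cases."""
    assert len(rater_a) == len(rater_b) and rater_a, "need paired, non-empty ratings"
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = set(freq_a) | set(freq_b)
    expected = sum((freq_a[l] / n) * (freq_b[l] / n) for l in labels)
    if expected == 1.0:  # degenerate case: both raters always give the same single label
        return 1.0
    return (observed - expected) / (1.0 - expected)

# Example: two radiologists labelling 6 cases as normal ("n") / abnormal ("a")
print(cohens_kappa(["n", "a", "a", "n", "a", "n"],
                   ["n", "a", "n", "n", "a", "a"]))  # ~0.33
```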
The relationship between trust in AI and trustworthy machine learning technologies
Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency
To design and develop AI-based systems that users and the larger public can justifiably trust, one needs to understand how machine learning technologies impact trust. To guide the design and implementation of trusted AI-based systems, this paper provides a systematic approach to relate considerations about trust from the social sciences to trustworthiness technologies proposed for AI-based services and products. We start from the ABI+ (Ability, Benevolence, Integrity, Predictability) framework augmented with a recently proposed mapping of ABI+ on qualities of technologies that support trust. We consider four categories of trustworthiness technologies for machine learning, namely those for Fairness, Explainability, Auditability and Safety (FEAS), and discuss if and how these support the required qualities. Moreover, trust can be impacted throughout the life cycle of AI-based systems, and we therefore introduce the concept of Chain of Trust to discuss trustworthiness technologies in all stages of the life cycle. In so doing we establish the ways in which machine learning technologies support trusted AI-based systems. Finally, FEAS has obvious relations with known frameworks and therefore we relate FEAS to a variety of international 'principled AI' policy and technology frameworks that have emerged in recent years.
TRUST-LAPSE: An Explainable & Actionable Mistrust Scoring Framework for Model Monitoring
Cornell University - arXiv, 2022
Continuous monitoring of trained ML models to determine when their predictions should and should not be trusted is essential for their safe deployment. Such a framework ought to be high-performing, explainable, post-hoc and actionable. We propose TRUST-LAPSE, a "mistrust" scoring framework for continuous model monitoring. We assess the trustworthiness of each input sample's model prediction using a sequence of latent-space embeddings. Specifically, (a) our latent-space mistrust score estimates mistrust using distance metrics (Mahalanobis distance) and similarity metrics (cosine similarity) in the latent space, and (b) our sequential mistrust score determines deviations in correlations over the sequence of past input representations in a non-parametric, sliding-window-based algorithm for actionable continuous monitoring. We evaluate TRUST-LAPSE via two downstream tasks: (1) distributionally shifted input detection and (2) data drift detection, across diverse domains (audio & vision) using public datasets, and further benchmark our approach on challenging, real-world electroencephalogram (EEG) datasets for seizure detection. Our latent-space mistrust scores achieve state-of-the-art results with AUROCs of 84.1 (vision), 73.9 (audio), 77.1 (clinical EEGs), outperforming baselines by over 10 points. We expose critical failures in popular baselines that remain insensitive to input semantic content, rendering them unfit for real-world model monitoring. We show that our sequential mistrust scores achieve high drift detection rates: over 90% of the streams show < 20% error for all domains. Through extensive qualitative and quantitative evaluations, we show that our mistrust scores are more robust and provide explainability for easy adoption into practice. Impact Statement: ML models show impressive performance across different domains. However, in the real world, they fail silently & catastrophically, limiting their utility. With mistrust scores from TRUST-LAPSE, these failures can be automatically identified (model monitoring) and escalated to a human operator to mitigate. Our latent-space mistrust score outperforms standard baselines by over 10 points on existing benchmarks. We expose critical failures in top baselines that are insensitive to semantic content. In realistic settings, where data drifts over time, our sequential mistrust scores achieve very high drift detection rates and pinpoint when the model needs to be fine-tuned or retrained. As a prime example of practical impact, we're currently assessing these methods for EEG seizure detection and radiology tasks in clinical settings. This could pave the way for effective clinical model deployment and human-AI partnership.
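As a minimal sketch of the latent-space half of this idea (Mahalanobis distance to the training embedding distribution combined with cosine similarity to class centroids), assuming embeddings are already extracted and that the two signals are combined with an arbitrary equal weighting, one could write something like the following. The paper's exact scoring rule, thresholds, and the sequential sliding-window component are not reproduced here.

```python
# A minimal sketch of a latent-space mistrust signal in the spirit of TRUST-LAPSE:
# Mahalanobis distance to the training embedding distribution plus cosine
# similarity to the nearest class centroid. The paper's exact scoring and
# combination rule is NOT reproduced; the equal weighting and the distance
# scale below are assumptions for illustration only.
import numpy as np

class LatentMistrust:
    def __init__(self, train_embeddings, train_labels):
        X = np.asarray(train_embeddings, dtype=float)
        y = np.asarray(train_labels)
        self.mean = X.mean(axis=0)
        # Regularise the covariance so the inverse exists for small reference sets.
        cov = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])
        self.cov_inv = np.linalg.inv(cov)
        self.centroids = {c: X[y == c].mean(axis=0) for c in np.unique(y)}

    def mahalanobis(self, z):
        d = z - self.mean
        return float(np.sqrt(d @ self.cov_inv @ d))

    def max_cosine(self, z):
        return max(
            float(z @ c / (np.linalg.norm(z) * np.linalg.norm(c) + 1e-12))
            for c in self.centroids.values()
        )

    def mistrust(self, z, dist_scale=10.0):
        """Higher = less trustworthy. Equal weighting of the two terms is an assumption."""
        z = np.asarray(z, dtype=float)
        return 0.5 * (self.mahalanobis(z) / dist_scale) + 0.5 * (1.0 - self.max_cosine(z))
```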
Certifai: A Toolkit for Building Trust in AI Systems
2020
As more companies and governments build and use machine learning models to automate decisions, there is an ever-growing need to monitor and evaluate these models’ behavior once they are deployed. Our team at CognitiveScale has developed a toolkit called Cortex Certifai to answer this need. Cortex Certifai is a framework that assesses aspects of robustness, fairness, and interpretability of any classification or regression model trained on tabular data, without requiring access to its internal workings. Additionally, Cortex Certifai allows users to compare models along these different axes and only requires 1) query access to the model and 2) an “evaluation” dataset. At its foundation, Cortex Certifai generates counterfactual explanations, which are synthetic data points close to input data points but differing in terms of model prediction. The tool then harnesses characteristics of these counterfactual explanations to analyze different aspects of the supplied model and delivers eval...
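The key contract described in this abstract is black-box: only query access to the model plus an evaluation dataset. The sketch below is a naive random-perturbation stand-in for that contract (find a nearby point whose prediction flips), not Certifai's own counterfactual search; function and parameter names are hypothetical, and the model is assumed to expose an sklearn-style predict on row vectors.

```python
# Hypothetical sketch of the black-box counterfactual contract: query-only access
# to `predict` plus an evaluation dataset. A naive random-perturbation search for
# illustration; this is NOT Certifai's algorithm.
import numpy as np

def find_counterfactual(predict, x, eval_data, n_trials=2000, seed=0):
    """Return the closest sampled point whose prediction differs from predict(x)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    scale = np.asarray(eval_data, dtype=float).std(axis=0)  # feature-wise noise scale
    original = predict(x.reshape(1, -1))[0]
    best, best_dist = None, np.inf
    for _ in range(n_trials):
        candidate = x + rng.normal(0.0, 0.25, size=x.shape) * scale
        if predict(candidate.reshape(1, -1))[0] != original:
            dist = np.linalg.norm((candidate - x) / (scale + 1e-12))
            if dist < best_dist:
                best, best_dist = candidate, dist
    return best, best_dist  # best is None if no counterfactual was found
```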
Never trust, always verify: a roadmap for Trustworthy AI?
2022
Artificial Intelligence (AI) is becoming the cornerstone of many systems used in our daily lives such as autonomous vehicles, healthcare systems, and unmanned aircraft systems. Machine Learning is a field of AI that enables systems to learn from data and make decisions on new data based on models to achieve a given goal. The stochastic nature of AI models makes verification and validation tasks challenging. Moreover, there are intrinsic biases in AI models such as reproducibility bias, selection bias (e.g., races, genders, color), and reporting bias (i.e., results that do not reflect reality). Increasingly, there is also particular attention to the ethical, legal, and societal impacts of AI. AI systems are difficult to audit and certify because of their black-box nature. They also appear to be vulnerable to threats; AI systems can misbehave when untrusted data are given, making them insecure and unsafe. Governments, national and international organizations have proposed several principles to overcome these challenges, but their application in practice is limited and there are different interpretations of the principles that can bias implementations. In this paper, we examine trust in the context of AI-based systems to understand what it means for an AI system to be trustworthy and identify actions that need to be undertaken to ensure that AI systems are trustworthy. To achieve this goal, we first review existing approaches proposed for ensuring the trustworthiness of AI systems, in order to identify potential conceptual gaps in understanding what trustworthy AI is. Then, we suggest a trust (resp. zero-trust) model for AI and suggest a set of properties that should be satisfied to ensure the trustworthiness of AI systems.
Trustworthy AI Inference Systems: An Industry Research View
ArXiv, 2020
In this work, we provide an industry research view for approaching the design, deployment, and operation of trustworthy Artificial Intelligence (AI) inference systems. Such systems provide customers with timely, informed, and customized inferences to aid their decision, while at the same time utilizing appropriate security protection mechanisms for AI models. Additionally, such systems should also use Privacy-Enhancing Technologies (PETs) to protect customers' data at any time. To approach the subject, we start by introducing trends in AI inference systems. We continue by elaborating on the relationship between Intellectual Property (IP) and private data protection in such systems. Regarding the protection mechanisms, we survey the security and privacy building blocks instrumental in designing, building, deploying, and operating private AI inference systems. For example, we highlight opportunities and challenges in AI systems using trusted execution environments combined with mo...
Trustworthy machine learning for health care
Proceedings of the Conference on Health, Inference, and Learning, 2021
Collecting data from many sources is an essential approach to generate large data sets required for the training of machine learning models. Trustworthy machine learning requires incentives, guarantees of data quality, and information privacy. Applying recent advancements in data valuation methods for machine learning can help to enable these. In this work, we analyze the suitability of three different data valuation methods for medical image classification tasks, specifically pleural effusion, on an extensive data set of chest X-ray scans. Our results reveal that a heuristic for calculating the Shapley valuation scheme based on a k-nearest neighbor classifier can successfully value large quantities of data instances. We also demonstrate possible applications for incentivizing data sharing, the efficient detection of mislabeled data, and summarizing data sets to exclude private information. Thereby, this work contributes to developing modern data infrastructures for trustworthy machine learning in health care.
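As a concrete, hedged illustration of the k-nearest-neighbour Shapley heuristic mentioned in the abstract, the snippet below implements the commonly cited closed-form KNN-Shapley recursion for a single test point, assuming Euclidean distances and 0/1 utility. It shows why the scheme scales to large data sets (one sort plus a linear pass per test point), but it is not the authors' exact pipeline.

```python
# A minimal sketch of the closed-form KNN-Shapley recursion for one test point,
# with Euclidean distances and 0/1 utility. Illustrates the kind of KNN-based
# valuation heuristic the abstract refers to; not the authors' pipeline.
import numpy as np

def knn_shapley(X_train, y_train, x_test, y_test, k=5):
    """Per-training-point value for a single test example under a k-NN classifier."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    n = len(y_train)
    order = np.argsort(np.linalg.norm(X_train - np.asarray(x_test, dtype=float), axis=1))
    match = (y_train[order] == y_test).astype(float)  # 1 if the sorted neighbour has the test label

    values_sorted = np.zeros(n)
    values_sorted[n - 1] = match[n - 1] / n
    for i in range(n - 2, -1, -1):  # walk from the farthest point toward the nearest
        values_sorted[i] = values_sorted[i + 1] + \
            (match[i] - match[i + 1]) / k * min(k, i + 1) / (i + 1)

    values = np.zeros(n)  # map back to the original training order
    values[order] = values_sorted
    return values
```

In this setup, training points that are close to the test point and carry the matching label receive the largest values, while mislabeled points tend toward low or negative values, which is what makes such schemes useful for the mislabeled-data detection application the abstract mentions.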