GitHub - understandable-machine-intelligence-lab/Quantus: Quantus is an eXplainable AI toolkit for responsible evaluation of neural network explanations

A toolkit to evaluate neural network explanations

PyTorch and TensorFlow


Quantus is currently under active development, so please carefully note the Quantus release version to ensure reproducibility of your work.

📑 Shortcut to paper!

If you want to contribute, improve, or extend Quantus, join our Discord!

News and Highlights! 🚀

Citation

If you find this toolkit or its companion paper *Quantus: An Explainable AI Toolkit for Responsible Evaluation of Neural Network Explanations and Beyond* interesting or useful in your research, use the following BibTeX entry to cite us:

```bibtex
@article{hedstrom2023quantus,
  author  = {Anna Hedstr{\"{o}}m and Leander Weber and Daniel Krakowczyk and Dilyara Bareeva and Franz Motzkus and Wojciech Samek and Sebastian Lapuschkin and Marina M.{-}C. H{\"{o}}hne},
  title   = {Quantus: An Explainable AI Toolkit for Responsible Evaluation of Neural Network Explanations and Beyond},
  journal = {Journal of Machine Learning Research},
  year    = {2023},
  volume  = {24},
  number  = {34},
  pages   = {1--11},
  url     = {http://jmlr.org/papers/v24/22-0142.html}
}
```

When applying the individual metrics of Quantus, please make sure to also properly cite the work of the original authors (as linked below).

Table of contents

Library overview

A simple visual comparison of eXplainable Artificial Intelligence (XAI) methods is often not sufficient to decide which explanation method works best, as shown exemplarily in Figure a) for four gradient-based methods — Saliency (Mørch et al., 1995; Baehrens et al., 2010), Integrated Gradients (Sundararajan et al., 2017), GradientShap (Lundberg and Lee, 2017), and FusionGrad (Bykov et al., 2021). Yet visual comparison is common practice for evaluating XAI methods in the absence of ground truth data. Therefore, we developed Quantus, an easy-to-use yet comprehensive toolbox for the quantitative evaluation of explanations, including 30+ different metrics.

With Quantus, we can obtain richer insights into how the methods compare, e.g., b) by holistic quantification over several evaluation criteria and c) by providing a sensitivity analysis of how a single parameter, e.g., the pixel replacement strategy of a faithfulness test, influences the ranking of the XAI methods.

Metrics

This project started with the goal of collecting existing evaluation metrics that have been introduced in the context of XAI research — to help automate the task of XAI quantification. Along the way of implementation, it became clear that XAI metrics most often belong to one of six categories, i.e., 1) faithfulness, 2) robustness, 3) localisation, 4) complexity, 5) randomisation (sensitivity) or 6) axiomatic metrics. The library contains implementations of the following evaluation metrics:

Faithfulness quantifies to what extent explanations follow the predictive behaviour of the model (asserting that more important features play a larger role in model outcomes)

Additional metrics will be included in future releases. Please open an issue if you have a metric you believe should be a part of Quantus.

Disclaimers. It is worth noting that the implementations of the metrics in this library have not been verified by the original authors. Thus, any metric implementation in this library may differ from the original authors' implementation. Further, bear in mind that evaluation metrics for XAI methods are often empirical interpretations (or translations) of qualities that some researcher(s) claimed were important for explanations to fulfil, so there may be a discrepancy between what the author claims to measure with the proposed metric and what is actually measured, e.g., using entropy as an operationalisation of explanation complexity. Please read the user guidelines for further guidance on how to best use the library.

Installation

If you already have PyTorch or TensorFlow installed on your machine, the most lightweight version of Quantus can be obtained from PyPI as follows (no additional explainability functionality or deep learning framework will be included):
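```bash
pip install quantus
```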

Alternatively, you can add the desired deep learning framework (in brackets) to install it together with Quantus. To install Quantus with PyTorch, please run:

pip install "quantus[torch]"

For TensorFlow, please run:

pip install "quantus[tensorflow]"

Package requirements

The package requirements are as follows:

```
python>=3.8.0
torch>=1.11.0
tensorflow>=2.5.0
```

Please note that the exact PyTorch and/or TensorFlow versions to be installed depend on your Python version (3.8-3.11) and platform (darwin, linux, …). See the [project.optional-dependencies] section in the pyproject.toml file.

Getting started

The following gives a short introduction to how to get started with Quantus. Note that this example is based on the PyTorch framework, but we also support TensorFlow, which would differ only in the loading of the model, data and explanations. To get started with Quantus, you need: a model (model), input data and labels (x_batch, y_batch), and either pre-computed explanations to evaluate (a_batch) or an explanation function (explain_func).

Step 1. Load data and model

Let's first load the data and model. In this example, a pre-trained LeNet, available from Quantus for the purpose of this tutorial, is loaded, but generally you might use any PyTorch (or TensorFlow) model instead. To follow this example, you need to have quantus and torch installed, e.g., via pip install 'quantus[torch]'.

```python
import quantus
from quantus.helpers.model.models import LeNet
import torch
import torchvision
from torchvision import transforms
```

Enable GPU.

```python
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
```

Load a pre-trained LeNet classification model (architecture at quantus/helpers/models).

```python
model = LeNet()
if device.type == "cpu":
    model.load_state_dict(torch.load("tests/assets/mnist", map_location=torch.device("cpu")))
else:
    model.load_state_dict(torch.load("tests/assets/mnist"))
```

Load datasets and make loaders.

```python
test_set = torchvision.datasets.MNIST(
    root="./sample_data",
    download=True,
    transform=transforms.Compose([transforms.ToTensor()]),
)
test_loader = torch.utils.data.DataLoader(test_set, batch_size=24)
```

Load a batch of inputs and outputs to use for XAI evaluation.

```python
# Keep the batch as torch tensors for now; it is converted to numpy arrays in Step 2.
x_batch, y_batch = next(iter(test_loader))
```

Step 2. Load explanations

We still need some explanations to evaluate. For this, there are two possibilities in Quantus. You can provide either:

  1. a set of pre-computed attributions (np.ndarray)
  2. any arbitrary explanation function (callable), e.g., the built-in method quantus.explain or your own customised function

We show the different options below.

Using pre-computed explanations

Quantus allows you to evaluate explanations that you have pre-computed, assuming that they match the data you provide in x_batch. Let's say you have explanations for Saliency and Integrated Gradients already pre-computed.

In that case, you can simply load these into corresponding variables a_batch_saliency and a_batch_intgrad:

```python
# load(...) stands in for however you stored the attributions, e.g., np.load.
a_batch_saliency = load("path/to/precomputed/saliency/explanations")
a_batch_intgrad = load("path/to/precomputed/intgrad/explanations")
```

Another option is to simply obtain the attributions using one of the many XAI frameworks out there, such as Captum, Zennit, tf.explain, or iNNvestigate. The following code example shows how to obtain explanations (Saliency and Integrated Gradients, to be specific) using Captum:

```python
import captum
from captum.attr import Saliency, IntegratedGradients
```

Generate Saliency and Integrated Gradients attributions for the first batch of the test set.

```python
a_batch_saliency = Saliency(model).attribute(inputs=x_batch, target=y_batch, abs=True).sum(axis=1).cpu().numpy()
a_batch_intgrad = IntegratedGradients(model).attribute(inputs=x_batch, target=y_batch, baselines=torch.zeros_like(x_batch)).sum(axis=1).cpu().numpy()
```

Save x_batch and y_batch as numpy arrays that will be used to call metric instances.

```python
x_batch, y_batch = x_batch.cpu().numpy(), y_batch.cpu().numpy()
```

Quick assert.

```python
import numpy as np

# Check that all arrays passed to the metrics are numpy arrays.
assert all(isinstance(obj, np.ndarray) for obj in [x_batch, y_batch, a_batch_saliency, a_batch_intgrad])
```

Passing an explanation function

If you don't have a pre-computed set of explanations but rather want to pass an arbitrary explanation function that you wish to evaluate with Quantus, this option exists.

For this, you can, for example, rely on the built-in quantus.explain function to get started, which includes some popular explanation methods (please run quantus.available_methods() to see which ones). Examples of how to use quantus.explain or your own customised explanation function are included in the next section.
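For example, to list the explanation methods supported by the built-in wrapper:

```python
# Print the explanation methods available via the built-in quantus.explain function.
print(quantus.available_methods())
```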


As seen in the image above, the qualitative appearance of the explanations can be fairly uninterpretable: since we lack ground truth for what the explanations should look like, it is hard to draw conclusions about the explanatory evidence. To gather quantitative evidence for the quality of the different explanation methods, we can apply Quantus.

Step 3. Evaluate with Quantus

Quantus implements XAI evaluation metrics from different categories, e.g., Faithfulness, Localisation and Robustness, which all inherit from the base quantus.Metric class. To apply a metric to your setting (e.g., Max-Sensitivity), it first needs to be instantiated:

```python
metric = quantus.MaxSensitivity(
    nr_samples=10,
    lower_bound=0.2,
    norm_numerator=quantus.fro_norm,
    norm_denominator=quantus.fro_norm,
    perturb_func=quantus.uniform_noise,
    similarity_func=quantus.difference,
    abs=True,
    normalise=True,
)
```

and then applied to your model, data, and (pre-computed) explanations:

```python
scores = metric(
    model=model,
    x_batch=x_batch,
    y_batch=y_batch,
    a_batch=a_batch_saliency,
    device=device,
    explain_func=quantus.explain,
    explain_func_kwargs={"method": "Saliency"},
)
```

Use quantus.explain

Since a re-computation of the explanations is necessary for robustness evaluation, in this example we also pass an explanation function (explain_func) to the metric call. Here, we rely on the built-in quantus.explain function to recompute the explanations. The hyperparameters are set with the explain_func_kwargs dictionary. Please find more details on how to use quantus.explain in the API documentation.
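For reference, the wrapper can also be called directly (a minimal sketch, assuming the standard explain_func signature of model, inputs and targets plus keyword arguments):

```python
# A minimal sketch: recompute Saliency attributions with the built-in wrapper,
# assuming quantus.explain(model, inputs, targets, **kwargs) as its signature.
a_batch_recomputed = quantus.explain(model, x_batch, y_batch, method="Saliency")
```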

Employ customised functions

You can alternatively use your own customised explanation function (assuming it returns an np.ndarray in a shape that matches the input x_batch). This is done as follows:

```python
def your_own_callable(model, inputs, targets, **kwargs) -> np.ndarray:
    """Logic goes here to compute the attributions and return an explanation
    (np.ndarray) in the same shape as x_batch (flatten the channel dimension
    if necessary)."""
    return explanation(model, inputs, targets)

scores = metric(
    model=model,
    x_batch=x_batch,
    y_batch=y_batch,
    device=device,
    explain_func=your_own_callable,
)
```
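As a concrete illustration (a sketch only, reusing the Captum imports and the device from the example above; the function name is hypothetical), such a callable might wrap the Saliency attribution shown earlier:

```python
def captum_saliency_callable(model, inputs, targets, **kwargs) -> np.ndarray:
    # The metric may pass the batch as np.ndarray, so convert to torch tensors first.
    inputs = torch.as_tensor(inputs, dtype=torch.float32, device=device)
    targets = torch.as_tensor(targets, device=device)
    # Sum over the channel axis so the explanation matches x_batch with flattened channels.
    return Saliency(model).attribute(inputs=inputs, target=targets, abs=True).sum(axis=1).cpu().numpy()
```

This callable can then be passed as explain_func in the metric call exactly like your_own_callable above.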

Run large-scale evaluation

Quantus also provides high-level functionality to support large-scale evaluations, e.g., multiple XAI methods, multifaceted evaluation through several metrics, or a combination thereof. To utilise quantus.evaluate(), you simply need to define two things:

  1. The metrics you would like to use for evaluation (each `__init__` parameter configuration counts as its own metric):

     ```python
     metrics = {
         "max-sensitivity-10": quantus.MaxSensitivity(nr_samples=10),
         "max-sensitivity-20": quantus.MaxSensitivity(nr_samples=20),
         "region-perturbation": quantus.RegionPerturbation(),
     }
     ```

  2. The XAI methods you would like to evaluate, e.g., a dict with pre-computed attributions:

     ```python
     xai_methods = {
         "Saliency": a_batch_saliency,
         "IntegratedGradients": a_batch_intgrad,
     }
     ```

You can then simply run a large-scale evaluation as follows (this aggregates the result by np.mean averaging):

```python
import numpy as np

results = quantus.evaluate(
    metrics=metrics,
    xai_methods=xai_methods,
    agg_func=np.mean,
    model=model,
    x_batch=x_batch,
    y_batch=y_batch,
    **{"softmax": False},
)
```
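The returned results object can then be inspected, for example as follows (a sketch, assuming the dictionary is keyed first by XAI method and then by metric name):

```python
# Assumed structure: {xai_method: {metric_name: aggregated_score}}.
for method_name, metric_scores in results.items():
    for metric_name, score in metric_scores.items():
        print(f"{method_name} | {metric_name}: {score}")
```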

Please see the Getting started tutorial to run code similar to this example. For more information on how to customise metrics and extend Quantus' functionality, please see the Getting started guide.

Tutorials

Further tutorials are available that showcase the many types of analysis that can be done using Quantus. For this purpose, please see the notebooks in the tutorials folder, which include examples such as:

... and more.

Contributing

We welcome any sort of contribution to Quantus! For a detailed contribution guide, please refer to the Contributing documentation first.

If you have any developer-related questions, please open an issue or write us at hedstroem.anna@gmail.com.