GitHub - MAIF/shapash: 🔅 Shapash: User-friendly Explainability and Interpretability to Develop Reliable and Transparent Machine Learning Models


🔍 Overview

Shapash is a Python library designed to make machine learning interpretable and comprehensible for everyone. It offers various visualizations with clear and explicit labels that are easily understood by all.

With Shapash, you can generate a Webapp that simplifies the comprehension of interactions between the model's features, and allows seamless navigation between local and global explainability. This Webapp enables Data Scientists to effortlessly understand their models and share their results with both data scientists and non-data experts.

Additionally, Shapash contributes to data science auditing by presenting valuable information about any model and data in a comprehensive report.

Shapash is suitable for Regression, Binary Classification and Multiclass problems. It is compatible with numerous models, including Catboost, Xgboost, LightGBM, Sklearn Ensemble, Linear models, and SVM. For other models, solutions to integrate Shapash are available; more details can be found here.

Shapash App Demo

🌱 Documentation and resources

🎉 What's new?

| Version | New Feature | Description | Links |
|---|---|---|---|
| 2.3.x | Additional dataset columns | In Webapp: Target and error columns added to dataset and possibility to add features outside the model for more filtering options | New demo, Article |
| 2.3.x | Identity card | In Webapp: New identity card to summarize the information of the selected sample | New demo, Article |
| 2.2.x | Picking samples | New tab in the webapp for picking samples. The graph represents the "True Values Vs Predicted Values" | Article |
| 2.2.x | Dataset Filter | New tab in the webapp to filter data. And several improvements in the webapp: subtitles, labels, screen adjustments | |
| 2.0.x | Refactoring Shapash | Refactoring attributes of compile methods and init. Refactoring implementation for new backends | |
| 1.7.x | Variabilize Colors | Giving possibility to have your own colour palette for outputs adapted to your design | |
| 1.6.x | Explainability Quality Metrics | To help increase confidence in explainability methods, you can evaluate the relevance of your explainability using 3 metrics: Stability, Consistency and Compacity | Article |
| 1.4.x | Groups of features | You can now regroup features that share common properties together. This option can be useful if your model has a lot of features. | Demo |
| 1.3.x | Shapash Report | A standalone HTML report that constitutes a basis of an audit document. | Demo |
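
The 1.6.x entry names Compacity without defining it. As a rough illustration only, the sketch below assumes compacity relates to how few features are needed to cover most of one local explanation. The function `features_needed`, the `share` threshold and the sample values are hypothetical and are not taken from Shapash's implementation.

```python
def features_needed(contributions: dict, share: float = 0.9) -> int:
    """Count the largest-magnitude contributions needed to reach `share`
    of the total absolute contribution for a single prediction."""
    magnitudes = sorted((abs(v) for v in contributions.values()), reverse=True)
    total = sum(magnitudes)
    if total == 0:
        return 0  # nothing to explain
    running = 0.0
    for count, magnitude in enumerate(magnitudes, start=1):
        running += magnitude
        if running / total >= share:
            return count
    return len(magnitudes)


# Hypothetical contributions for one row (feature name -> contribution).
row = {"surface": 40.0, "rooms": -10.0, "garage": 5.0, "year": -1.0}
print(features_needed(row, share=0.9))
print(features_needed(row, share=0.7))
```

A small count suggests the explanation can be summarized with a handful of features; a count close to the number of features suggests it cannot.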

🔥 Features

Shapash provides concise and clear local explanations, enabling users of any data background to understand a local prediction of a supervised model through a summarized and explicit explanation.

We believe that the Shapash Report will offer valuable support for auditing models and data, leading to improved AI governance. Data Scientists can now provide anyone interested in their project with a document that captures various aspects of their work as the foundation for an audit report. This document can be easily shared among teams (internal audit, DPO, risk, compliance...).

⚙️ How Shapash works

Shapash is an overlay package for libraries focused on model interpretability. It uses a Shap or Lime backend to compute contributions. Shapash builds upon the various steps required to create a machine learning model, making the results more understandable.

Shapash is suitable for Regression, Binary Classification and Multiclass problems.
It is compatible with numerous models: Catboost, Xgboost, LightGBM, Sklearn Ensemble, Linear models, SVM.

If your model is not in the list of compatible models, it is possible to provide Shapash with local contributions calculated with shap or another method. Here's an example of how to provide contributions to Shapash. An issue has been created to enhance this use case.
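
To make "local contributions" concrete, here is a minimal sketch that computes them by hand for a linear model, where each feature's contribution to one prediction is its coefficient times the feature's deviation from the dataset mean. The helper `linear_contributions`, the feature names and the numbers are assumptions made for illustration. How the result is handed to Shapash is not shown here.

```python
from statistics import mean


def linear_contributions(rows: list, coefficients: dict) -> list:
    """Return one {feature: contribution} dict per row, where
    contribution = coefficient * (value - mean of that feature)."""
    names = list(coefficients)
    means = {name: mean(row[name] for row in rows) for name in names}
    return [
        {name: coefficients[name] * (row[name] - means[name]) for name in names}
        for row in rows
    ]


# Hypothetical dataset and fitted coefficients.
rows = [
    {"surface": 50.0, "rooms": 2.0},
    {"surface": 70.0, "rooms": 3.0},
    {"surface": 90.0, "rooms": 4.0},
]
coefficients = {"surface": 2.0, "rooms": 10.0}

contributions = linear_contributions(rows, coefficients)
for row_contributions in contributions:
    print(row_contributions)
```

Each row's contributions sum to the difference between that row's prediction and the average prediction, which is the property that makes them readable as an explanation.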

Shapash can use a category-encoders object, a sklearn ColumnTransformer, or simply a features dictionary.
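
A features dictionary can be pictured as a plain mapping from technical column names to readable labels. The sketch below uses hypothetical column names and labels, and the fallback to the raw column name is a choice of this sketch rather than a statement about Shapash's behavior.

```python
# Hypothetical mapping: technical column name -> label shown to readers.
features_dict = {
    "GrLivArea": "Ground living area (sq ft)",
    "OverallQual": "Overall material and finish quality",
    "YearBuilt": "Original construction year",
}


def label_for(column: str, mapping: dict) -> str:
    """Return the readable label for a column, or the raw name if unmapped."""
    return mapping.get(column, column)


columns = ["GrLivArea", "OverallQual", "GarageCars"]
labels = [label_for(column, features_dict) for column in columns]
print(labels)
```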

🛠 Installation

Shapash is intended to work with Python versions 3.9 to 3.12. Installation can be done with pip:

pip install shapash

To generate the Shapash Report, some extra requirements are needed. You can install them with the following command:

pip install shapash[report]

If you encounter compatibility issues you may check the corresponding section in the Shapash documentation here.
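
Before installing, it can help to confirm that the interpreter falls inside the supported range stated above. The sketch below is a small standard-library check; the helper name `is_supported` is hypothetical, and the bounds are copied from this section.

```python
import sys

SUPPORTED_MIN = (3, 9)   # lowest Python version stated in this section
SUPPORTED_MAX = (3, 12)  # highest Python version stated in this section


def is_supported(version=None) -> bool:
    """Check a (major, minor, ...) version tuple against the stated range.
    Defaults to the running interpreter."""
    major_minor = tuple((version or sys.version_info)[:2])
    return SUPPORTED_MIN <= major_minor <= SUPPORTED_MAX


if not is_supported():
    print("This Python version is outside the range Shapash is intended for.")
```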

🕐 Quickstart

The 4 steps to display results:

from shapash import SmartExplainer

xpl = SmartExplainer(
    model=regressor,
    features_dict=house_dict,  # Optional parameter
    preprocessing=encoder,  # Optional: compile step can use inverse_transform method
    postprocessing=postprocess,  # Optional: see tutorial postprocessing
)

xpl.compile(
    x=xtest,
    y_pred=y_pred,  # Optional: for your own prediction (by default: model.predict)
    y_target=ytest,  # Optional: allows to display True Values vs Predicted Values
    additional_data=xadditional,  # Optional: additional dataset of features for Webapp
    additional_features_dict=features_dict_additional,  # Optional: dict additional data
)

Live Demo Shapash-Monitor

xpl.generate_report(
    output_file="path/to/output/report.html",
    project_info_file="path/to/project_info.yml",
    x_train=xtrain,
    y_train=ytrain,
    y_test=ytest,
    title_story="House prices report",
    title_description="""This document is a data science report of the kaggle house prices tutorial project.
        It was generated using the Shapash library.""",
    metrics=[{"name": "MSE", "path": "sklearn.metrics.mean_squared_error"}],
)

Report Example
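
The `metrics` argument of `generate_report` identifies each metric by a display name and a dotted import path. The sketch below shows one generic way such a path can be resolved to a callable using only the standard library. It is not taken from Shapash's source, `resolve_dotted_path` is a hypothetical helper, and `statistics.mean` stands in for a real metric so the example runs without extra packages.

```python
import importlib
from typing import Any, Callable


def resolve_dotted_path(path: str) -> Callable[..., Any]:
    """Split 'package.module.attribute' at the last dot, import the module,
    and return the named attribute."""
    module_name, _, attribute = path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.attribute', got {path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attribute)


# Same shape as a `metrics` entry, with a standard-library stand-in.
metric = {"name": "Mean", "path": "statistics.mean"}
function = resolve_dotted_path(metric["path"])
result = function([1.0, 2.0, 3.0])
print(metric["name"], result)
```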

predictor = xpl.to_smartpredictor()

See the tutorials section to learn how to use the SmartPredictor object.

📖 Tutorials

This GitHub repository offers many tutorials to help you get started with Shapash.

Overview

Using postprocessing parameter in compile method

Using different backends

🤝 Contributors

🏆 Awards