Sensitivity testing (fairness, robustness & safety) for text machine learning models — text_sensitivity documentation

text_sensitivity


text_sensitivity uses the generic architecture of text_explainability to also include tests of safety (how safe the model is in production, i.e. the types of inputs it can handle), robustness (how generalizable the model is in production, e.g. stability when adding typos, or the effect of adding random unrelated data) and fairness (whether equal individuals are treated equally by the model, e.g. subgroup fairness on sex and nationality).

© Marcel Robeer, 2021

Quick tour

Safety: test if your model is able to handle different data types.

from text_sensitivity import RandomAscii, RandomEmojis, combine_generators

Generate 10 strings with random ASCII characters

RandomAscii().generate_list(n=10)

Generate 5 strings with random ASCII characters and emojis

combine_generators(RandomAscii(), RandomEmojis()).generate_list(n=5)
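To make the idea of a safety test concrete, here is a minimal standard-library sketch of feeding random ASCII strings to a model and collecting the inputs it cannot handle. It is not how text_sensitivity implements RandomAscii; the helpers `random_ascii_strings` and `find_failing_inputs` and the stand-in `toy_predict` are hypothetical names introduced only for illustration.

```python
import random
import string


def random_ascii_strings(n, min_length=1, max_length=20, seed=0):
    """Return n strings of random printable ASCII characters (hypothetical helper)."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [
        "".join(
            rng.choice(string.printable)
            for _ in range(rng.randint(min_length, max_length))
        )
        for _ in range(n)
    ]


def find_failing_inputs(predict, inputs):
    """Return the inputs on which `predict` raises an exception."""
    failures = []
    for text in inputs:
        try:
            predict(text)
        except Exception:
            failures.append(text)
    return failures


def toy_predict(text):
    """Toy stand-in for a real model: classifies by string length."""
    return "long" if len(text) > 10 else "short"


samples = random_ascii_strings(10)
failures = find_failing_inputs(toy_predict, samples)
print(f"{len(failures)} of {len(samples)} random inputs caused an error")
```

A real model would replace `toy_predict`; any input that ends up in `failures` is a data type the model cannot safely handle.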

Robustness: test if your model performs equally for different entities …

from text_sensitivity import RandomAddress, RandomEmail

Random address of your current locale (default = 'nl')

RandomAddress(sep=', ').generate_list(n=5)

Random e-mail addresses in Spanish ('es') and Portuguese ('pt'), including the country each e-mail address is from

RandomEmail(languages=['es', 'pt']).generate_list(n=10, attributes=True)

… and if it is robust under simple perturbations.

from text_sensitivity import compare_accuracy
from text_sensitivity.perturbation import to_upper, add_typos

Is model accuracy equal when we change all sentences to uppercase?

compare_accuracy(env, model, to_upper)

Is model accuracy equal when we add typos in words?

compare_accuracy(env, model, add_typos)
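The quick tour does not define `env` and `model`, so the calls above cannot be run on their own. As a rough illustration of what an accuracy comparison under perturbation measures, the following self-contained sketch applies a perturbation to every text and compares accuracy before and after. It is an assumption about the general idea, not the library's implementation; `compare_accuracy_sketch`, `accuracy` and `toy_predict` are hypothetical names.

```python
def accuracy(predict, texts, labels):
    """Fraction of texts for which the prediction equals the label."""
    correct = sum(predict(text) == label for text, label in zip(texts, labels))
    return correct / len(texts)


def compare_accuracy_sketch(predict, texts, labels, perturbation):
    """Compare accuracy on the original texts and on their perturbed versions."""
    original = accuracy(predict, texts, labels)
    perturbed = accuracy(predict, [perturbation(t) for t in texts], labels)
    return {
        "original": original,
        "perturbed": perturbed,
        "difference": perturbed - original,
    }


def toy_predict(text):
    """Toy case-sensitive keyword classifier, standing in for a real model."""
    return "positive" if "good" in text else "negative"


texts = ["a good film", "a bad film", "good food", "terrible service"]
labels = ["positive", "negative", "positive", "negative"]

# str.upper plays the role of an uppercase perturbation
result = compare_accuracy_sketch(toy_predict, texts, labels, str.upper)
print(result)  # {'original': 1.0, 'perturbed': 0.5, 'difference': -0.5}
```

The toy classifier only matches lowercase "good", so uppercasing halves its accuracy; a robust model would show a difference close to zero.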

Fairness: see if performance is equal among subgroups.

from text_sensitivity import RandomName

Generate random Dutch ('nl') and Russian ('ru') names, both 'male' and 'female' (+ return attributes)

RandomName(languages=['nl', 'ru'], sex=['male', 'female']).generate_list(n=10, attributes=True)
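One way generated names with attributes might be used for a subgroup fairness check is sketched below: each name is inserted into a fixed sentence template, and the rate of positive predictions is compared per attribute value. The list of (name, attributes) pairs is hand-written and its structure is an assumption, not the actual return format of `generate_list`; `positive_rate_by` and `toy_predict` are hypothetical.

```python
from collections import defaultdict

# Hand-written stand-in for generated names with attributes (assumed structure)
names = [
    ("Daan", {"language": "nl", "sex": "male"}),
    ("Emma", {"language": "nl", "sex": "female"}),
    ("Ivan", {"language": "ru", "sex": "male"}),
    ("Olga", {"language": "ru", "sex": "female"}),
]


def toy_predict(text):
    """Deliberately biased toy model: only positive for the Dutch names."""
    return "positive" if "Daan" in text or "Emma" in text else "negative"


def positive_rate_by(attribute, predict, names, template="{} is a great colleague."):
    """Rate of positive predictions per value of `attribute`."""
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for name, attrs in names:
        group = attrs[attribute]
        counts[group][0] += predict(template.format(name)) == "positive"
        counts[group][1] += 1
    return {group: pos / total for group, (pos, total) in counts.items()}


by_language = positive_rate_by("language", toy_predict, names)
by_sex = positive_rate_by("sex", toy_predict, names)
print(by_language)  # {'nl': 1.0, 'ru': 0.0}
print(by_sex)       # {'male': 0.5, 'female': 0.5}
```

Here the rates are equal across sex but differ sharply across language, which is the kind of subgroup disparity a fairness test is meant to surface.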

Using text_sensitivity

Installation

Installation guide: install the package directly via pip or from the git repository.

Example Usage

An extended usage example.

text_sensitivity API reference

A reference to all classes and functions included in text_sensitivity.

Development

text_sensitivity @ GIT

The git repository includes the open-source code and the most recent development version.

Changelog

Changes for each version are recorded in the changelog.

Contributing

Contributors to the open-source project and contribution guidelines.

Citation

@misc{text_sensitivity,
  title = {Python package text_sensitivity},
  author = {Marcel Robeer},
  howpublished = {\url{https://github.com/MarcelRobeer/text_sensitivity}},
  doi = {10.5281/zenodo.14192941},
  year = {2021}
}

Credits
