Sensitivity testing (fairness, robustness & safety) for text machine learning models — text_sensitivity documentation
Uses the generic architecture of text_explainability
to also include tests of safety (how safe the model is in production, i.e. which types of inputs it can handle), robustness (how well the model generalizes in production, e.g. stability when adding typos, or the effect of adding random unrelated data) and fairness (whether equal individuals are treated equally by the model, e.g. subgroup fairness on sex and nationality).
© Marcel Robeer, 2021
Quick tour
Safety: test if your model is able to handle different data types.
from text_sensitivity import RandomAscii, RandomEmojis, combine_generators

# Generate 10 strings with random ASCII characters
RandomAscii().generate_list(n=10)

# Generate 5 strings with random ASCII characters and emojis
combine_generators(RandomAscii(), RandomEmojis()).generate_list(n=5)
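One straightforward way to use these generators for a safety test is to feed the generated strings to your model and check that prediction never raises an error. A minimal sketch, assuming an illustrative scikit-learn classifier (the tiny training set and the model itself are not part of text_sensitivity):

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

from text_sensitivity import RandomAscii, RandomEmojis, combine_generators

# Illustrative two-example training set; any trained text classifier would do here
model = make_pipeline(CountVectorizer(analyzer='char'), LogisticRegression())
model.fit(['good product', 'terrible product'], ['positive', 'negative'])

# Safety check: prediction should not raise on random ASCII/emoji strings
for text in combine_generators(RandomAscii(), RandomEmojis()).generate_list(n=25):
    try:
        model.predict([text])
    except Exception as exc:
        print(f'Model failed on input {text!r}: {exc}')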
Robustness: test if your model performs equally for different entities …
from text_sensitivity import RandomAddress, RandomEmail

# Random addresses for the current locale (default = 'nl')
RandomAddress(sep=', ').generate_list(n=5)

# Random e-mail addresses in Spanish ('es') and Portuguese ('pt'), including which country the e-mail is from
RandomEmail(languages=['es', 'pt']).generate_list(n=10, attributes=True)
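Entity robustness can then be checked by keeping the surrounding sentence fixed and only varying the entity: the predicted label should not depend on which address is filled in. A minimal sketch, reusing the tiny scikit-learn model from the safety sketch above; the template sentence is an illustrative assumption:

from text_sensitivity import RandomAddress

# Fill each generated address into the same illustrative template sentence
template = 'Please deliver the package to {address}.'
sentences = [template.format(address=address)
             for address in RandomAddress(sep=', ').generate_list(n=10)]

# If the predicted label varies, the model is sensitive to the address itself
predictions = model.predict(sentences)
if len(set(predictions)) > 1:
    print('Prediction changes with the address:', dict(zip(sentences, predictions)))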
… and if it is robust under simple perturbations.
from text_sensitivity import compare_accuracy
from text_sensitivity.perturbation import to_upper, add_typos

# Is model accuracy equal when we change all sentences to uppercase?
compare_accuracy(env, model, to_upper)

# Is model accuracy equal when we add typos in words?
compare_accuracy(env, model, add_typos)
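Here env (presumably a labelled dataset environment built with text_explainability) and model are assumed to have been created beforehand. The idea behind such a comparison can also be illustrated without the helper, by computing accuracy on the original texts and on a perturbed copy. A standalone sketch with illustrative data and model, not the package's own API:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.pipeline import make_pipeline

# Illustrative labelled data and model
texts = ['great service', 'awful support', 'really helpful', 'very disappointing']
labels = ['positive', 'negative', 'positive', 'negative']
model = make_pipeline(CountVectorizer(), LogisticRegression()).fit(texts, labels)

# Compare accuracy on the original texts and on an uppercased copy
original_accuracy = accuracy_score(labels, model.predict(texts))
uppercase_accuracy = accuracy_score(labels, model.predict([t.upper() for t in texts]))
print(f'original: {original_accuracy:.2f}, uppercased: {uppercase_accuracy:.2f}')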
Fairness: see if performance is equal among subgroups.
from text_sensitivity import RandomName

# Generate random Dutch ('nl') and Russian ('ru') names, both 'male' and 'female' (and return their attributes)
RandomName(languages=['nl', 'ru'], sex=['male', 'female']).generate_list(n=10, attributes=True)
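A simple subgroup comparison can then be made by filling the generated names into a fixed sentence and comparing, per subgroup, how often the model predicts a given label. A minimal sketch, again reusing the tiny model from the safety sketch; the template sentence, the 'positive' label and the single-element sex lists are illustrative assumptions:

from text_sensitivity import RandomName

def positive_rate(names):
    # Fill names into a fixed illustrative sentence and count 'positive' predictions
    sentences = [f'{name} filed a complaint.' for name in names]
    predictions = model.predict(sentences)
    return sum(p == 'positive' for p in predictions) / len(predictions)

# Compare prediction rates between subgroups generated per sex
male_rate = positive_rate(RandomName(languages=['nl'], sex=['male']).generate_list(n=50))
female_rate = positive_rate(RandomName(languages=['nl'], sex=['female']).generate_list(n=50))
print(f'positive rate for male names: {male_rate:.2f}, for female names: {female_rate:.2f}')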
Using text_sensitivity
Installation guide, for installing the package directly via pip or from the git repository.
An extended usage example.
text_sensitivity API reference
A reference to all classes and functions included in the text_sensitivity package.
Development
The git repository contains the open-source code and the most recent development version.
Changes for each version are recorded in the changelog.
Contributors to the open-source project and contribution guidelines.
Citation
@misc{text_sensitivity,
  title = {Python package text_sensitivity},
  author = {Marcel Robeer},
  howpublished = {\url{https://github.com/MarcelRobeer/text_sensitivity}},
  doi = {10.5281/zenodo.14192941},
  year = {2021}
}
Credits
- Edward Ma. NLP Augmentation. 2019.
- Daniele Faraglia and other contributors. Faker. 2012.
- Marco Tulio Ribeiro, Tongshuang Wu, Carlos Guestrin and Sameer Singh. Beyond Accuracy: Behavioral Testing of NLP models with CheckList. Association for Computational Linguistics (ACL). 2020.