Sensitivity testing (fairness, robustness & safety) for text machine learning models — text_sensitivity documentation
Uses the generic architecture of text_explainability
to also include tests of safety (how safe the model is in production, i.e. which types of inputs it can handle), robustness (how well the model generalizes in production, e.g. stability when adding typos, or the effect of adding random unrelated data) and fairness (whether equal individuals are treated equally by the model, e.g. subgroup fairness on sex and nationality).
© Marcel Robeer, 2021
Quick tour
Safety: test if your model is able to handle different data types.
from text_sensitivity import RandomAscii, RandomEmojis, combine_generators

# Generate 10 strings with random ASCII characters
RandomAscii().generate_list(n=10)

# Generate 5 strings with random ASCII characters and emojis
combine_generators(RandomAscii(), RandomEmojis()).generate_list(n=5)
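One straightforward way to use these generators for a safety test is to feed the generated strings to your model and check that prediction never raises an error. A minimal sketch, assuming an illustrative scikit-learn classifier (the tiny training set and the model itself are not part of text_sensitivity):

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

from text_sensitivity import RandomAscii, RandomEmojis, combine_generators

# Illustrative two-example training set; any trained text classifier would do here
model = make_pipeline(CountVectorizer(analyzer='char'), LogisticRegression())
model.fit(['good product', 'terrible product'], ['positive', 'negative'])

# Safety check: prediction should not raise on random ASCII/emoji strings
for text in combine_generators(RandomAscii(), RandomEmojis()).generate_list(n=25):
    try:
        model.predict([text])
    except Exception as exc:
        print(f'Model failed on input {text!r}: {exc}')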
Robustness: test if your model performs equally for different entities …
from text_sensitivity import RandomAddress, RandomEmail

# Random addresses for the current locale (default = 'nl')
RandomAddress(sep=', ').generate_list(n=5)

# Random e-mail addresses in Spanish ('es') and Portuguese ('pt'), including which country the e-mail is from
RandomEmail(languages=['es', 'pt']).generate_list(n=10, attributes=True)
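Entity robustness can then be checked by keeping the surrounding sentence fixed and only varying the entity: the predicted label should not depend on which address is filled in. A minimal sketch, reusing the tiny scikit-learn model from the safety sketch above; the template sentence is an illustrative assumption:

from text_sensitivity import RandomAddress

# Fill each generated address into the same illustrative template sentence
template = 'Please deliver the package to {address}.'
sentences = [template.format(address=address)
             for address in RandomAddress(sep=', ').generate_list(n=10)]

# If the predicted label varies, the model is sensitive to the address itself
predictions = model.predict(sentences)
if len(set(predictions)) > 1:
    print('Prediction changes with the address:', dict(zip(sentences, predictions)))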
… and if it is robust under simple perturbations.
from text_sensitivity import compare_accuracy
from text_sensitivity.perturbation import to_upper, add_typos

# Is model accuracy equal when we change all sentences to uppercase?
compare_accuracy(env, model, to_upper)

# Is model accuracy equal when we add typos in words?
compare_accuracy(env, model, add_typos)
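Here env (presumably a labelled dataset environment built with text_explainability) and model are assumed to have been created beforehand. The idea behind such a comparison can also be illustrated without the helper, by computing accuracy on the original texts and on a perturbed copy. A standalone sketch with illustrative data and model, not the package's own API:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.pipeline import make_pipeline

# Illustrative labelled data and model
texts = ['great service', 'awful support', 'really helpful', 'very disappointing']
labels = ['positive', 'negative', 'positive', 'negative']
model = make_pipeline(CountVectorizer(), LogisticRegression()).fit(texts, labels)

# Compare accuracy on the original texts and on an uppercased copy
original_accuracy = accuracy_score(labels, model.predict(texts))
uppercase_accuracy = accuracy_score(labels, model.predict([t.upper() for t in texts]))
print(f'original: {original_accuracy:.2f}, uppercased: {uppercase_accuracy:.2f}')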
Fairness: see if performance is equal among subgroups.
from text_sensitivity import RandomName

# Generate random Dutch ('nl') and Russian ('ru') names, both 'male' and 'female' (and return their attributes)
RandomName(languages=['nl', 'ru'], sex=['male', 'female']).generate_list(n=10, attributes=True)
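A simple subgroup comparison can then be made by filling the generated names into a fixed sentence and comparing, per subgroup, how often the model predicts a given label. A minimal sketch, again reusing the tiny model from the safety sketch; the template sentence, the 'positive' label and the single-element sex lists are illustrative assumptions:

from text_sensitivity import RandomName

def positive_rate(names):
    # Fill names into a fixed illustrative sentence and count 'positive' predictions
    sentences = [f'{name} filed a complaint.' for name in names]
    predictions = model.predict(sentences)
    return sum(p == 'positive' for p in predictions) / len(predictions)

# Compare prediction rates between subgroups generated per sex
male_rate = positive_rate(RandomName(languages=['nl'], sex=['male']).generate_list(n=50))
female_rate = positive_rate(RandomName(languages=['nl'], sex=['female']).generate_list(n=50))
print(f'positive rate for male names: {male_rate:.2f}, for female names: {female_rate:.2f}')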
Using text_sensitivity
Installation guide, for installing the package directly via pip or from the git repository.
An extended usage example.
text_sensitivity API reference
A reference to all classes and functions included in the text_sensitivity package.
Development
The git repository contains the open-source code and the most recent development version.
Changes for each version are recorded in the changelog.
Contributors to the open-source project and contribution guidelines.
Citation
@misc{text_sensitivity,
  title = {Python package text_sensitivity},
  author = {Marcel Robeer},
  howpublished = {\url{https://github.com/MarcelRobeer/text_sensitivity}},
  doi = {10.5281/zenodo.14192941},
  year = {2021}
}
Credits
- Edward Ma. NLP Augmentation. 2019.
- Daniele Faraglia and other contributors. Faker. 2012.
- Marco Tulio Ribeiro, Tongshuang Wu, Carlos Guestrin and Sameer Singh. Beyond Accuracy: Behavioral Testing of NLP models with CheckList. Association for Computational Linguistics (ACL). 2020.