Residual Unfairness in Fair Machine Learning from Prejudiced Data (original) (raw)

There is no trade-off: enforcing fairness can improve accuracy

2020

One of the main barriers to the broader adoption of algorithmic fairness in machine learning is the trade-off between fairness and performance of ML models: many practitioners are unwilling to sacrifice the performance of their ML model for fairness. In this paper, we show that this trade-off may not be necessary. If the algorithmic biases in an ML model are due to sampling biases in the training data, then enforcing algorithmic fairness may improve the performance of the ML model on unbiased test data. We study conditions under which enforcing algorithmic fairness helps practitioners learn the Bayes decision rule for (unbiased) test data from biased training data. We also demonstrate the practical implications of our theoretical results in real-world ML tasks.

Residual Unfairness Under Disparate Benefit of the Doubt : Bias In , Bias Out

2018

Recent work in fairness in machine learning has proposed adjusting for fairness by equalizing accuracy metrics across groups and has also studied how datasets affected by historical prejudices may lead to unfair decision policies. We connect these lines of work and study the residual unfairness that arises when a fairness-adjusted predictor is not actually fair on the target population due to systematic censoring of training data by existing biased policies. This scenario is particularly common in the same applications where fairness is a concern. We characterize theoretically the impact of such censoring on standard fairness metrics for binary classifiers and provide criteria for when residual unfairness may or may not appear. We prove that, under certain conditions, fairnessadjusted classifiers will in fact induce residual unfairness that perpetuates the same injustices, against the same groups, that biased the data to begin with, thus showing that even state-of-theart fair machin...

A comparative study of fairness-enhancing interventions in machine learning

Proceedings of the Conference on Fairness, Accountability, and Transparency - FAT* '19, 2019

Computers are increasingly used to make decisions that have significant impact in people's lives. Often, these predictions can affect different population subgroups disproportionately. As a result, the issue of fairness has received much recent interest, and a number of fairness-enhanced classifiers and predictors have appeared in the literature. This paper seeks to study the following questions: how do these different techniques fundamentally compare to one another, and what accounts for the differences? Specifically, we seek to bring attention to many under-appreciated aspects of such fairness-enhancing interventions. Concretely, we present the results of an open benchmark we have developed that lets us compare a number of different algorithms under a variety of fairness measures, and a large number of existing datasets. We find that although different algorithms tend to prefer specific formulations of fairness preservations, many of these measures strongly correlate with one another. In addition, we find that fairness-preserving algorithms tend to be sensitive to fluctuations in dataset composition (simulated in our benchmark by varying training-test splits), indicating that fairness interventions might be more brittle than previously thought.

Algorithmic Fairness and Bias in Machine Learning Systems

E3S web of conferences, 2023

In recent years, research into and concern over algorithmic fairness and bias in machine learning systems has grown significantly. It is vital to make sure that these systems are fair, impartial, and do not support discrimination or social injustices since machine learning algorithms are becoming more and more prevalent in decision-making processes across a variety of disciplines. This abstract gives a general explanation of the idea of algorithmic fairness, the difficulties posed by bias in machine learning systems, and different solutions to these problems. Algorithmic bias and fairness in machine learning systems are crucial issues in this regard that demand the attention of academics, practitioners, and policymakers. Building fair and unbiased machine learning systems that uphold equality and prevent discrimination requires addressing biases in training data, creating fairnessaware algorithms, encouraging transparency and interpretability, and encouraging diversity and inclusivity.

FAIRNESS IN MACHINE LEARNING: STATUS, SOFTWARE, AND SOLUTIONS

Although fairness in machine learning is a fairly nascent field, its relevance to daily life is immensely understated. Seeing as the future of everything from criminal justice to credit scoring to shopping lists will be dictated by the use of sophisticated machine learning predictors, it is imperative to understand the potential harms that fundamentally exist within these models. This paper begins with a broad overview of fairness from algorithmic as well as legal frameworks. Then, a taxonomy of the relevant software packages with respect to fairness and interpretability is provided in order to help facilitate usage. To conclude, a look at both current and future solutions to issues of fairness-aware systems is presented.

Evaluating Fairness Metrics in the Presence of Dataset Bias

arXiv (Cornell University), 2018

Data-driven algorithms play a large role in decision making across a variety of industries. Increasingly, these algorithms are being used to make decisions that have significant ramifications for people's social and economic well-being, e.g. in sentencing, loan approval, and policing. Amid the proliferation of such systems there is a growing concern about their potential discriminatory impact. In particular, machine learning systems which are trained on biased data have the potential to learn and perpetuate those biases. A central challenge for practitioners is thus to determine whether their models display discriminatory bias. Here we present a case study in which we frame the issue of bias detection as a causal inference problem with observational data. We enumerate two main causes of bias, sampling bias and label bias, and we investigate the abilities of six different fairness metrics to detect each bias type. Based on these investigations, we propose a set of best practice guidelines to select the fairness metric that is most likely to detect bias if it is present. Additionally, we aim to identify the conditions in which certain fairness metrics may fail to detect bias and instead give practitioners a false belief that their biased model is making fair decisions.

On the Applicability of Machine Learning Fairness Notions

2021

Machine Learning (ML) based predictive systems are increasingly used to support decisions with a critical impact on individuals' lives such as college admission, job hiring, child custody, criminal risk assessment, etc. As a result, fairness emerged as an important requirement to guarantee that ML predictive systems do not discriminate against specific individuals or entire sub-populations, in particular, minorities. Given the inherent subjectivity of viewing the concept of fairness, several notions of fairness have been introduced in the literature. This paper is a survey of fairness notions that, unlike other surveys in the literature, addresses the question of "which notion of fairness is most suited to a given real-world scenario and why?". Our attempt to answer this question consists in (1) identifying the set of fairness-related characteristics of the real-world scenario at hand, (2) analyzing the behavior of each fairness notion, and then (3) fitting these two e...

Non-empirical problems in fair machine learning

Ethics and Information Technology

The problem of fair machine learning has drawn much attention over the last few years and the bulk of offered solutions are, in principle, empirical. However, algorithmic fairness also raises important conceptual issues that would fail to be addressed if one relies entirely on empirical considerations. Herein, I will argue that the current debate has developed an empirical framework that has brought important contributions to the development of algorithmic decision-making, such as new techniques to discover and prevent discrimination, additional assessment criteria, and analyses of the interaction between fairness and predictive accuracy. However, the same framework has also suggested higher-order issues regarding the translation of fairness into metrics and quantifiable trade-offs. Although the (empirical) tools which have been developed so far are essential to address discrimination encoded in data and algorithms, their integration into society elicits key (conceptual) questions s...

Fairness Constraints: A Mechanism for Fair Classification

Automated data-driven decision systems are ubiquitous across a wide variety of online services, from online social networking and e-commerce to e-government. These systems rely on complex learning methods and vast amounts of data to optimize the service functionality, satisfaction of the end user and profitability. However, there is a growing concern that these automated decisions can lead to user discrimination, even in the absence of intent. In this paper, we introduce fairness constraints, a mechanism to ensure fairness in a wide variety of classifiers in a principled manner. Fairness prevents a classifier from outputting predictions correlated with certain sensitive attributes in the data. We then instantiate fairness constraints on three well-known classifiers -- logistic regression, hinge loss and support vector machines (SVM) -- and evaluate their performance in a real-world dataset with meaningful sensitive human attributes. Experiments show that fairness constraints allow f...

Residual Unfairness in Fair Machine Learning from Prejudiced Data (original) (raw)

Related papers