A fairness constraint is a mathematical condition imposed on a machine learning model during training, evaluation, or post-processing to ensure that its predictions or decisions satisfy a specified fairness criterion with respect to protected attributes such as race, gender, age, or disability status. Fairness constraints formalize the intuition that automated systems should not systematically disadvantage particular demographic groups, and they provide a principled mechanism for balancing predictive accuracy against equitable outcomes.
Fairness constraints have become a central topic in responsible artificial intelligence research and practice. They appear in hiring algorithms, credit scoring systems, criminal justice risk assessments, and healthcare resource allocation, where biased predictions can cause measurable harm to individuals and communities.
Imagine you are dividing candy among your friends. You want to be fair, so you make a rule: everyone should get roughly the same amount, no matter what color shirt they are wearing. A fairness constraint in machine learning works the same way. When a computer learns to make decisions (like who gets a loan or who gets called for a job interview), a fairness constraint is a rule that tells the computer it cannot give better results to one group of people over another just because of something like their gender or race. There are different rules for different kinds of fairness, and the computer has to follow whichever rule the designer picks while still trying to make good decisions overall.
Machine learning models learn patterns from historical data. When that data reflects past societal biases, the resulting models can replicate or amplify those biases. For example, a resume screening tool trained on historical hiring decisions may learn to penalize female applicants if the training data reflects a period when women were hired at lower rates. Similarly, a recidivism prediction model may assign higher risk scores to individuals from certain racial groups if the underlying arrest data is skewed by discriminatory policing practices.
Fairness constraints address this problem by encoding equity requirements directly into the learning process. Rather than relying solely on post-hoc audits, they allow practitioners to specify what "fair" means in mathematical terms and to optimize the model subject to those requirements.
The formal study of fairness constraints accelerated after several high-profile incidents. In 2016, ProPublica reported that the COMPAS recidivism prediction tool exhibited racial disparities, with Black defendants roughly twice as likely as white defendants to be incorrectly flagged as high risk. The same year, researchers demonstrated that word embeddings trained on large text corpora encoded gender stereotypes (for instance, associating "nurse" with female and "engineer" with male). These cases motivated a wave of research on how to formalize and enforce fairness in algorithmic systems.
A fairness constraint requires a precise mathematical definition of what constitutes fair behavior. Multiple definitions have been proposed, each capturing a different aspect of fairness. The table below summarizes the most widely used definitions.
| Fairness definition | Mathematical condition | Intuition | Introduced by |
|---|---|---|---|
| Demographic parity | P(Ŷ = 1 | A = a) = P(Ŷ = 1 | A = b) for all groups a, b | Positive prediction rates are equal across groups | Calders and Verwer (2010) |
| Equalized odds | P(Ŷ = 1 | A = a, Y = y) = P(Ŷ = 1 | A = b, Y = y) for all y | True positive and false positive rates are equal across groups | Hardt, Price, and Srebro (2016) |
| Equality of opportunity | P(Ŷ = 1 | A = a, Y = 1) = P(Ŷ = 1 | A = b, Y = 1) | True positive rates are equal across groups (relaxation of equalized odds) | Hardt, Price, and Srebro (2016) |
| Predictive parity | P(Y = 1 | Ŷ = 1, A = a) = P(Y = 1 | Ŷ = 1, A = b) | Positive predictive values are equal across groups | Chouldechova (2017) |
| Calibration | P(Y = 1 | S = s, A = a) = P(Y = 1 | S = s, A = b) for all scores s | Predicted probabilities reflect true outcome rates equally across groups | Kleinberg, Mullainathan, and Raghavan (2016) |
| Individual fairness | d(f(x_i), f(x_j)) ≤ L · d(x_i, x_j) | Similar individuals receive similar predictions (Lipschitz condition) | Dwork, Hardt, Pitassi, Reingold, and Zemel (2012) |
| Counterfactual fairness | P(Ŷ_{A←a} = y | X = x, A = a) = P(Ŷ_{A←b} = y | X = x, A = a) | Prediction does not change in a counterfactual world where the individual belongs to a different group | Kusner, Loftus, Russell, and Silva (2017) |
| Disparate impact ratio | P(Ŷ = 1 | A = unprivileged) / P(Ŷ = 1 | A = privileged) ≥ 0.8 | Selection rate for the unprivileged group is at least 80% of the privileged group (the "four-fifths rule") | EEOC (1978), adapted for ML |
In this table, Ŷ denotes the model's prediction, Y denotes the true outcome, A denotes the sensitive (protected) attribute, S denotes a score or predicted probability, and f denotes the classifier mapping.
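As a concrete illustration, the group fairness conditions above can be estimated directly from a sample of predictions. The following minimal sketch (plain NumPy; function and array names are illustrative assumptions, with binary groups encoded as A ∈ {0, 1}) computes the demographic parity difference and the equalized odds gaps:

```python
import numpy as np

def demographic_parity_difference(y_pred, a):
    """Gap in positive prediction rates between groups A = 1 and A = 0."""
    return abs(y_pred[a == 1].mean() - y_pred[a == 0].mean())

def equalized_odds_gaps(y_true, y_pred, a):
    """Gaps in the true positive rate (Y = 1 slice) and the false positive
    rate (Y = 0 slice) between groups A = 1 and A = 0."""
    gaps = {}
    for yv in (0, 1):
        rate_1 = y_pred[(a == 1) & (y_true == yv)].mean()
        rate_0 = y_pred[(a == 0) & (y_true == yv)].mean()
        gaps["TPR" if yv == 1 else "FPR"] = abs(rate_1 - rate_0)
    return gaps
```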
Group fairness definitions (demographic parity, equalized odds, equality of opportunity, predictive parity, calibration) require statistical properties to hold at the level of demographic groups. Individual fairness, by contrast, requires that any two individuals who are similar with respect to the task at hand receive similar predictions. The similarity is measured by a task-specific distance metric, and the constraint is expressed as a Lipschitz condition on the classifier.
Group fairness is easier to measure and enforce because it only requires aggregate statistics. Individual fairness is more demanding because it requires specifying what "similar" means for each pair of individuals, which itself may be contested. In practice, most fairness constraint methods target group fairness definitions.
A set of impossibility results, demonstrated independently by Chouldechova (2017) and Kleinberg, Mullainathan, and Raghavan (2016), shows that several common fairness definitions cannot be satisfied simultaneously except in trivial cases. Specifically, when the base rates (the proportion of positive outcomes) differ across groups, it is impossible for a classifier to simultaneously achieve calibration, false positive rate balance (equal false positive rates), and false negative rate balance (equal false negative rates), unless the classifier is a perfect predictor.
These impossibility results mean that practitioners must choose which fairness definition to prioritize. The choice depends on the application context, the stakeholders involved, and the specific harms that the system could cause.
Techniques for incorporating fairness constraints into machine learning fall into three broad categories based on when the intervention occurs in the modeling pipeline.
Pre-processing methods modify the training data before the model is trained. The goal is to remove or reduce bias in the data so that a standard (unconstrained) learning algorithm produces fair predictions.
Reweighing. Proposed by Kamiran and Calders (2012), reweighing assigns instance-level weights to the training data such that the weighted distribution satisfies demographic parity with respect to the protected attribute. Instances from underrepresented (group, label) combinations receive higher weights, and instances from overrepresented combinations receive lower weights. The classifier then trains on the reweighted data using standard methods.
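A minimal sketch of the reweighing computation (variable names are illustrative): each instance receives the weight P(A = a) · P(Y = y) / P(A = a, Y = y), which makes the protected attribute and the label statistically independent under the weighted distribution.

```python
import numpy as np

def reweighing_weights(y, a):
    """Weight each instance by P(A = a) * P(Y = y) / P(A = a, Y = y) so that
    the protected attribute and the label are independent under the weighted
    training distribution."""
    w = np.ones(len(y), dtype=float)
    for av in np.unique(a):
        for yv in np.unique(y):
            mask = (a == av) & (y == yv)
            if mask.any():
                w[mask] = (a == av).mean() * (y == yv).mean() / mask.mean()
    return w

# The weights can be passed to most scikit-learn estimators, e.g.
# LogisticRegression().fit(X, y, sample_weight=reweighing_weights(y, a)).
```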
Disparate impact remover. Proposed by Feldman, Friedler, Moeller, Scheidegger, and Venkatasubramanian (2015), this method transforms feature values so that the distributions of each feature are the same across groups, while preserving within-group ranking. The transformed data can then be used with any classifier.
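The core idea can be sketched for a single numeric feature as a rank-preserving quantile mapping. The simplified version below targets the pooled distribution rather than the "median" distribution of the original method and performs full repair only (names are illustrative):

```python
import numpy as np

def repair_feature(x, a):
    """Full-repair sketch: replace each value with the pooled-distribution
    quantile matching its within-group rank, so per-group distributions
    coincide while within-group ordering is preserved."""
    x_repaired = np.empty(len(x), dtype=float)
    pooled = np.sort(x)
    for g in np.unique(a):
        idx = np.where(a == g)[0]
        ranks = np.argsort(np.argsort(x[idx])) / max(len(idx) - 1, 1)
        x_repaired[idx] = np.quantile(pooled, ranks)
    return x_repaired
```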
Learning fair representations. Proposed by Zemel, Wu, Swersky, Pitassi, and Dwork (2013), this approach learns a latent representation of the data that encodes the information needed for prediction while being statistically independent of the protected attribute. Any classifier trained on this representation inherits the fairness property.
Pre-processing methods are attractive because they are model-agnostic: once the data is transformed, any learning algorithm can be applied. However, they may lose information that is useful for prediction, and they do not provide formal guarantees on the fairness of the downstream classifier.
In-processing methods incorporate fairness constraints directly into the model training procedure. They modify the optimization problem that the learning algorithm solves so that the resulting model satisfies the desired fairness criterion.
The simplest in-processing strategy adds a fairness penalty term to the loss function. The modified objective takes the form:
min_w L(w) + lambda * F(w)
where L(w) is the standard loss (for example, cross-entropy for classification), F(w) is a measure of unfairness (for example, the difference in positive prediction rates across groups), and lambda is a hyperparameter controlling the strength of the fairness penalty.
This approach is simple to implement and compatible with standard gradient descent optimizers. However, it does not guarantee that the fairness constraint is satisfied exactly; it only encourages the model in the direction of fairness. The level of fairness achieved depends on the choice of lambda, which must be tuned empirically.
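A minimal sketch of the penalty approach for logistic regression trained by gradient descent (NumPy; the squared demographic-parity gap on predicted probabilities serves as the differentiable unfairness measure F, and all names are illustrative assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_penalized_logreg(X, y, a, lam=1.0, lr=0.1, iters=2000):
    """Logistic regression whose loss is augmented with
    lam * (mean predicted probability in group 1 - mean in group 0)^2."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(iters):
        p = sigmoid(X @ w)
        gap = p[a == 1].mean() - p[a == 0].mean()
        grad_loss = X.T @ (p - y) / n          # cross-entropy gradient
        s = p * (1.0 - p)                      # sigmoid derivative
        grad_gap = ((X[a == 1].T @ s[a == 1]) / (a == 1).sum()
                    - (X[a == 0].T @ s[a == 0]) / (a == 0).sum())
        w -= lr * (grad_loss + lam * 2.0 * gap * grad_gap)  # chain rule on gap^2
    return w
```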
Zafar, Valera, Gomez-Rodriguez, and Gummadi (2017) proposed a decision-boundary fairness mechanism that adds covariance-based fairness constraints to logistic regression and support vector machine classifiers. Their method provides fine-grained control over the trade-off between accuracy and fairness.
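The covariance quantity at the heart of this approach is straightforward to compute; a small sketch for a linear decision boundary parameterized by w (function and variable names are illustrative):

```python
import numpy as np

def boundary_covariance(X, a, w):
    """Empirical covariance between the protected attribute and the signed
    distance to the decision boundary, (1/n) * sum_i (a_i - mean(a)) * (w . x_i),
    which the constraint bounds in absolute value."""
    return np.mean((a - a.mean()) * (X @ w))
```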
Constrained optimization formulates the fairness requirement as a hard constraint rather than a soft penalty:
min_w L(w) subject to F(w) ≤ epsilon
where epsilon is a tolerance parameter specifying how much unfairness is acceptable. This formulation is mathematically more principled than regularization because it separates the accuracy objective from the fairness requirement, and the tolerance epsilon has a direct interpretation.
Solving this constrained optimization problem is challenging because fairness constraints are typically non-convex and non-differentiable (they involve indicator functions over model predictions and group memberships). Several algorithmic frameworks have been developed to handle these challenges.
Agarwal, Beygelzimer, Dudik, Langford, and Wallach (2018) proposed a reductions approach to fair classification that converts the constrained fairness problem into a sequence of cost-sensitive classification problems. The method uses the exponentiated gradient algorithm to iteratively adjust dual variables (Lagrange multipliers) corresponding to the fairness constraints.
The algorithm works as follows:

1. At each iteration, the dual player updates the Lagrange multipliers with an exponentiated gradient step based on how much each fairness constraint is currently violated.
2. The primal player then computes a best response by solving a cost-sensitive classification problem with any off-the-shelf learner, where the costs encode both the loss and the current multipliers.
3. The process repeats for T iterations, and the iterates are combined into the final randomized classifier.
The exponentiated gradient method supports demographic parity, equalized odds, and other linear fairness constraints. It produces a randomized classifier (a distribution over at most T base classifiers, where T is the number of iterations), and it comes with finite-sample convergence guarantees: the resulting classifier is approximately optimal and approximately feasible. This approach is implemented in Microsoft's Fairlearn library.
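A minimal usage sketch with Fairlearn's implementation follows; the synthetic dataset and variable names are illustrative assumptions (the sensitive feature A is deliberately correlated with the label so that an unconstrained model would exhibit a demographic-parity gap):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from fairlearn.reductions import ExponentiatedGradient, DemographicParity

# Illustrative synthetic data: A agrees with the label 70% of the time.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=2000) > 0).astype(int)
A = np.where(rng.random(2000) < 0.7, y, 1 - y)

mitigator = ExponentiatedGradient(
    estimator=LogisticRegression(max_iter=1000),  # any sklearn-style estimator
    constraints=DemographicParity(),              # or EqualizedOdds()
    eps=0.02,                                     # allowed constraint violation
)
mitigator.fit(X, y, sensitive_features=A)
y_pred = mitigator.predict(X)
```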
Cotter, Jiang, Gupta, Wang, Narayan, You, and Sridharan (2019) addressed the problem that many fairness constraints are non-differentiable (for example, constraints on false positive rates involve indicator functions). Since standard Lagrangian methods require differentiable constraints, they introduced the proxy-Lagrangian formulation.
The proxy-Lagrangian replaces the original non-differentiable constraints with smooth, differentiable proxy constraints for the purpose of gradient computation, while the original constraints are still used by the dual player (the "auditor") to update the Lagrange multipliers. This leads to a two-player non-zero-sum game: the learner minimizes the proxy-Lagrangian over the model parameters, while the auditor maximizes the original (non-proxy) Lagrangian over the multipliers.
The solution concept is a semi-coarse correlated equilibrium. The resulting stochastic classifier can be expressed as a mixture of at most m+1 deterministic classifiers, where m is the number of constraints. This method is particularly useful for deep learning models where the loss landscape is non-convex.
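The following toy sketch illustrates the idea (it is not Cotter et al.'s algorithm): for a linear model, the learner descends on the log-loss plus a smooth proxy for the demographic-parity gap, while the auditor updates the multiplier using the original thresholded, non-differentiable gap. All names and the simple update scheme are assumptions for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_proxy_lagrangian(X, y, A, eps=0.02, lr=0.1, lr_dual=0.5, iters=2000):
    """Toy proxy-Lagrangian loop: learner uses the smooth proxy constraint,
    auditor uses the original indicator-based constraint."""
    n, d = X.shape
    w = np.zeros(d)
    lam = 0.0
    for _ in range(iters):
        p = sigmoid(X @ w)
        yhat = (p >= 0.5).astype(float)
        # Original constraint: gap in positive prediction rates (non-differentiable).
        true_gap = abs(yhat[A == 1].mean() - yhat[A == 0].mean())
        # Proxy constraint: gap in mean predicted probabilities (differentiable).
        proxy_gap = p[A == 1].mean() - p[A == 0].mean()
        s = p * (1.0 - p)
        grad_loss = X.T @ (p - y) / n
        grad_proxy = np.sign(proxy_gap) * (
            (X[A == 1].T @ s[A == 1]) / max((A == 1).sum(), 1)
            - (X[A == 0].T @ s[A == 0]) / max((A == 0).sum(), 1)
        )
        # Learner step uses the proxy; auditor step uses the original constraint.
        w -= lr * (grad_loss + lam * grad_proxy)
        lam = max(0.0, lam + lr_dual * (true_gap - eps))
    return w
```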
Zhang, Lemoine, and Mitchell (2018) proposed an in-processing method based on adversarial learning. The approach trains two networks simultaneously: a predictor that estimates the target label from the input features, and an adversary that attempts to infer the protected attribute from the predictor's output (optionally also conditioned on the true label).
The predictor is trained to maximize prediction accuracy while minimizing the adversary's ability to infer the protected attribute from the predictions. The gradient update for the predictor subtracts the projection of the adversary's gradient, pushing the predictor toward representations that are informative for the task but uninformative about group membership.
Adversarial debiasing can approximately enforce demographic parity (by making predictions independent of the protected attribute) or equalized odds (by conditioning the adversary on the true label). It is implemented in IBM's AI Fairness 360 toolkit.
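A simplified sketch of the training loop is shown below (PyTorch, with synthetic data and illustrative names). Note that Zhang et al.'s predictor update subtracts the projection of the adversary's gradient; the sketch approximates that with a plain penalty term and targets the demographic-parity variant.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Synthetic data: the protected attribute a is correlated with the label y.
torch.manual_seed(0)
n, d = 2000, 10
X = torch.randn(n, d)
y = ((X[:, 0] + X[:, 1] + 0.3 * torch.randn(n)) > 0).float().unsqueeze(1)
a = torch.where(torch.rand(n, 1) < 0.7, y, 1.0 - y)
loader = DataLoader(TensorDataset(X, y, a), batch_size=64, shuffle=True)

predictor = nn.Sequential(nn.Linear(d, 32), nn.ReLU(), nn.Linear(32, 1))
adversary = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))
opt_p = torch.optim.Adam(predictor.parameters(), lr=1e-3)
opt_a = torch.optim.Adam(adversary.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()
alpha = 1.0  # strength of the adversarial term

for epoch in range(5):
    for X_b, y_b, a_b in loader:
        # Adversary step: learn to infer the protected attribute from the logit.
        logits = predictor(X_b).detach()
        loss_a = bce(adversary(logits), a_b)
        opt_a.zero_grad(); loss_a.backward(); opt_a.step()

        # Predictor step: stay accurate while making the adversary fail,
        # pushing predictions toward independence from the attribute
        # (condition the adversary on y_b for the equalized-odds variant).
        logits = predictor(X_b)
        loss_p = bce(logits, y_b) - alpha * bce(adversary(logits), a_b)
        opt_p.zero_grad(); loss_p.backward(); opt_p.step()
```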
Post-processing methods adjust the predictions of a trained model to satisfy fairness constraints, without modifying the training procedure itself.
Threshold adjustment. Hardt, Price, and Srebro (2016) showed that for any binary classifier producing a continuous score, one can find group-specific thresholds that convert the scores into binary predictions satisfying equalized odds. The method solves a linear program to find the optimal thresholds that maximize accuracy subject to the equalized odds constraint. This approach is model-agnostic and requires access only to the model's predictions and the protected attribute values.
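A minimal usage sketch with Fairlearn's post-processing implementation of this idea (the synthetic dataset and variable names are illustrative assumptions):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from fairlearn.postprocessing import ThresholdOptimizer

# Illustrative synthetic data with a sensitive feature correlated with the label.
rng = np.random.default_rng(1)
X = rng.normal(size=(3000, 5))
y = (X[:, 0] + X[:, 1] + rng.normal(scale=0.5, size=3000) > 0).astype(int)
A = np.where(rng.random(3000) < 0.7, y, 1 - y)
X_tr, X_te, y_tr, y_te, A_tr, A_te = train_test_split(X, y, A, random_state=0)

base = GradientBoostingClassifier().fit(X_tr, y_tr)

# Learn group-specific thresholds on the trained model's scores.
postprocessor = ThresholdOptimizer(
    estimator=base,
    constraints="equalized_odds",
    prefit=True,                      # do not refit the wrapped model
    predict_method="predict_proba",
)
postprocessor.fit(X_tr, y_tr, sensitive_features=A_tr)
y_pred = postprocessor.predict(X_te, sensitive_features=A_te)
```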
Reject option classification. Kamiran, Karim, and Zhang (2012) proposed changing predictions for instances near the decision boundary. For individuals in the unprivileged group who are close to the boundary and receive a negative prediction, the prediction is flipped to positive. For individuals in the privileged group who are close to the boundary and receive a positive prediction, the prediction is flipped to negative.
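A small sketch of this flipping rule (names, the critical-region width theta, and the 0/1 group encoding are illustrative assumptions):

```python
import numpy as np

def reject_option_predict(scores, a, theta=0.1, privileged=1):
    """Flip predictions inside the critical region |score - 0.5| < theta:
    unprivileged instances near the boundary get the favourable label,
    privileged instances near the boundary get the unfavourable one."""
    y_hat = (scores >= 0.5).astype(int)
    near_boundary = np.abs(scores - 0.5) < theta
    y_hat[near_boundary & (a != privileged)] = 1
    y_hat[near_boundary & (a == privileged)] = 0
    return y_hat
```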
Calibrated equalized odds. Pleiss, Raghavan, Wu, Kleinberg, and Weinberger (2017) developed a post-processing method that adjusts a calibrated classifier to satisfy equalized odds while preserving calibration as much as possible.
Post-processing methods have the advantage of being applicable to any existing model without retraining. However, they generally provide weaker fairness guarantees than in-processing methods because they cannot change the model's internal representations.
The following table compares the three categories of fairness constraint enforcement.
| Property | Pre-processing | In-processing | Post-processing |
|---|---|---|---|
| When applied | Before training | During training | After training |
| Model access required | Data only | Full model and training loop | Predictions only |
| Model-agnostic | Yes | No (method-specific) | Yes |
| Formal guarantees | Typically weak | Can be strong (e.g., exponentiated gradient) | Moderate (e.g., linear programming for threshold adjustment) |
| Computational cost | Low to moderate | Moderate to high | Low |
| Accuracy impact | Can degrade if too much information is removed | Can be controlled via constraint tolerance | Minimal if original model is strong |
| Supported fairness definitions | Primarily demographic parity | Flexible (DP, EO, equality of opportunity, etc.) | Primarily equalized odds, demographic parity |
| Example methods | Reweighing, disparate impact remover, fair representations | Exponentiated gradient, adversarial debiasing, regularization | Threshold adjustment, reject option, calibrated EO |
This section provides a more detailed mathematical treatment of constrained optimization for fairness.
Let X denote the feature space, Y the label space (Y = {0, 1} for binary classification), and A the set of protected attribute values. Let D be the joint distribution over (X, A, Y). Let H be a hypothesis class of classifiers h: X -> Y.
The goal is to find a classifier h* that minimizes the expected loss while satisfying fairness constraints:
h* = argmin_{h in H} E_{(x,a,y) ~ D} [l(h(x), y)]
subject to: g_j(h, D) ≤ epsilon_j for j = 1, ..., m
where l is a loss function, g_j are constraint functions encoding fairness requirements, and epsilon_j are tolerance parameters.
For demographic parity, the constraint function might be:
g(h, D) = |P(h(X) = 1 | A = a) - P(h(X) = 1 | A = b)|
For equalized odds, there are two constraint functions (one for each value of Y):
g_y(h, D) = |P(h(X) = 1 | A = a, Y = y) - P(h(X) = 1 | A = b, Y = y)| for y in {0, 1}
The standard approach to constrained optimization is Lagrangian relaxation. Define the Lagrangian:
L(h, lambda) = E[l(h(x), y)] + sum_{j=1}^{m} lambda_j * g_j(h, D)
where lambda = (lambda_1, ..., lambda_m) are non-negative Lagrange multipliers. The constrained problem is equivalent to the saddle-point problem:
min_h max_{lambda ≥ 0} L(h, lambda)
Under certain convexity conditions, strong duality holds and the saddle-point solution coincides with the constrained optimum. For non-convex problems (such as training neural networks), strong duality may not hold, but the Lagrangian relaxation still provides a useful algorithmic framework.
The exponentiated gradient method of Agarwal et al. (2018) provides the following guarantee: after T iterations, the algorithm returns a randomized classifier whose expected loss is at most OPT + O(1/sqrt(T)) and whose constraint violation is at most O(1/sqrt(T)), where OPT is the optimal loss among all feasible classifiers. These are finite-sample bounds that hold for a fixed dataset.
The proxy-Lagrangian method of Cotter et al. (2019) achieves similar convergence rates for non-convex, non-differentiable constraints, at the cost of requiring smooth proxy constraints.
Gradient boosted decision trees (GBDT) are widely used in high-stakes applications like credit scoring and fraud detection. Standard GBDT implementations (XGBoost, LightGBM, CatBoost) do not natively support fairness constraints. Several methods have been proposed to fill this gap.
FairGBM. Cruz, Belem, Jesus, Bravo, Saleiro, and Bizarro (2023) developed FairGBM, a framework for training gradient boosted trees with fairness constraints. FairGBM uses smooth convex proxies for non-differentiable fairness metrics and optimizes them using a dual ascent (proxy-Lagrangian) procedure. The method is implemented on top of LightGBM and supports demographic parity, equalized odds, and equality of opportunity. Experiments show that FairGBM achieves comparable predictive performance to unconstrained GBDT while satisfying fairness constraints, with training times that are an order of magnitude faster than alternative fair ML methods.
Fair regularization for GBDT. An alternative approach adds fairness-aware regularization terms to the GBDT objective function, penalizing splits that increase group disparity. This is conceptually simpler but provides weaker control over the fairness-accuracy trade-off.
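A conceptual sketch of this regularization idea on top of LightGBM is given below. This is not FairGBM's API: the custom objective, variable names, and synthetic data are illustrative assumptions, and the penalty's contribution to the Hessian is ignored for simplicity.

```python
import numpy as np
import lightgbm as lgb

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def make_fair_logloss(a_train, lam=10.0):
    """Binary log-loss plus lam * (demographic-parity gap)^2 on predicted
    probabilities, closed over the row-aligned protected attribute."""
    g1, g0 = a_train == 1, a_train == 0

    def objective(y_true, raw_score):
        p = sigmoid(raw_score)
        grad = p - y_true                      # log-loss gradient w.r.t. raw score
        hess = p * (1.0 - p)
        gap = p[g1].mean() - p[g0].mean()
        d_gap = np.where(g1, 1.0 / g1.sum(), -1.0 / g0.sum()) * hess
        grad = grad + lam * 2.0 * gap * d_gap  # penalty's Hessian is ignored
        return grad, hess

    return objective

# Illustrative synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(3000, 6))
y = (X[:, 0] + X[:, 1] + rng.normal(scale=0.5, size=3000) > 0).astype(int)
a = np.where(rng.random(3000) < 0.7, y, 1 - y)

model = lgb.LGBMClassifier(objective=make_fair_logloss(a), n_estimators=200)
model.fit(X, y)
# With a custom objective LightGBM returns raw scores, so apply the sigmoid.
probs = sigmoid(model.predict(X, raw_score=True))
```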
Several open-source libraries implement fairness constraint methods. The table below compares the major toolkits.
| Toolkit | Developer | Language | Key fairness constraint methods | Fairness metrics |
|---|---|---|---|---|
| Fairlearn | Microsoft Research | Python | ExponentiatedGradient, GridSearch, ThresholdOptimizer, CorrelationRemover, AdversarialFairnessClassifier | Demographic parity, equalized odds, bounded group loss, error rate parity |
| AI Fairness 360 (AIF360) | IBM Research | Python, R | Reweighing, adversarial debiasing, prejudice remover, meta-fair classifier, reject option classification, calibrated equalized odds | Disparate impact, statistical parity difference, equal opportunity difference, average odds difference, Theil index |
| FairGBM | Feedzai Research | Python (C++ core) | Proxy-Lagrangian dual ascent for GBDT | Demographic parity, equalized odds, equality of opportunity |
| Themis-ML | Various | Python | Relabeling, additive counterfactually fair model | Demographic parity |
| What-If Tool | Google | JavaScript/Python | Threshold adjustment (interactive) | Multiple (user-configurable) |
Fairness constraints have been applied in numerous domains where algorithmic decisions affect people's lives.
Recidivism prediction instruments like COMPAS are used by courts and parole boards across the United States to estimate the likelihood of reoffense. After ProPublica's 2016 analysis revealed racial disparities in COMPAS scores, researchers applied equalized odds and calibration constraints to recidivism prediction models. Studies have shown that fairness constraints can reduce racial disparities in false positive rates without a large cost in overall predictive accuracy.
Credit scoring models determine who receives loans and at what interest rates. The Equal Credit Opportunity Act in the United States prohibits discrimination based on race, sex, and other protected attributes. Fairness constraints, particularly demographic parity and equalized odds, have been applied to ensure that approval rates and error rates do not systematically differ across racial or gender groups. Microsoft Research and Ernst & Young have demonstrated the use of Fairlearn's exponentiated gradient method for fair credit modeling.
Automated resume screening and candidate ranking tools can perpetuate historical hiring biases. In 2018, Amazon discontinued an internal hiring tool after discovering it penalized resumes containing the word "women's" (as in "women's chess club"). Fairness constraints such as demographic parity can ensure that the selection rate for each demographic group meets a minimum threshold, consistent with the four-fifths rule used by the U.S. Equal Employment Opportunity Commission.
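The four-fifths check itself is a one-line calculation; a small sketch (function and argument names are illustrative):

```python
import numpy as np

def disparate_impact_ratio(y_pred, a, unprivileged=0, privileged=1):
    """Ratio of selection rates used in the four-fifths (80%) rule."""
    return y_pred[a == unprivileged].mean() / y_pred[a == privileged].mean()

# A screening tool passes the four-fifths check when the ratio is at least 0.8.
```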
Algorithms that allocate healthcare resources (such as prioritizing patients for care management programs) have been found to exhibit racial bias. Obermeyer, Powers, Vogeli, and Mullainathan (2019) showed that a widely used healthcare algorithm exhibited significant racial bias because it used healthcare cost as a proxy for health need. Applying fairness constraints to equalize positive prediction rates or error rates across racial groups can help mitigate such disparities.
Imposing fairness constraints generally reduces the model's predictive accuracy relative to an unconstrained model. This happens because the unconstrained optimum may involve predictions that are correlated with the protected attribute, and the fairness constraint forces the model away from this optimum.
The magnitude of the trade-off depends on several factors: how much the base rates differ across groups, how strongly the predictive features correlate with the protected attribute, how strict the constraint tolerance is, and which fairness definition is enforced.
Empirically, the trade-off is often small for moderate levels of fairness. For example, Agarwal et al. (2018) showed that their exponentiated gradient method achieved near-zero constraint violation with less than 1-2 percentage points of accuracy loss on several benchmark datasets. However, in cases with large base rate differences, the trade-off can be substantial.
Despite significant progress, fairness constraints have several limitations.
Choice of fairness definition. The impossibility results of Chouldechova (2017) and Kleinberg et al. (2016) show that no single classifier can satisfy all common fairness definitions simultaneously. Practitioners must choose which definition to enforce, and this choice is inherently a normative judgment rather than a purely technical one.
Intersectionality. Most fairness constraint methods consider protected attributes independently (for example, enforcing equal rates separately for race and gender). Enforcing fairness across intersectional groups (for example, Black women or elderly disabled individuals) is more challenging because the number of groups grows combinatorially, and sample sizes within each intersectional group may be small. Cotter, Gupta, Jiang, Srebro, Sridharan, Wang, Woodworth, and You (2019) proposed learning multiplier models to handle large numbers of constraints arising from intersectional groups.
Distributional shift. Fairness constraints are typically enforced on the training distribution. If the deployment distribution differs (for instance, because of population changes or feedback loops), the trained model may violate the fairness constraints on the new distribution.
Fairness gerrymandering. Kearns, Neel, Roth, and Wu (2018) showed that a classifier can satisfy fairness constraints for a predefined set of groups while being unfair to subgroups not included in the constraint set. They proposed algorithms for preventing this by ensuring fairness over a rich class of subgroups.
Proxy attributes. Even when a model does not directly use the protected attribute, other features may serve as proxies (for example, zip code as a proxy for race). Fairness constraints that operate only on the explicit protected attribute may not capture discrimination mediated through proxy features.
Causal reasoning. Observational fairness definitions (demographic parity, equalized odds) do not distinguish between legitimate and illegitimate uses of the protected attribute. Causal fairness definitions (such as counterfactual fairness) attempt to address this but require a causal model of the data generating process, which is often unavailable or contested.
The following table traces the development of key ideas in fairness constraints for machine learning.
| Year | Development | Authors |
|---|---|---|
| 2008 | Formalization of discrimination-aware data mining | Pedreshi, Ruggieri, and Turini |
| 2010 | Demographic parity for classifiers | Calders and Verwer |
| 2012 | Individual fairness via Lipschitz constraints ("Fairness Through Awareness") | Dwork, Hardt, Pitassi, Reingold, and Zemel |
| 2012 | Data preprocessing by reweighing for fairness | Kamiran and Calders |
| 2016 | Equalized odds and equality of opportunity | Hardt, Price, and Srebro |
| 2016 | Impossibility results for simultaneous calibration and error rate balance | Kleinberg, Mullainathan, and Raghavan; Chouldechova |
| 2017 | Decision-boundary fairness constraints for classifiers | Zafar, Valera, Gomez-Rodriguez, and Gummadi |
| 2017 | Counterfactual fairness via causal models | Kusner, Loftus, Russell, and Silva |
| 2018 | Exponentiated gradient reductions approach to fair classification | Agarwal, Beygelzimer, Dudik, Langford, and Wallach |
| 2018 | Adversarial debiasing | Zhang, Lemoine, and Mitchell |
| 2018 | Release of IBM AI Fairness 360 toolkit | Bellamy et al. (IBM Research) |
| 2019 | Proxy-Lagrangian method for non-differentiable fairness constraints | Cotter, Jiang, Gupta, et al. |
| 2020 | Release of Microsoft Fairlearn toolkit | Bird, Dudik, Edgar, et al. |
| 2023 | FairGBM for gradient boosting with fairness constraints | Cruz, Belem, Jesus, et al. |