# Algorithmic fairness

> Source: https://aiwiki.ai/wiki/algorithmic_fairness
> Updated: 2026-06-21
> Categories: AI Ethics, AI Safety, Machine Learning
> From AI Wiki (https://aiwiki.ai), a free encyclopedia of artificial intelligence. Quote with attribution.

**Algorithmic fairness** is the study of how automated decision systems can be made to produce decisions that are equitable across groups defined by [protected attributes](/wiki/protected_attributes) such as race, gender, age, religion, and disability. Also called fairness in machine learning, ML fairness, or [AI fairness](/wiki/ai_fairness), it is a subfield of [AI ethics](/wiki/ai_ethics) that combines computer science, law, philosophy, statistics, and the social sciences. Researchers develop mathematical definitions of fairness, audit deployed systems for [algorithmic bias](/wiki/algorithmic_bias), design mitigation techniques, and analyze the social and legal context in which decisions operate [1][2].

The field's defining insight is that fairness cannot be reduced to a single number: a foundational 2016 result proved that several intuitive fairness criteria are mathematically incompatible whenever base rates differ across groups, so every deployed system embeds a value-laden choice about which errors are acceptable for which people [22][18]. The field grew rapidly after mid-2010s audits exposed disparities in tools used for criminal sentencing, hiring, lending, healthcare, and face analysis. In the 2016 ProPublica investigation of the COMPAS recidivism tool, the false-positive rate for Black defendants in Broward County was 44.9 percent versus 23.5 percent for white defendants, while the 2018 Gender Shades audit found commercial gender-classification error rates as high as 34.7 percent for darker-skinned women against 0.8 percent for lighter-skinned men [6][9]. By the early 2020s, fairness had become a recognized requirement in major regulatory frameworks including the EU AI Act and NYC Local Law 144 [3][4]. Many central questions, including which fairness definition to apply, whether sensitive attributes should be collected, and how to weigh trade-offs against accuracy, remain contested.

## What problems motivated the field?

The legal roots of algorithmic fairness predate machine learning. In the United States, the doctrine of [disparate impact](/wiki/disparate_impact) was articulated in *Griggs v. Duke Power Co.* (1971), which held that facially neutral employment practices that disproportionately exclude protected groups can be unlawful unless justified by business necessity [5]. Title VII of the Civil Rights Act of 1964, the Equal Credit Opportunity Act of 1974, and the Fair Housing Act of 1968 form the backbone of US anti-discrimination law. In the EU, the Racial Equality Directive (2000/43/EC), the Employment Equality Directive (2000/78/EC), and Article 21 of the Charter of Fundamental Rights cover similar ground.

The **[COMPAS](/wiki/compas_recidivism) controversy** began in May 2016 when [ProPublica](/wiki/propublica) published "Machine Bias" by Julia Angwin, Jeff Larson, Surya Mattu, and Lauren Kirchner. The investigation analyzed risk scores produced by Correctional Offender Management Profiling for Alternative Sanctions, sold by Northpointe (now Equivant) and used by judges in several US states, drawing on scores for more than 7,000 people arrested in Broward County, Florida, in 2013 and 2014. ProPublica reported that Black defendants who did not reoffend were almost twice as likely as white non-reoffenders to be misclassified as high risk (a false-positive rate of 44.9 percent versus 23.5 percent), while white reoffenders were more often misclassified as low risk [6]. [Northpointe](/wiki/northpointe) responded that COMPAS was calibrated within groups, meaning that a given score corresponded to the same probability of reoffense for Black and white defendants, and that the tool correctly predicted recidivism about 61 percent of the time for both groups [7]. The dispute was later shown to reflect a mathematical impossibility rather than an error on either side.

The **Amazon recruiting tool** was an internal experiment to score job applicants. Reuters reported in October 2018 that Amazon scrapped the tool around 2017 after engineers discovered it penalized resumes containing the word "women's" and downgraded graduates of two all-women's colleges. The model had been trained on a decade of resumes submitted to Amazon, predominantly from men [8].

**[Gender Shades](/wiki/gender_shades)**, a 2018 audit by [Joy Buolamwini](/wiki/joy_buolamwini) and [Timnit Gebru](/wiki/timnit_gebru) presented at the FAT* conference (now FAccT), evaluated commercial gender classification systems from IBM, Microsoft, and Face++ on a curated set of 1,270 face images. Error rates were 0.8 percent or lower for lighter-skinned men but as high as 34.7 percent for darker-skinned women in one system [9]. The study became a touchstone for intersectional analysis, and IBM exited the face recognition business in 2020.

In October 2019, **Obermeyer, Powers, Vogeli, and Mullainathan** published a study in *Science* showing that a widely used commercial healthcare risk algorithm assigned similar risk scores to Black and white patients with different levels of underlying illness. The system used past healthcare costs as a proxy for need; because Black patients historically received less care, costs were a biased proxy. Correcting it would more than double the proportion of Black patients identified for high-risk care management, from 17.7 percent to 46.5 percent [10].

The **Apple Card credit limit** dispute began in November 2019 when entrepreneur David Heinemeier Hansson posted that he had received a credit limit roughly 20 times higher than his wife's despite shared finances. The New York State Department of Financial Services investigated Goldman Sachs and concluded in March 2021 that no violation of New York fair lending law had been substantiated, while noting the case raised questions about explainability [11]. The **Twitter image cropping** algorithm came under scrutiny in 2020 when users observed that the saliency model appeared to favor lighter-skinned and female faces; Twitter audited internally and changed the product to display full images on mobile timelines in 2021 [12]. The **NIST Face Recognition Vendor Test demographic effects study** (December 2019) evaluated 189 algorithms from 99 developers and reported many systems showing higher false match rates for African and East Asian faces relative to Eastern European faces [13].

## Where does algorithmic bias come from?

Suresh and Guttag (2021) and Mehrabi et al. (2021) provide widely cited taxonomies [14][2].

**Historical bias** arises when training data accurately reflects past discrimination. A model predicting who has historically been promoted mirrors past managerial preferences, including unlawful patterns. **Representation bias** occurs when the training population underrepresents some groups; ImageNet and IJB-A were criticized for skewing toward lighter-skinned subjects, and the Pilot Parliaments Benchmark used in [Gender Shades](/wiki/gender_shades) was constructed to balance representation [9]. **Measurement bias** appears when features or labels are differential proxies for the construct of interest. Using arrest rates as a proxy for crime encodes disparities in policing; using healthcare cost as a proxy for need encodes disparities in access, as in Obermeyer et al. [10]. **Aggregation bias** results from fitting one model to a heterogeneous population. **Evaluation bias** occurs when benchmarks do not represent the deployment population; **deployment bias** occurs when a model is used in a context for which it was not designed. **Feedback loops** arise when predictions influence future training data. Lum and Isaac (2016) showed that predictive policing tools directing patrols to historically over-policed neighborhoods can generate more arrests there, reinforcing the pattern when the model is retrained [15].

## How is fairness defined mathematically?

Using notation A for a sensitive attribute, Y for the true outcome, hat-Y for a binary prediction, and S for a continuous score, researchers have proposed many statistical fairness criteria.

| Criterion | Statistical statement | Intuition | Key reference |
|-----------|----------------------|-----------|---------------|
| [Demographic parity](/wiki/demographic_parity) (statistical parity) | P(hat-Y=1 \| A=a) = P(hat-Y=1 \| A=b) | Each group receives positive predictions at the same rate | Dwork et al. 2012 [1] |
| Conditional demographic parity | P(hat-Y=1 \| A=a, L=l) = P(hat-Y=1 \| A=b, L=l) | Equal positive rates after conditioning on legitimate features L | Kamiran et al. 2013 [16] |
| [Equalized odds](/wiki/equalized_odds) | P(hat-Y=1 \| Y=y, A=a) = P(hat-Y=1 \| Y=y, A=b) for y in {0,1} | Equal true positive and false positive rates across groups | Hardt, Price, Srebro 2016 [17] |
| [Equality of opportunity](/wiki/equality_of_opportunity) | P(hat-Y=1 \| Y=1, A=a) = P(hat-Y=1 \| Y=1, A=b) | Among individuals whose true outcome is positive (Y=1), equal selection rates (true positive rates) across groups | Hardt, Price, Srebro 2016 [17] |
| Predictive parity ([calibration within groups](/wiki/calibration_within_groups)) | P(Y=1 \| S=s, A=a) = P(Y=1 \| S=s, A=b) | Score s means the same probability regardless of group | Chouldechova 2017 [18] |
| [Counterfactual fairness](/wiki/counterfactual_fairness) | hat-Y(A=a, U) = hat-Y(A=b, U) | Prediction is unchanged in a counterfactual world where the sensitive attribute is altered | Kusner et al. 2017 [19] |
| [Individual fairness](/wiki/individual_fairness) | If d(x, x') is small, then \|hat-Y(x) - hat-Y(x')\| is small | Similar individuals receive similar predictions, given a task-specific metric | Dwork et al. 2012 [1] |
| Treatment equality | FN_a / FP_a = FN_b / FP_b | Ratio of false negatives to false positives is equal across groups | Berk et al. 2018 [20] |

These criteria fall into three broad families. [Independence](/wiki/fairness_metric) criteria such as demographic parity require the prediction to be statistically independent of the protected attribute. Separation criteria such as equalized odds require independence conditional on the true outcome. Sufficiency criteria such as calibration require independence of the true outcome from the attribute given the score. Demographic parity is among the oldest formalizations and roughly corresponds to disparate impact. It is sometimes operationalized using the four-fifths rule from US EEOC guidance, which treats a selection rate ratio below 0.8 as evidence of adverse impact. Critics note that demographic parity can require accepting weaker candidates in one group purely to equalize rates.
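
The selection-rate comparison behind demographic parity and the four-fifths rule can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the function names and the toy data are assumptions made for this example, not part of any toolkit or legal standard.

```python
from collections import defaultdict


def selection_rates(y_pred, groups):
    """Return {group: fraction of members receiving a positive prediction}."""
    positives = defaultdict(int)
    totals = defaultdict(int)
    for pred, group in zip(y_pred, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {g: positives[g] / totals[g] for g in totals}


def selection_rate_ratio(y_pred, groups):
    """Lowest group selection rate divided by the highest one."""
    rates = selection_rates(y_pred, groups)
    highest = max(rates.values())
    if highest == 0:
        raise ValueError("no group received any positive prediction")
    return min(rates.values()) / highest


# Hypothetical toy data: ten applicants in each of two groups.
y_pred = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0,   # group "a": 5 of 10 selected
          1, 1, 1, 0, 0, 0, 0, 0, 0, 0]   # group "b": 3 of 10 selected
groups = ["a"] * 10 + ["b"] * 10

rates = selection_rates(y_pred, groups)
ratio = selection_rate_ratio(y_pred, groups)
print(rates)
print(f"ratio = {ratio:.2f}, below four-fifths threshold: {ratio < 0.8}")
```

In this toy case the ratio is 0.3 / 0.5 = 0.6, so the four-fifths heuristic would flag it. Exact demographic parity corresponds to a ratio of 1.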

Equalized odds and equality of opportunity, introduced by Hardt, Price, and Srebro in 2016, condition on the true outcome and so do not require equal positive rates when base rates differ [17]. Calibration within groups was the criterion satisfied by COMPAS as analyzed by Northpointe and by Corbett-Davies, Pierson, Feller, Goel, and Huq (2017): a given score corresponds to the same risk regardless of group [18][21]. Counterfactual fairness, proposed by Kusner, Loftus, Russell, and Silva in 2017, formalizes fairness using structural causal models [19]. [Individual fairness](/wiki/individual_fairness), introduced by Dwork et al. in 2012, requires that similar individuals receive similar predictions; the challenge is choosing the similarity metric, which encodes much of the normative content of the judgment [1].
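
The separation criteria above can be checked directly from group-wise confusion counts. The sketch below is illustrative only: the helper names and the toy labels are assumptions. Both groups have the same true positive rate, so equality of opportunity holds, but their false positive rates differ, so equalized odds does not.

```python
def group_error_rates(y_true, y_pred, groups):
    """Return {group: {"tpr": ..., "fpr": ...}} for binary labels and predictions."""
    counts = {}
    for yt, yp, g in zip(y_true, y_pred, groups):
        c = counts.setdefault(g, {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
        if yt == 1:
            c["tp" if yp == 1 else "fn"] += 1
        else:
            c["fp" if yp == 1 else "tn"] += 1
    rates = {}
    for g, c in counts.items():
        n_pos = c["tp"] + c["fn"]
        n_neg = c["fp"] + c["tn"]
        rates[g] = {
            "tpr": c["tp"] / n_pos if n_pos else float("nan"),
            "fpr": c["fp"] / n_neg if n_neg else float("nan"),
        }
    return rates


def separation_gaps(rates):
    """Largest between-group difference in TPR and in FPR."""
    tprs = [r["tpr"] for r in rates.values()]
    fprs = [r["fpr"] for r in rates.values()]
    return {"tpr_gap": max(tprs) - min(tprs), "fpr_gap": max(fprs) - min(fprs)}


# Hypothetical toy data: four actual positives and four actual negatives per group.
y_true = [1, 1, 1, 1, 0, 0, 0, 0] * 2
y_pred = [1, 1, 1, 0, 1, 0, 0, 0,   # group "a": TPR 3/4, FPR 1/4
          1, 1, 1, 0, 1, 1, 0, 0]   # group "b": TPR 3/4, FPR 2/4
groups = ["a"] * 8 + ["b"] * 8

rates = group_error_rates(y_true, y_pred, groups)
gaps = separation_gaps(rates)
print(rates)
print(gaps)
```

A TPR gap of zero with a nonzero FPR gap is exactly the situation in which equality of opportunity is satisfied while the stricter equalized odds criterion is violated.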

## Why can fairness criteria not all be satisfied at once?

A central result is that several intuitive fairness criteria cannot be satisfied simultaneously when base rates differ across groups. This was shown independently by Jon Kleinberg, Sendhil Mullainathan, and Manish Raghavan in their 2016 paper "Inherent Trade-Offs in the Fair Determination of Risk Scores" and by Alexandra Chouldechova in her 2017 paper "Fair Prediction with Disparate Impact" [22][18].

Kleinberg, Mullainathan, and Raghavan formalized three properties: calibration within groups, balance for the positive class (equal average score among Y=1 in each group), and balance for the negative class (equal average score among Y=0 in each group). They proved these conditions cannot all hold unless the predictor is perfect or base rates are equal across groups. Chouldechova showed an analogous result for binary classifiers, stating in her abstract that "the criteria cannot all be simultaneously satisfied when recidivism prevalence differs across groups": predictive parity and equal false positive and false negative rates cannot all hold at once outside the trivial cases [18].
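
For binary classifiers, the conflict can be seen from one identity that follows from the confusion-matrix definitions. Writing p for a group's base rate and PPV for its positive predictive value, FPR = p / (1 - p) × (1 - PPV) / PPV × (1 - FNR). If two groups share the same PPV and FNR but have different base rates, their false positive rates must differ. The sketch below evaluates the identity; the function name and the numbers are hypothetical and chosen only for illustration.

```python
def implied_fpr(base_rate, ppv, fnr):
    """False positive rate forced by base rate, PPV, and FNR.

    Uses FPR = p/(1-p) * (1-PPV)/PPV * (1-FNR), which follows from
    TP = p * (1 - FNR) and FP = (1 - p) * FPR (as population fractions)
    together with PPV = TP / (TP + FP).
    """
    if not 0 < base_rate < 1:
        raise ValueError("base_rate must be strictly between 0 and 1")
    if not 0 < ppv <= 1:
        raise ValueError("ppv must be in (0, 1]")
    if not 0 <= fnr <= 1:
        raise ValueError("fnr must be in [0, 1]")
    return base_rate / (1 - base_rate) * (1 - ppv) / ppv * (1 - fnr)


# Hypothetical setting: identical PPV and FNR in both groups,
# but different base rates (0.5 versus 0.3).
ppv, fnr = 0.6, 0.3
fpr_a = implied_fpr(0.5, ppv, fnr)
fpr_b = implied_fpr(0.3, ppv, fnr)
print(f"group a FPR = {fpr_a:.3f}")
print(f"group b FPR = {fpr_b:.3f}")
```

With these inputs the implied false positive rates are about 0.467 and 0.200, so equalizing PPV and FNR has forced unequal FPRs. Only equal base rates or a perfect predictor remove the conflict.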

The COMPAS dispute is the canonical illustration. Black and white defendants in Broward County had different observed two-year recidivism rates, so no non-perfect classifier could satisfy both the calibration property emphasized by Northpointe and the equal error-rate property emphasized by ProPublica. Both critiques were mathematically correct; the disagreement was about which criterion mattered more, not about the data. Pleiss et al. (2017) extended the result by showing calibration is generally incompatible with even relaxed equalized-odds constraints [23]. Fairness cannot be reduced to a single metric: every deployed system embeds a choice about which errors are acceptable for which groups, and that choice has moral and legal content mathematics alone cannot settle.
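
The structure of the dispute can be reproduced on synthetic data. The sketch below builds two hypothetical groups in which a "high" score means an 80 percent chance of a positive outcome and a "low" score means 20 percent, in both groups alike, so the score is calibrated within groups. The groups differ only in how many members receive each score. None of these numbers come from the COMPAS data; they are assumptions chosen so the arithmetic is exact.

```python
def make_group(n_high, n_low, risk_high=0.8, risk_low=0.2):
    """Build (score, label) rows whose observed rates equal the stated risks."""
    rows = []
    for score, n, risk in (("high", n_high, risk_high), ("low", n_low, risk_low)):
        n_pos = round(n * risk)
        rows += [(score, 1)] * n_pos + [(score, 0)] * (n - n_pos)
    return rows


def summarize(rows):
    """Calibration per score, plus error rates when "high" counts as a positive prediction."""
    def positive_rate(score):
        labels = [y for s, y in rows if s == score]
        return sum(labels) / len(labels)

    scores_of_negatives = [s for s, y in rows if y == 0]
    scores_of_positives = [s for s, y in rows if y == 1]
    return {
        "base_rate": len(scores_of_positives) / len(rows),
        "p_pos_given_high": positive_rate("high"),
        "p_pos_given_low": positive_rate("low"),
        "fpr": scores_of_negatives.count("high") / len(scores_of_negatives),
        "fnr": scores_of_positives.count("low") / len(scores_of_positives),
    }


group_a = summarize(make_group(n_high=50, n_low=50))   # base rate 0.50
group_b = summarize(make_group(n_high=20, n_low=80))   # base rate 0.32

for name, stats in (("a", group_a), ("b", group_b)):
    print(name, {k: round(v, 3) for k, v in stats.items()})
```

Both groups show the same outcome rate at each score, yet the false positive rate is 0.2 in group a and about 0.059 in group b, and the false negative rates are 0.2 and 0.5. A calibrated score and unequal error rates coexist as soon as base rates differ.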

## How can unfairness be mitigated?

Methods to reduce unfairness are typically grouped by where in the pipeline they intervene.

| Stage | Method | Mechanism | Reference |
|-------|--------|-----------|-----------|
| Pre-processing | Reweighting | Weight examples so group-conditional prevalence matches | Kamiran & Calders 2012 [24] |
| Pre-processing | Disparate impact remover | Modify features so they are not predictive of group | Feldman et al. 2015 [25] |
| Pre-processing | Fair representation learning | Learn embedding obscuring protected attribute while preserving task signal | Zemel et al. 2013 [26] |
| Pre-processing | Optimized preprocessing | Transform features and labels under fairness and utility constraints | Calmon et al. 2017 [27] |
| In-processing | Constrained optimization | Train under explicit fairness constraints | Zafar et al. 2017 [28] |
| In-processing | Adversarial debiasing | Train predictor jointly with adversary recovering sensitive attribute | Zhang et al. 2018 [29] |
| In-processing | Fair regularization | Add regularization penalizing disparity | Kamishima et al. 2012 [30] |
| In-processing | Reductions | Reduce fair classification to cost-sensitive problems | Agarwal et al. 2018 [31] |
| Post-processing | Group-specific thresholds | Choose different decision thresholds per group | Hardt et al. 2016 [17] |
| Post-processing | Reject option | Reassign predictions near boundary in favor of disadvantaged group | Kamiran et al. 2012 [32] |
| Post-processing | Calibrated equalized odds | Trade calibration for equalized odds via randomized post-processing | Pleiss et al. 2017 [23] |

Pre-processing methods modify training data so downstream learners are less likely to encode unfair patterns; they are model-agnostic but may distort genuinely predictive signal. In-processing methods build fairness into training itself; adversarial debiasing treats fairness as a min-max problem [29], and the reductions approach reduces fair classification to a sequence of weighted classification problems with theoretical guarantees [31]. Post-processing methods adjust an already trained predictor; they can satisfy specific criteria exactly but typically require access to the sensitive attribute at decision time. Friedler et al. (2019) reported that mitigation effects depend strongly on dataset, base rate, and choice of fairness metric [33].
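
As a concrete pre-processing example, reweighting in the style of Kamiran and Calders [24] assigns each training example the weight P(A=a) P(Y=y) / P(A=a, Y=y), so that label and group are statistically independent in the weighted data. The sketch below is a minimal standard-library illustration with hypothetical function names and toy data; it is not the AIF360 or Fairlearn implementation.

```python
from collections import Counter


def reweighing_weights(groups, labels):
    """Per-example weights w(a, y) = P(a) * P(y) / P(a, y), estimated from counts."""
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    return [
        (group_counts[g] * label_counts[y]) / (n * joint_counts[(g, y)])
        for g, y in zip(groups, labels)
    ]


def weighted_positive_rate(groups, labels, weights, group):
    """Weighted share of positive labels within one group."""
    total = sum(w for g, w in zip(groups, weights) if g == group)
    positive = sum(w for g, y, w in zip(groups, labels, weights) if g == group and y == 1)
    return positive / total


# Hypothetical toy data: group "a" has 6/10 positive labels, group "b" has 2/10.
groups = ["a"] * 10 + ["b"] * 10
labels = [1] * 6 + [0] * 4 + [1] * 2 + [0] * 8

weights = reweighing_weights(groups, labels)
rate_a = weighted_positive_rate(groups, labels, weights, "a")
rate_b = weighted_positive_rate(groups, labels, weights, "b")
print(f"weighted positive rate: a = {rate_a:.2f}, b = {rate_b:.2f}")
```

After weighting, both groups have a positive rate of 0.4, which matches the overall rate of 8 / 20. The weights would then be passed to any learner that accepts per-example sample weights.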

## What tools and benchmarks does the field use?

Several open-source toolkits implement fairness metrics and mitigation methods. [AIF360](/wiki/aif360) was released by IBM in 2018 with Python and R interfaces to dozens of metrics and mitigation algorithms [34]. [Fairlearn](/wiki/fairlearn) was released by Microsoft in 2018 and is now an independent project; it implements the reductions approach, post-processing including ThresholdOptimizer, and a trade-off dashboard [35]. The What-If Tool from Google PAIR (2018) integrates with TensorBoard. Aequitas, originally developed at the University of Chicago's Center for Data Science and Public Policy and later maintained at Carnegie Mellon University, targets policymakers and journalists.
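
The idea behind threshold-based post-processing can be illustrated without any of these libraries. The sketch below picks, for each group, the highest score threshold that reaches a target true positive rate, which equalizes TPR across groups in the spirit of equality of opportunity. It is a simplified, deterministic illustration with hypothetical names and data. It is not Fairlearn's ThresholdOptimizer, and it omits the randomization that exact equalized-odds post-processing generally requires.

```python
def tpr_at_threshold(scores, labels, threshold):
    """True positive rate when predicting positive for score >= threshold."""
    positive_scores = [s for s, y in zip(scores, labels) if y == 1]
    if not positive_scores:
        raise ValueError("group has no actual positives")
    return sum(s >= threshold for s in positive_scores) / len(positive_scores)


def threshold_for_target_tpr(scores, labels, target_tpr):
    """Highest observed score whose use as a threshold reaches target_tpr."""
    for threshold in sorted(set(scores), reverse=True):
        if tpr_at_threshold(scores, labels, threshold) >= target_tpr:
            return threshold
    raise ValueError("target_tpr is not reachable; it must be at most 1")


# Hypothetical scores and true labels for two groups.
data = {
    "a": ([0.9, 0.8, 0.7, 0.6, 0.4, 0.3], [1, 1, 1, 0, 1, 0]),
    "b": ([0.7, 0.6, 0.5, 0.4, 0.3, 0.2], [1, 1, 0, 1, 1, 0]),
}

target = 0.75
thresholds = {
    g: threshold_for_target_tpr(scores, labels, target)
    for g, (scores, labels) in data.items()
}
print(thresholds)
print({g: tpr_at_threshold(*data[g], thresholds[g]) for g in data})
```

Here group a gets a threshold of 0.7 and group b a threshold of 0.4, giving both a TPR of 0.75. A single shared threshold of 0.7 would leave group b with a TPR of 0.25. Applying the result requires knowing each person's group at decision time.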

Common benchmarks include the UCI Adult Income dataset (a 1994 census extract), the COMPAS dataset released by ProPublica, German Credit, MIMIC-III for clinical applications, the Pilot Parliaments Benchmark from Gender Shades, and CelebA. Critiques of Adult by Ding, Hardt, Miller, and Schmidt (2021) led to Folktables, drawn from the American Community Survey [36].

## How is algorithmic fairness regulated?

Governments have begun translating fairness research into binding rules.

The **[EU AI Act](/wiki/eu_ai_act)** was politically agreed in December 2023 and entered into force on August 1, 2024. Provisions phase in over the following years, with prohibitions applying six months after entry into force and most high-risk system requirements applying twenty-four months after entry into force. The Act categorizes systems by risk and imposes obligations on high-risk systems, including those in employment, education, credit scoring, law enforcement, and the administration of justice. High-risk systems must undergo conformity assessment addressing data quality, documentation, transparency, oversight, accuracy, robustness, and bias. Article 10 requires that training, validation, and testing datasets be examined for biases that may lead to discrimination [3]. The **EU General Data Protection Regulation** ([GDPR](/wiki/gdpr)), in force since May 2018, regulates automated decision-making in Article 22, granting data subjects a right against decisions based solely on automated processing with legal or similarly significant effects; Recital 71 mentions preventing discriminatory effects [37].

In the US, the **Equal Employment Opportunity Commission** issued technical assistance documents in May 2022 and May 2023 explaining how the Americans with Disabilities Act and Title VII, respectively, apply to employer use of AI tools, including testing for adverse impact [38]. **[NYC Local Law 144](/wiki/nyc_local_law_144)** took effect on January 1, 2023, with enforcement from July 5, 2023; it requires employers using automated employment decision tools for New York City residents to commission an annual independent bias audit, publish a summary, and notify candidates [4]. The **Federal Trade Commission** issued a 2021 business guidance post warning that biased algorithms in consumer decisions can violate the Equal Credit Opportunity Act and Section 5 of the FTC Act [39]. The **NIST** [AI Risk Management Framework](/wiki/nist_ai_rmf) (AI RMF 1.0), released January 26, 2023, is voluntary but widely referenced and includes fairness among the characteristics of trustworthy AI [40]. The **Algorithmic Accountability Act** has been proposed since 2019 but as of early 2026 has not been enacted. **California AB 2930** was introduced in 2024 to regulate automated decision tools, and Colorado's SB 24-205, enacted in May 2024, applies developer and deployer obligations to high-risk AI systems from February 2026.

## What are the main critiques and debates?

In "Fairness and Abstraction in Sociotechnical Systems" (2019), Andrew Selbst, danah boyd, Sorelle Friedler, Suresh Venkatasubramanian, and Janet Vertesi argue that fair machine learning research often abstracts away from the social context in which a system is embedded, treating fairness as a property of an algorithm rather than of a sociotechnical assemblage including data collection, deployment, oversight, and contestation [41]. They identify "abstraction traps," including framing, portability, and solutionism traps.

A related debate concerns process-based versus outcome-based fairness. Process-based accounts emphasize equal treatment, transparency, and the right to explanation; outcome-based accounts measure disparities in results. The two can come apart: a process that treats individuals identically can still produce uneven outcomes if relevant features are unequally distributed. The distinction between **disparate treatment** and disparate impact is closely related. US anti-discrimination law generally permits disparate impact analysis but treats explicit use of protected attributes with greater suspicion. Mitigation techniques that adjust thresholds by group raise questions under the disparate treatment doctrine, especially after *Students for Fair Admissions v. Harvard* (2023).

**Trade-offs with accuracy** are contested: some papers argue fairness and accuracy can improve together when unfairness stems from poor data quality, while others find a clearer trade-off when criteria are imposed strictly across multiple groups. **Fairness versus privacy** is a structural tension: measuring fairness across protected attributes typically requires collecting them, which conflicts with privacy norms and laws limiting data collection. Methods using [proxy sensitive attributes](/wiki/proxy_sensitive_attributes) raise their own concerns. Andrus, Spitzer, Brown, and Xiang (2021) document the challenges firms face when they cannot or will not collect race and gender data [42].

**Intersectionality** is a sustained concern. Many analyses report metrics by single attributes and miss patterns affecting specific subgroups; Gender Shades centered intersectionality by reporting error rates separately for darker-skinned women, lighter-skinned women, darker-skinned men, and lighter-skinned men [9]. **Long-term effects** of fairness interventions are a newer area. Liu, Dean, Rolf, Simchowitz, and Hardt (2018) showed that criteria optimized at one point in time can worsen disparities for the disadvantaged group when decisions affect future qualifications [43]. Mouzannar, Ohannessian, and Srebro (2019) reported similar dynamics in education and lending [44]. Hu and Kohler-Hausmann (2020) argue that social categories such as sex cannot be treated as cleanly separable from other features when constructing causal counterfactuals [45].
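
The intersectionality point can be made concrete with a small breakdown. In the sketch below, error rates computed for one attribute at a time look identical across groups, while the joint breakdown reveals that all errors fall on two subgroups. The data are synthetic and hypothetical, constructed only to show the effect; they are not the Gender Shades results, and the function name is an assumption.

```python
from collections import defaultdict


def error_rates_by(y_true, y_pred, *attribute_columns):
    """Error rate for every observed combination of the given attribute columns."""
    errors = defaultdict(int)
    totals = defaultdict(int)
    for yt, yp, key in zip(y_true, y_pred, zip(*attribute_columns)):
        totals[key] += 1
        errors[key] += int(yt != yp)
    return {key: errors[key] / totals[key] for key in totals}


# Synthetic data: ten examples per subgroup, with errors concentrated in two cells.
cells = [
    ("darker", "female", 4),
    ("darker", "male", 0),
    ("lighter", "female", 0),
    ("lighter", "male", 4),
]
y_true, y_pred, skin, gender = [], [], [], []
for skin_type, gender_label, n_errors in cells:
    for i in range(10):
        y_true.append(1)
        y_pred.append(0 if i < n_errors else 1)   # first n_errors rows are wrong
        skin.append(skin_type)
        gender.append(gender_label)

by_skin = error_rates_by(y_true, y_pred, skin)
by_gender = error_rates_by(y_true, y_pred, gender)
by_both = error_rates_by(y_true, y_pred, skin, gender)
print(by_skin)
print(by_gender)
print(by_both)
```

Each single-attribute view reports an error rate of 0.2 for every group, while the joint view shows 0.4 for two subgroups and 0.0 for the other two. In practice, subgroup sample sizes shrink quickly as attributes are crossed, so such estimates carry more uncertainty.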

## Who shaped the field, and where is it published?

Researchers who have shaped the field include [Cynthia Dwork](/wiki/cynthia_dwork), [Moritz Hardt](/wiki/moritz_hardt), [Solon Barocas](/wiki/solon_barocas), [Arvind Narayanan](/wiki/arvind_narayanan), Toniann Pitassi, Omer Reingold, Richard Zemel, [Joy Buolamwini](/wiki/joy_buolamwini), [Timnit Gebru](/wiki/timnit_gebru), Margaret Mitchell, Inioluwa Deborah Raji, Sandra Wachter, Chris Russell, Alexandra Chouldechova, [Jon Kleinberg](/wiki/jon_kleinberg), [Sendhil Mullainathan](/wiki/sendhil_mullainathan), Manish Raghavan, Sorelle Friedler, Suresh Venkatasubramanian, danah boyd, Andrew Selbst, Alex Hanna, Hanna Wallach, Aaron Roth, and Lily Hu.

The principal venue is the **[ACM Conference on Fairness, Accountability, and Transparency](/wiki/facct)** (FAccT), founded in 2018 as FAT*. The **[AAAI/ACM Conference on AI, Ethics, and Society](/wiki/aies)** (AIES) covers similar ground, and NeurIPS, ICML, and ICLR also publish fairness research. Research and advocacy centers include the **Algorithmic Justice League** founded by Buolamwini, the **Distributed AI Research Institute** ([DAIR](/wiki/dair_institute)) founded by Gebru in 2021, the **[AI Now Institute](/wiki/ai_now_institute)** at NYU, **Stanford HAI**, and the **Center for Human-Compatible AI** at Berkeley. The textbook *Fairness and Machine Learning* by Barocas, Hardt, and Narayanan, free at fairmlbook.org since 2018 and in print from MIT Press in 2023, is a standard reference [46].

## How does fairness apply to large language models?

From roughly 2022 onward, fairness research has expanded to cover foundation models and generative systems alongside earlier work on tabular and image classification.

**[Large language model](/wiki/large_language_model) fairness** is now a significant subfield. The **[BBQ benchmark](/wiki/bbq_benchmark)** (Bias Benchmark for Question Answering), released by Parrish et al. in 2022, contains over 58,000 question-answer pairs covering age, disability, gender identity, nationality, physical appearance, race, religion, socioeconomic status, and sexual orientation [47]. The **[BOLD benchmark](/wiki/bold_benchmark)** (Bias in Open-Ended Language Generation Dataset), released by Dhamala and colleagues at Amazon in 2021, contains roughly 23,000 prompts evaluating bias across profession, gender, race, religion, and political ideology [48]. StereoSet, CrowS-Pairs, RealToxicityPrompts, and HELM cover related ground.

Text-to-image models have been audited for racial and gender skew. Bianchi et al. (2023) showed that [Stable Diffusion](/wiki/stable_diffusion) amplified demographic stereotypes for prompts about professions and adjectives [49]. Bloomberg Graphics published a 2023 visual analysis reaching similar conclusions across DALL-E, Midjourney, and Stable Diffusion [50]. **Algorithmic auditing** has matured as a practice: Raji et al. (2020) proposed a framework for end-to-end internal audits, and external audits are commissioned under regimes such as NYC Local Law 144 [51]. Algorithmic fairness is not solely a technical problem and cannot be resolved by technical means alone.

## See also

[AI ethics](/wiki/ai_ethics), [algorithmic bias](/wiki/algorithmic_bias), [COMPAS recidivism](/wiki/compas_recidivism), [Gender Shades](/wiki/gender_shades), [equalized odds](/wiki/equalized_odds), [demographic parity](/wiki/demographic_parity), [calibration within groups](/wiki/calibration_within_groups), [counterfactual fairness](/wiki/counterfactual_fairness), [individual fairness](/wiki/individual_fairness), [fairness metric](/wiki/fairness_metric), [AIF360](/wiki/aif360), [Fairlearn](/wiki/fairlearn), [EU AI Act](/wiki/eu_ai_act), [GDPR](/wiki/gdpr), [NIST AI RMF](/wiki/nist_ai_rmf), [NYC Local Law 144](/wiki/nyc_local_law_144), [FAccT](/wiki/facct), [AIES](/wiki/aies), [DAIR Institute](/wiki/dair_institute), [AI Now Institute](/wiki/ai_now_institute), [BBQ benchmark](/wiki/bbq_benchmark), [BOLD benchmark](/wiki/bold_benchmark).

## References

1. Dwork, C., Hardt, M., Pitassi, T., Reingold, O., & Zemel, R. (2012). Fairness through awareness. *ITCS*. https://arxiv.org/abs/1104.3913
2. Mehrabi, N., et al. (2021). A survey on bias and fairness in machine learning. *ACM Computing Surveys* 54(6). https://arxiv.org/abs/1908.09635
3. EU. (2024). Regulation (EU) 2024/1689 (AI Act). https://eur-lex.europa.eu/eli/reg/2024/1689/oj
4. NYC DCWP. (2023). Automated Employment Decision Tools (Local Law 144). https://www.nyc.gov/site/dca/about/automated-employment-decision-tools.page
5. Griggs v. Duke Power Co., 401 U.S. 424 (1971). https://supreme.justia.com/cases/federal/us/401/424/
6. Angwin, J., Larson, J., Mattu, S., & Kirchner, L. (2016, May 23). Machine bias. *ProPublica*. https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing
7. Dieterich, W., Mendoza, C., & Brennan, T. (2016). COMPAS risk scales: Demonstrating accuracy equity and predictive parity. Northpointe. https://go.volarisgroup.com/rs/430-MBX-989/images/ProPublica_Commentary_Final_070616.pdf
8. Dastin, J. (2018, October 10). Amazon scraps secret AI recruiting tool that showed bias against women. *Reuters*. https://www.reuters.com/article/world/insight-amazon-scraps-secret-ai-recruiting-tool-that-showed-bias-against-women-idUSKCN1MK0AG/
9. Buolamwini, J., & Gebru, T. (2018). Gender shades. *FAT**, PMLR 81, 77-91. https://proceedings.mlr.press/v81/buolamwini18a.html
10. Obermeyer, Z., Powers, B., Vogeli, C., & Mullainathan, S. (2019). Dissecting racial bias in an algorithm used to manage the health of populations. *Science* 366(6464), 447-453. https://www.science.org/doi/10.1126/science.aax2342
11. NY DFS. (2021, March 23). Report on Apple Card investigation. https://www.dfs.ny.gov/system/files/documents/2021/03/rpt_202103_apple_card_investigation.pdf
12. Yee, K., Tantipongpipat, U., & Mishra, S. (2021). Image cropping on Twitter. *PACM HCI* 5(CSCW2). https://arxiv.org/abs/2105.08667
13. Grother, P., Ngan, M., & Hanaoka, K. (2019). FRVT Part 3: Demographic effects (NISTIR 8280). NIST. https://nvlpubs.nist.gov/nistpubs/ir/2019/NIST.IR.8280.pdf
14. Suresh, H., & Guttag, J. (2021). A framework for understanding sources of harm throughout the ML life cycle. *EAAMO*. https://arxiv.org/abs/1901.10002
15. Lum, K., & Isaac, W. (2016). To predict and serve? *Significance* 13(5), 14-19. https://rss.onlinelibrary.wiley.com/doi/10.1111/j.1740-9713.2016.00960.x
16. Kamiran, F., Zliobaite, I., & Calders, T. (2013). Quantifying explainable discrimination. *KAIS* 35(3), 613-644. https://link.springer.com/article/10.1007/s10115-012-0584-8
17. Hardt, M., Price, E., & Srebro, N. (2016). Equality of opportunity in supervised learning. *NeurIPS*. https://arxiv.org/abs/1610.02413
18. Chouldechova, A. (2017). Fair prediction with disparate impact. *Big Data* 5(2), 153-163. https://arxiv.org/abs/1703.00056
19. Kusner, M. J., Loftus, J. R., Russell, C., & Silva, R. (2017). Counterfactual fairness. *NeurIPS*. https://arxiv.org/abs/1703.06856
20. Berk, R., Heidari, H., Jabbari, S., Kearns, M., & Roth, A. (2018). Fairness in criminal justice risk assessments. *Sociological Methods and Research* 50(1), 3-44. https://arxiv.org/abs/1703.09207
21. Corbett-Davies, S., Pierson, E., Feller, A., Goel, S., & Huq, A. (2017). Algorithmic decision making and the cost of fairness. *KDD*. https://arxiv.org/abs/1701.08230
22. Kleinberg, J., Mullainathan, S., & Raghavan, M. (2017). Inherent trade-offs in the fair determination of risk scores. *ITCS*. https://arxiv.org/abs/1609.05807
23. Pleiss, G., Raghavan, M., Wu, F., Kleinberg, J., & Weinberger, K. Q. (2017). On fairness and calibration. *NeurIPS*. https://arxiv.org/abs/1709.02012
24. Kamiran, F., & Calders, T. (2012). Data preprocessing techniques for classification without discrimination. *KAIS* 33(1), 1-33. https://link.springer.com/article/10.1007/s10115-011-0463-8
25. Feldman, M., et al. (2015). Certifying and removing disparate impact. *KDD*. https://arxiv.org/abs/1412.3756
26. Zemel, R., et al. (2013). Learning fair representations. *ICML*. https://proceedings.mlr.press/v28/zemel13.html
27. Calmon, F., et al. (2017). Optimized pre-processing for discrimination prevention. *NeurIPS*. https://papers.nips.cc/paper/2017/hash/9a49a25d845a483fae4be7e341368e36-Abstract.html
28. Zafar, M. B., et al. (2017). Fairness constraints. *AISTATS*. https://arxiv.org/abs/1507.05259
29. Zhang, B. H., Lemoine, B., & Mitchell, M. (2018). Mitigating unwanted biases with adversarial learning. *AIES*. https://arxiv.org/abs/1801.07593
30. Kamishima, T., et al. (2012). Fairness-aware classifier with prejudice remover regularizer. *ECML PKDD*. https://link.springer.com/chapter/10.1007/978-3-642-33486-3_3
31. Agarwal, A., et al. (2018). A reductions approach to fair classification. *ICML*. https://arxiv.org/abs/1803.02453
32. Kamiran, F., Karim, A., & Zhang, X. (2012). Decision theory for discrimination-aware classification. *ICDM*. https://ieeexplore.ieee.org/document/6413831
33. Friedler, S. A., et al. (2019). A comparative study of fairness-enhancing interventions. *FAT**. https://arxiv.org/abs/1802.04422
34. Bellamy, R. K. E., et al. (2018). AI Fairness 360. *IBM JRD* 63(4/5). https://arxiv.org/abs/1810.01943
35. Bird, S., et al. (2020). Fairlearn: A toolkit for assessing and improving fairness in AI (MSR-TR-2020-32). https://www.microsoft.com/en-us/research/publication/fairlearn-a-toolkit-for-assessing-and-improving-fairness-in-ai/
36. Ding, F., Hardt, M., Miller, J., & Schmidt, L. (2021). Retiring Adult. *NeurIPS*. https://arxiv.org/abs/2108.04884
37. EU. (2016). Regulation 2016/679 (GDPR), Article 22. https://gdpr-info.eu/art-22-gdpr/
38. US EEOC. (2023). Assessing adverse impact in software, algorithms, and AI under Title VII. https://www.eeoc.gov/laws/guidance/select-issues-assessing-adverse-impact-software-algorithms-and-artificial
39. Jillson, E. (2021, April 19). Aiming for truth, fairness, and equity in your company's use of AI. US FTC. https://www.ftc.gov/business-guidance/blog/2021/04/aiming-truth-fairness-equity-your-companys-use-ai
40. NIST. (2023, January 26). AI Risk Management Framework (AI RMF 1.0). https://nvlpubs.nist.gov/nistpubs/ai/NIST.AI.100-1.pdf
41. Selbst, A. D., et al. (2019). Fairness and abstraction in sociotechnical systems. *FAT**. https://dl.acm.org/doi/10.1145/3287560.3287598
42. Andrus, M., Spitzer, E., Brown, J., & Xiang, A. (2021). What we can't measure, we can't understand. *FAccT*. https://arxiv.org/abs/2011.02282
43. Liu, L. T., Dean, S., Rolf, E., Simchowitz, M., & Hardt, M. (2018). Delayed impact of fair machine learning. *ICML*. https://arxiv.org/abs/1803.04383
44. Mouzannar, H., Ohannessian, M. I., & Srebro, N. (2019). From fair decision making to social equality. *FAT**. https://arxiv.org/abs/1812.02952
45. Hu, L., & Kohler-Hausmann, I. (2020). What's sex got to do with machine learning? *FAT**. https://dl.acm.org/doi/10.1145/3351095.3375674
46. Barocas, S., Hardt, M., & Narayanan, A. (2023). *Fairness and Machine Learning*. MIT Press. https://fairmlbook.org/
47. Parrish, A., et al. (2022). BBQ: A hand-built bias benchmark. *Findings of ACL*. https://arxiv.org/abs/2110.08193
48. Dhamala, J., et al. (2021). BOLD: Dataset and metrics for measuring biases in open-ended language generation. *FAccT*. https://arxiv.org/abs/2101.11718
49. Bianchi, F., et al. (2023). Easily accessible text-to-image generation amplifies demographic stereotypes. *FAccT*. https://arxiv.org/abs/2211.03759
50. Nicoletti, L., & Bass, D. (2023, June 9). Humans are biased. Generative AI is even worse. *Bloomberg*. https://www.bloomberg.com/graphics/2023-generative-ai-bias/
51. Raji, I. D., et al. (2020). Closing the AI accountability gap. *FAT**. https://arxiv.org/abs/2001.00973
