Disparate treatment in machine learning refers to the practice of treating individuals differently based on protected attributes such as race, gender, age, religion, national origin, or disability within algorithmic decision-making systems. It is the computational analog of intentional discrimination in civil rights law. Unlike disparate impact, which addresses neutral policies that produce unequal outcomes, disparate treatment involves decisions that explicitly or implicitly rely on protected characteristics to differentiate between individuals.
As automated systems increasingly govern hiring, lending, criminal sentencing, healthcare allocation, and housing, the concept of disparate treatment has become central to the study of AI fairness. Understanding how machine learning models can perpetuate or introduce intentional discrimination is essential for building equitable AI systems and for ensuring compliance with anti-discrimination law.
Imagine you are picking kids for a soccer team. If you pick players based on how fast they run and how well they kick, that is fair. But if you say "I don't want any kids with red hair on my team" and leave them out just because of their hair color, that is disparate treatment. You are treating some kids differently for a reason that has nothing to do with soccer.
Computers can do the same thing. If a computer program is deciding who gets a job or a loan, and it secretly looks at whether someone is a man or a woman (or uses clues that tell it someone's race), and then gives worse results to one group, that is disparate treatment. The computer is making unfair decisions based on who people are, not on what they can do.
Disparate treatment is a legal doctrine rooted in United States anti-discrimination law, particularly Title VII of the Civil Rights Act of 1964. It has since been extended through additional statutes and applied to algorithmic decision-making contexts.
| Statute | Year | Scope | Protected characteristics |
|---|---|---|---|
| Civil Rights Act, Title VII | 1964 | Employment | Race, color, religion, sex, national origin |
| Age Discrimination in Employment Act (ADEA) | 1967 | Employment | Age (40 and older) |
| Fair Housing Act (FHA) | 1968 | Housing, lending | Race, color, religion, sex, national origin, familial status, disability |
| Equal Credit Opportunity Act (ECOA) | 1974 | Credit decisions | Race, color, religion, national origin, sex, marital status, age, public assistance status |
| Americans with Disabilities Act (ADA) | 1990 | Employment, public services | Disability |
| Genetic Information Nondiscrimination Act (GINA) | 2008 | Employment, health insurance | Genetic information |
The legal standard for proving disparate treatment in cases lacking direct evidence of discrimination was established by the U.S. Supreme Court in McDonnell Douglas Corp. v. Green (1973). The framework consists of three steps:
1. Prima facie case. The plaintiff must demonstrate that (a) they belong to a protected class, (b) they were qualified for the position or benefit, (c) they suffered an adverse action, and (d) the adverse action occurred under circumstances suggesting discrimination. The burden at this stage is minimal, often described as "de minimis."
2. Employer's rebuttal. Once the prima facie case is established, the burden shifts to the defendant to articulate a legitimate, nondiscriminatory reason for the adverse action.
3. Proving pretext. If the defendant provides such a reason, the burden returns to the plaintiff to show that the stated reason is a pretext for discrimination, meaning the true motivation was discriminatory intent.
This framework was designed for human decision-makers, and its application to algorithmic systems presents significant challenges. Machines do not possess intent in the human sense; they execute instructions encoded by their designers. As a result, courts have had to adapt their analysis when evaluating whether an AI system engaged in disparate treatment.
Disparate treatment and disparate impact are the two primary legal theories for addressing discrimination, including in AI contexts. They differ in what must be proven and how liability is established.
| Dimension | Disparate treatment | Disparate impact |
|---|---|---|
| Core concept | Intentional discrimination based on protected attributes | Neutral practice that disproportionately harms a protected group |
| Intent required | Yes | No |
| Key question | Did the decision-maker treat someone differently because of a protected characteristic? | Does the practice produce significantly different outcomes for a protected group? |
| Burden of proof | Plaintiff must show discriminatory intent | Plaintiff must show statistical disparity; defendant must justify business necessity |
| AI application | Algorithm explicitly uses or infers protected attributes | Algorithm uses neutral features that correlate with protected attributes |
| Example | Loan algorithm penalizes applicants from certain zip codes known to correlate with race | Resume screener trained on historically biased data rejects more female applicants |
| Legal standard | McDonnell Douglas burden-shifting | Griggs v. Duke Power (1971) three-part test |
In the context of AI, disparate treatment is often harder to prove than disparate impact because algorithmic systems lack human intent. However, disparate treatment can still be established when a system explicitly incorporates protected attributes, when developers knowingly deploy a discriminatory system, or when proxy variables serve as stand-ins for protected characteristics.
Disparate treatment in ML systems can emerge through several mechanisms, ranging from explicit use of protected attributes to more subtle forms of proxy discrimination.
The most straightforward form of disparate treatment occurs when a model directly uses protected attributes (race, gender, age) as input features. For example, a classification model that takes "gender" as an input variable and assigns different scores to male and female applicants engages in disparate treatment by design. While this form is relatively easy to detect and prevent, it still occurs in practice, particularly in legacy systems or when developers are unaware of legal requirements.
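A minimal sketch of how this form of disparate treatment can be surfaced is a counterfactual flip test: train a model that (improperly) includes the protected attribute, then score the same applicant twice with only that attribute changed. The synthetic data, feature names, and model below are illustrative assumptions, not a reference implementation.

```python
# Sketch: surfacing explicit disparate treatment with a counterfactual flip test.
# The data, feature names, and model here are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5_000
experience = rng.normal(5, 2, n)           # legitimate, task-relevant feature
gender = rng.integers(0, 2, n)             # protected attribute (0/1)
# Biased historical labels: the outcome depends on gender as well as experience.
y = (experience + 1.5 * gender + rng.normal(0, 1, n) > 6).astype(int)

X = np.column_stack([experience, gender])  # the model improperly sees gender
model = LogisticRegression().fit(X, y)

# Counterfactual test: identical applicant, only the protected attribute flipped.
applicant = np.array([[5.0, 0.0], [5.0, 1.0]])
scores = model.predict_proba(applicant)[:, 1]
print(f"score if gender=0: {scores[0]:.3f}, score if gender=1: {scores[1]:.3f}")
# A non-trivial gap means otherwise identical individuals are treated
# differently because of the protected attribute.
```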
Even when protected attributes are excluded from a model's input features, machine learning algorithms can learn to infer them through correlated variables known as proxy variables. Common proxies include zip code or neighborhood (which can correlate with race due to residential segregation), first and last names (which can correlate with gender and ethnicity), and affiliations such as membership in gendered organizations or attendance at single-sex schools.
Proxy discrimination is especially challenging because removing the most obvious proxies does not solve the problem. Machine learning models, particularly deep learning systems, can discover non-linear combinations of seemingly neutral features that reconstruct protected attributes with high accuracy. Simply denying a model access to intuitive proxies causes it to locate less intuitive ones.
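One common diagnostic for proxy discrimination, sketched below on assumed synthetic data, is to check whether the protected attribute can be predicted from the supposedly neutral features: if a simple auditing classifier recovers it well above the base rate, those features jointly act as a proxy.

```python
# Sketch: proxy audit — can the protected attribute be reconstructed from the
# "neutral" features? The synthetic data and column names are assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n = 5_000
race = rng.integers(0, 2, n)                     # protected attribute (held out)
zip_income = rng.normal(50 + 15 * race, 5, n)    # proxy: correlated with race
commute_time = rng.normal(30 - 8 * race, 6, n)   # proxy: correlated with race
shoe_size = rng.normal(9, 1, n)                  # genuinely unrelated feature

X_neutral = np.column_stack([zip_income, commute_time, shoe_size])
auditor = LogisticRegression(max_iter=1_000)
acc = cross_val_score(auditor, X_neutral, race, cv=5).mean()
print(f"protected attribute recoverable with accuracy ~{acc:.2f}")
# Accuracy well above the base rate (~0.5 here) means a downstream model can
# implicitly use race even though the race column itself was removed.
```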
When training data reflects historical patterns of discrimination, models learn to replicate those patterns. If a hiring model is trained on a dataset of past hiring decisions where certain demographic groups were systematically disadvantaged, the model will learn to associate characteristics of those groups with negative outcomes. This can constitute disparate treatment when the model effectively learns to use protected attributes (directly or through proxies) as a basis for its predictions.
Large pretrained models such as language models and vision models can encode societal stereotypes from their training corpora. When these models are used as components in downstream decision-making systems (for example, as embedding layers in a resume screening tool), the encoded biases can propagate into final decisions. Research has shown that word embeddings trained on large text corpora associate certain professions with specific genders and associate negative attributes with particular racial groups.
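One way such encoded associations can be probed, sketched below in the spirit of word-embedding association tests, is to compare how strongly occupation words align with gendered words. The specific pretrained model name is an assumption; any downloadable embedding containing these vocabulary items would serve.

```python
# Sketch: probing a pretrained word embedding for gendered occupation
# associations (WEAT-style). The model name is an assumption.
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-100")   # downloads on first use

def gender_lean(word, vectors):
    """Positive => closer to 'she', negative => closer to 'he'."""
    return vectors.similarity(word, "she") - vectors.similarity(word, "he")

for occupation in ["nurse", "engineer", "receptionist", "programmer"]:
    print(f"{occupation:>14}: {gender_lean(occupation, vectors):+.3f}")
# Systematic sign differences across occupations indicate that the embedding
# itself encodes the stereotype, which a downstream screener can inherit.
```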
Disparate treatment can be amplified through feedback loops. If a biased model denies opportunities to members of a protected group, the resulting data (showing worse outcomes for that group) reinforces the model's discriminatory patterns in subsequent retraining cycles. Over time, the bias compounds, creating a self-reinforcing cycle of discrimination.
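A toy simulation of this dynamic is sketched below; all numbers are illustrative assumptions, chosen only to show how a small initial penalty can compound once denied applicants stop generating positive training examples.

```python
# Toy simulation of a feedback loop: a small initial scoring penalty against
# group B lowers its approval rate, the resulting outcome data looks worse for
# group B, and each retraining widens the gap. All numbers are illustrative.
true_quality = 0.6     # both groups are equally qualified
penalty = 0.05         # small initial bias against group B

for round_ in range(1, 6):
    approve_a = true_quality
    approve_b = max(true_quality - penalty, 0.0)
    # Only approved applicants can generate positive outcomes, so the observed
    # success rate of group B drops even though its true quality is unchanged.
    observed_gap = (approve_a - approve_b) * true_quality
    penalty = 0.5 * penalty + observed_gap      # retraining absorbs the gap
    print(f"round {round_}: approval A={approve_a:.2f}, "
          f"B={approve_b:.2f}, next penalty={penalty:.3f}")
```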
Several high-profile cases have demonstrated how disparate treatment manifests in deployed AI systems.
In 2014, Amazon developed an automated resume screening system to evaluate job applicants. The system was trained on resumes submitted to the company over the preceding decade. Because Amazon's engineering workforce was predominantly male, the model learned to penalize resumes containing indicators of female applicants. Specifically, the algorithm downgraded resumes that included the word "women's" (as in "women's chess club captain") and penalized graduates of certain all-women's colleges. It also favored verbs like "executed" and "captured," which appeared more frequently on male engineers' resumes. Amazon attempted to correct the bias but ultimately concluded that the system could not be prevented from finding new proxies for gender, and the project was abandoned in 2017.
In June 2022, the U.S. Department of Justice reached a settlement with Meta Platforms (formerly Facebook) over allegations that Meta's housing ad delivery system violated the Fair Housing Act. The DOJ alleged that Meta's machine learning algorithms used protected characteristics, including race, color, religion, sex, disability, familial status, and national origin, to determine which subset of an advertiser's audience would actually receive housing advertisements. Under the settlement, Meta was required to pay the maximum penalty of $115,054 under the FHA and to develop a new ad delivery system that did not rely on protected characteristics. An independent third-party reviewer was appointed to verify compliance.
The Correctional Offender Management Profiling for Alternative Sanctions (COMPAS) system, developed by Northpointe (now Equivant), is a risk assessment tool used in 46 U.S. states to predict whether criminal defendants are likely to reoffend. In 2016, ProPublica analyzed over 10,000 criminal defendants in Broward County, Florida, and found that Black defendants were 77% more likely to be falsely labeled as high risk for violent recidivism and 45% more likely to be incorrectly flagged for any future crime. White defendants, conversely, were more likely to be falsely labeled as low risk.
Northpointe countered that COMPAS was calibrated: defendants who received the same risk score reoffended at roughly the same rates regardless of race, and the tool's overall predictive accuracy, approximately 60%, was similar for Black and white defendants. This dispute highlighted a fundamental tension in algorithmic fairness, often called the "impossibility of fairness" theorem: calibration, equal false positive rates, and equal false negative rates cannot all be satisfied simultaneously when base rates differ between groups.
A 2019 study published in Science by Obermeyer et al. revealed that a healthcare risk prediction algorithm developed by Optum and used by hospital systems across the United States systematically underestimated the healthcare needs of Black patients. The algorithm used healthcare costs as a proxy for health needs, but because Black patients historically spent less on healthcare (due to unequal access, not better health), the system assigned them lower risk scores. The study estimated that this bias affected approximately 200 million people annually. After the study, a revised algorithm that incorporated direct health predictions alongside cost data reduced the bias by 84%.
In July 2024, a federal court in Mobley v. Workday ruled that AI vendors can be held directly liable for employment discrimination under an agency theory. The plaintiff alleged that Workday's AI-powered applicant screening tools discriminated on the basis of race, age, and disability. The court dismissed the disparate treatment claim for insufficient evidence of discriminatory intent but allowed disparate impact claims under Title VII, the ADEA, and the ADA to proceed. This ruling was significant because it established that third-party AI vendors, not just employers, can face liability for discriminatory outcomes produced by their tools.
The machine learning fairness literature has developed several formal concepts that relate directly to disparate treatment.
Fairness through unawareness (FTU) is the simplest approach to preventing disparate treatment: exclude protected attributes from the model's input features. In principle, if the model never "sees" race or gender, it cannot discriminate based on those characteristics. However, FTU is widely regarded as insufficient because models can infer protected attributes from proxy variables. Even when sensitive attributes are removed, algorithmic models may still learn such information through complex non-linear relationships in the data, perpetuating or amplifying systemic bias.
Fairness through awareness (FTA), proposed by Dwork et al. (2012), takes the opposite approach. Rather than ignoring protected attributes, FTA explicitly accounts for them. It defines a task-specific similarity metric that determines how similar two individuals are with respect to a particular decision. The fairness constraint requires that similar individuals receive similar treatment. This approach directly operationalizes the legal standard of treating "similarly situated individuals" equivalently.
Individual fairness requires that any two individuals who are similar with respect to a given task should receive similar predictions from the model. This concept is closely aligned with the legal notion of disparate treatment, which focuses on whether specific individuals were treated differently because of their protected attributes. The challenge lies in defining a meaningful similarity metric that captures task-relevant characteristics without incorporating protected attributes.
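A sketch of how the Dwork-style constraint might be audited empirically is below: sample pairs of individuals and flag cases where the score gap exceeds the task-specific distance (a Lipschitz violation). The distance function, scoring function, and data are all assumptions for illustration.

```python
# Sketch of an individual-fairness (Lipschitz) audit in the spirit of
# Dwork et al. (2012): |f(x) - f(y)| should not exceed the task-specific
# distance d(x, y). Distance, model, and data below are assumptions.
import numpy as np

def task_distance(x, y):
    # Assumed similarity metric over task-relevant features only
    # (the protected attribute in column 2 is deliberately excluded from d).
    return np.abs(x - y)[:2].sum() / 2.0

def audit_individual_fairness(model_score, X, n_pairs=1_000, seed=0):
    rng = np.random.default_rng(seed)
    violations = 0
    for _ in range(n_pairs):
        i, j = rng.integers(0, len(X), size=2)
        gap = abs(model_score(X[i]) - model_score(X[j]))
        if gap > task_distance(X[i], X[j]):
            violations += 1
    return violations / n_pairs

# Toy scorer that leaks the protected attribute (column 2):
X = np.random.default_rng(3).normal(size=(500, 3))
leaky_score = lambda x: 0.5 * x[0] + 0.5 * x[1] + 0.8 * x[2]
print(f"violation rate: {audit_individual_fairness(leaky_score, X):.2%}")
```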
Group fairness metrics evaluate whether a model's outcomes are equitable across demographic groups. While these metrics are more commonly associated with disparate impact analysis, they can also help detect patterns consistent with disparate treatment.
| Metric | Definition | Connection to disparate treatment |
|---|---|---|
| Demographic parity | Equal selection rates across groups | Violations may indicate group-level differential treatment |
| Equalized odds | Equal true positive and false positive rates across groups | Unequal error rates may suggest the model treats groups differently |
| Calibration | Predicted probabilities reflect actual outcomes equally across groups | Miscalibration across groups may reflect biased feature reliance |
| Predictive parity | Equal positive predictive values across groups | Differences may indicate the model is less reliable for certain groups |
A key theoretical result, already visible in the COMPAS dispute, is the impossibility theorem: calibration, equal false positive rates, and equal false negative rates cannot all be satisfied simultaneously except in trivial cases (a perfect predictor or equal base rates across groups). This means practitioners must choose which fairness criteria to prioritize based on the specific application and its legal and ethical requirements.
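The sketch below shows one way the metrics in the table above might be computed from a set of predictions; the column names and toy data are assumptions.

```python
# Sketch: computing group fairness metrics from predictions.
# Column names and the example data are assumptions.
import pandas as pd

def group_fairness_report(df, group_col="group", label_col="y", pred_col="y_hat"):
    rows = {}
    for g, sub in df.groupby(group_col):
        tp = ((sub[pred_col] == 1) & (sub[label_col] == 1)).sum()
        fp = ((sub[pred_col] == 1) & (sub[label_col] == 0)).sum()
        fn = ((sub[pred_col] == 0) & (sub[label_col] == 1)).sum()
        tn = ((sub[pred_col] == 0) & (sub[label_col] == 0)).sum()
        rows[g] = {
            "selection_rate": (tp + fp) / len(sub),                   # demographic parity
            "tpr": tp / (tp + fn) if tp + fn else float("nan"),       # equalized odds
            "fpr": fp / (fp + tn) if fp + tn else float("nan"),       # equalized odds
            "ppv": tp / (tp + fp) if tp + fp else float("nan"),       # predictive parity
        }
    return pd.DataFrame(rows).T

df = pd.DataFrame({
    "group": ["A"] * 4 + ["B"] * 4,
    "y":     [1, 1, 0, 0, 1, 1, 0, 0],
    "y_hat": [1, 1, 1, 0, 1, 0, 0, 0],
})
print(group_fairness_report(df))
```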
Researchers and practitioners have developed a range of techniques to detect, measure, and mitigate disparate treatment in machine learning systems. These methods are typically categorized by when they are applied in the ML pipeline.
Pre-processing techniques modify the training data before model training to reduce bias.
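One widely cited pre-processing approach is reweighing in the style of Kamiran and Calders: each (group, label) combination is weighted by P(group)·P(label) / P(group, label), which makes the protected attribute and the label statistically independent in the weighted training set. The sketch below assumes simple column names.

```python
# Sketch of reweighing (Kamiran & Calders) as a pre-processing step.
# Column names and the toy data are assumptions.
import pandas as pd

def reweigh(df, group_col="group", label_col="y"):
    p_group = df[group_col].value_counts(normalize=True)
    p_label = df[label_col].value_counts(normalize=True)
    p_joint = df.groupby([group_col, label_col]).size() / len(df)

    def weight(row):
        g, y = row[group_col], row[label_col]
        return p_group[g] * p_label[y] / p_joint[(g, y)]

    return df.assign(sample_weight=df.apply(weight, axis=1))

df = pd.DataFrame({"group": ["A", "A", "A", "B", "B", "B"],
                   "y":     [1,   1,   0,   1,   0,   0]})
print(reweigh(df))
# The resulting sample_weight column can be passed to most estimators via
# fit(X, y, sample_weight=...).
```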
In-processing techniques incorporate fairness directly into the model training process.
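As a sketch of the in-processing idea, the example below adds a demographic-parity penalty (the squared gap between group mean predictions) to a plain logistic-regression loss and minimizes the combined objective; the penalty form, its weight, and the synthetic data are assumptions rather than any particular library's method.

```python
# Sketch of an in-processing approach: logistic loss plus a demographic-parity
# penalty, minimized with a generic optimizer. Penalty form and weight are
# assumptions for illustration.
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

def fair_logreg(X, y, group, fairness_weight=5.0):
    def objective(w):
        p = expit(X @ w)
        log_loss = -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
        parity_gap = p[group == 1].mean() - p[group == 0].mean()
        return log_loss + fairness_weight * parity_gap ** 2
    result = minimize(objective, x0=np.zeros(X.shape[1]), method="L-BFGS-B")
    return result.x

# Toy data where one feature is a proxy for the group.
rng = np.random.default_rng(4)
n = 2_000
group = rng.integers(0, 2, n)
skill = rng.normal(0, 1, n)
proxy = group + rng.normal(0, 0.3, n)
X = np.column_stack([np.ones(n), skill, proxy])
y = (skill + 0.8 * group + rng.normal(0, 1, n) > 0.4).astype(int)

w = fair_logreg(X, y, group)
print("learned weights (intercept, skill, proxy):", np.round(w, 2))
# With a high fairness_weight the optimizer is pushed to keep group-average
# scores equal, which shrinks the coefficient on the proxy feature.
```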
Post-processing techniques adjust the model's outputs after training to achieve fairness.
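A simple post-processing sketch appears below: group-specific thresholds are chosen so the selection rate is roughly equal across groups. Note that explicitly conditioning the decision rule on group membership can itself raise disparate treatment concerns, so this is illustrative only; the target rate and data are assumptions.

```python
# Sketch of a post-processing step: per-group thresholds that roughly equalize
# selection rates. Target rate and data are illustrative assumptions.
import numpy as np

def equalize_selection_rates(scores, group, target_rate=0.3):
    """Return per-group thresholds achieving roughly the same selection rate."""
    thresholds = {}
    for g in np.unique(group):
        g_scores = scores[group == g]
        # The (1 - target_rate) quantile selects ~target_rate of the group.
        thresholds[g] = np.quantile(g_scores, 1 - target_rate)
    return thresholds

rng = np.random.default_rng(5)
group = rng.integers(0, 2, 1_000)
scores = rng.normal(0.5 + 0.1 * group, 0.15, 1_000)   # group 1 scores higher

thresholds = equalize_selection_rates(scores, group)
for g, t in thresholds.items():
    rate = (scores[group == g] >= t).mean()
    print(f"group {g}: threshold={t:.3f}, selection rate={rate:.2f}")
```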
Governments and regulatory bodies around the world have begun addressing algorithmic discrimination through legislation and guidance.
The U.S. relies primarily on existing civil rights statutes (Title VII, FHA, ECOA, ADA, ADEA) to address algorithmic discrimination. Key regulatory developments include Equal Employment Opportunity Commission guidance on the use of algorithmic tools under the ADA and Title VII, Consumer Financial Protection Bureau guidance on adverse-action notices for credit decisions made by complex algorithms, and state and local measures such as New York City's Local Law 144, which requires bias audits of automated employment decision tools.
The EU AI Act, which entered into force on August 1, 2024, with full applicability by August 2, 2026, establishes the world's first comprehensive legal framework for AI regulation. It classifies AI systems used in areas such as employment, credit scoring, education, and access to essential services as high-risk, subjecting them to requirements on data governance (including measures to detect and mitigate bias), transparency, and human oversight.
Canada's Artificial Intelligence and Data Act (AIDA), proposed as part of Bill C-27, would require impact assessments for high-impact AI systems and impose obligations to mitigate risks of bias. Brazil, China, and several other countries have also introduced or proposed AI-specific legislation that addresses algorithmic discrimination.
Several conceptual and practical challenges complicate the application of the disparate treatment framework to machine learning systems.
Traditional disparate treatment law requires proof of discriminatory intent. Machines do not have intent; they execute mathematical operations on data. This creates a fundamental mismatch between the legal doctrine and the technology it must regulate. Courts have begun adapting by looking at the intent of the system's designers and operators rather than the system itself. If a company knows that its AI system produces discriminatory outcomes and continues to use it, courts may infer discriminatory intent.
Many modern ML models, particularly deep neural networks, operate as "black boxes" whose internal decision-making processes are difficult to interpret. This opacity makes it hard to determine whether a model is relying on protected attributes or their proxies. While interpretability techniques can provide some insight, they often offer approximate rather than definitive explanations of model behavior.
Proxy variables are pervasive in real-world data. Because many features in a typical dataset are correlated with protected attributes, it is extremely difficult to construct a model that is both accurate and completely free from proxy discrimination. Removing known proxies simply causes models to find less obvious ones through complex feature interactions. This creates a fundamental tension between predictive accuracy and fairness.
As demonstrated by the COMPAS case, different mathematical definitions of fairness can conflict with one another. A model that is calibrated across groups may still have unequal false positive rates. A model that achieves demographic parity may sacrifice calibration. Practitioners must make value judgments about which fairness criteria to prioritize, and those judgments may differ depending on the application domain, the stakes involved, and the legal requirements that apply.
Individuals belong to multiple protected groups simultaneously (for example, a person who is both Black and female). Disparate treatment analyses that consider only one protected attribute at a time may miss discrimination that affects intersectional groups. A model may treat Black applicants fairly on average and treat female applicants fairly on average, while still systematically disadvantaging Black women. Detecting and mitigating intersectional discrimination requires examining outcomes across combinations of protected attributes, which increases the complexity of fairness analysis considerably.
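The sketch below illustrates the point with assumed column names and toy data: marginal selection rates can look balanced for each attribute separately while one intersectional subgroup is severely disadvantaged.

```python
# Sketch: intersectional analysis over combinations of protected attributes.
# Column names and data are illustrative assumptions.
import pandas as pd

df = pd.DataFrame({
    "race":   ["Black"] * 4 + ["white"] * 4,
    "gender": ["F", "F", "M", "M"] * 2,
    "hired":  [0, 0, 1, 1,   1, 1, 0, 0],
})

# Marginal rates are equal (0.5 for each race and each gender) ...
print(df.groupby("race")["hired"].mean())
print(df.groupby("gender")["hired"].mean())
# ... yet the intersectional breakdown shows Black women are hired at 0%.
print(df.groupby(["race", "gender"])["hired"].mean())
```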
Organizations developing or deploying AI systems can take several steps to reduce the risk of disparate treatment.