Fairness through unawareness (FTU), also called anti-classification or blindness, is a fairness criterion in machine learning that prescribes excluding sensitive attributes (such as race, gender, age, or religion) from the input features of a predictive model. The core assumption is that if a decision-making system never sees protected characteristics, it cannot discriminate on the basis of those characteristics. While the idea is intuitive, it is widely regarded by researchers as insufficient for achieving fairness in practice, because other features in the data often encode the same information that the removed attribute carried.
The term was popularized by Cynthia Dwork, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Richard Zemel in their 2012 paper "Fairness Through Awareness," which critiqued the unawareness approach and proposed an alternative framework based on individual fairness metrics.
Let $X$ denote the set of non-sensitive input features, $A$ denote the sensitive attribute (for example, race or gender), and $\hat{Y} = f(X, A)$ denote the output of a classifier. Fairness through unawareness requires that the classifier does not use $A$ as an input. Formally, a classifier $f$ satisfies FTU if and only if:
$$f(X, A) = f(X, A') \quad \text{for all } X \text{ and all values } A, A'$$
In other words, two individuals who are identical in all non-sensitive features $X$ but differ in their sensitive attribute $A$ must receive the same prediction. This is equivalent to saying that $A$ is not an argument to the function $f$, so the classifier can be written simply as $f(X)$.
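The definition translates directly into code: satisfying FTU amounts to never passing the sensitive column to the model. The sketch below is illustrative only; the column names ("gender", "label"), the use of scikit-learn logistic regression, and the assumption of numeric features are all choices made for the example, not part of any standard API.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Illustrative column names: "gender" plays the role of the sensitive
# attribute A, "label" is the target Y, every other (numeric) column is X.
SENSITIVE = "gender"
TARGET = "label"

def fit_ftu_classifier(df: pd.DataFrame) -> LogisticRegression:
    """Train a classifier that satisfies FTU by construction:
    the sensitive attribute is simply never passed to the model."""
    X = df.drop(columns=[SENSITIVE, TARGET])  # f is a function of X only
    y = df[TARGET]
    return LogisticRegression(max_iter=1000).fit(X, y)

def ftu_predict(model: LogisticRegression, df: pd.DataFrame):
    """Two rows that agree on X but differ in A necessarily receive
    the same prediction, because A is dropped before predicting."""
    return model.predict(df.drop(columns=[SENSITIVE, TARGET], errors="ignore"))
```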
Corbett-Davies and Goel (2018) formalized this under the name "anti-classification," defining it as the requirement that protected attributes and their proxies are not explicitly used to make decisions.
The appeal of fairness through unawareness stems from several practical and legal considerations: it is simple to implement, it is easy to audit (one need only verify that the protected attribute is absent from the feature set), and it aligns with the disparate treatment doctrine's prohibition on the explicit use of protected characteristics in decisions.
Implementing FTU typically involves two steps: identifying which attributes are legally or ethically protected in the deployment context, and excluding those attributes from the feature set used for training and prediction.
Some stricter implementations also attempt to identify and remove features that are highly correlated with the sensitive attribute (known as proxies), though this introduces its own challenges around information loss and feature selection.
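A minimal sketch of both variants follows, assuming a pandas DataFrame in which the sensitive attributes and candidate proxies are already numerically encoded; the 0.8 correlation threshold is an arbitrary illustrative choice, and correlation-based screening only catches linear, pairwise proxies.

```python
import pandas as pd

def drop_sensitive(df: pd.DataFrame, sensitive: list) -> pd.DataFrame:
    """Basic FTU: remove the protected attributes themselves."""
    return df.drop(columns=sensitive)

def drop_sensitive_and_proxies(df: pd.DataFrame, sensitive: list,
                               threshold: float = 0.8) -> pd.DataFrame:
    """Stricter variant: also remove numeric features whose absolute
    correlation with any sensitive attribute exceeds the threshold.
    This discards information and only catches linear, pairwise proxies."""
    corr = df.corr(numeric_only=True).abs()
    numeric_sensitive = [s for s in sensitive if s in corr.columns]
    proxies = [
        col for col in corr.columns
        if col not in sensitive
        and any(corr.loc[col, s] > threshold for s in numeric_sensitive)
    ]
    return df.drop(columns=list(sensitive) + proxies)
```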
Imagine you are a teacher picking teams for a game. You know that some of your students are wearing red shirts and some are wearing blue shirts. If you pick teams based on shirt color, that is not fair. So you close your eyes and pick without looking at shirts. That is "fairness through unawareness." You are being fair by ignoring shirt color.
But here is the problem: what if students wearing red shirts are all on the left side of the room and students wearing blue shirts are all on the right? Even with your eyes closed, if you pick everyone from the left side, you end up picking only red-shirt students anyway. The seating arrangement acts as a secret clue (a "proxy") for shirt color. This is why just closing your eyes does not always work.
Fairness through unawareness has been criticized extensively in the academic literature. The consensus is that, at best, it is a necessary but not sufficient condition for fair decision-making, and some researchers argue it is not even necessary, since accurate and equitable predictions can sometimes require explicit use of the sensitive attribute.
The most fundamental limitation of FTU is proxy discrimination. Sensitive attributes are rarely independent of other features. In real-world data, many non-sensitive features are correlated with protected characteristics, and a sufficiently powerful model can reconstruct the sensitive attribute from these correlated features.
Dwork et al. (2012) introduced the concept of redundant encoding to describe this phenomenon: even after removing a sensitive attribute, its information remains encoded across multiple other features. As stated in the "Fairness and Machine Learning" textbook by Barocas, Hardt, and Narayanan: "sensitive attributes are generally redundant given the other features. The classifier will then find a redundant encoding in terms of the other features."
For example, in the United States:
| Proxy feature | Correlated sensitive attribute | Mechanism |
|---|---|---|
| ZIP code / neighborhood | Race, ethnicity | Residential segregation means geographic location is a strong predictor of racial demographics |
| First name or surname | Race, ethnicity, gender | Naming patterns differ systematically across demographic groups |
| Occupation | Gender | Occupational segregation persists across industries |
| Alma mater | Race, socioeconomic status | College attendance patterns correlate with demographic background |
| Browsing history or purchasing patterns | Multiple attributes | Consumer behavior reflects demographic differences |
| Language or dialect features | Ethnicity, national origin | Linguistic patterns vary across demographic groups |
The proxy problem means that a model can effectively discriminate along protected dimensions without ever receiving the protected attribute as a direct input. Multiple weakly correlated features can combine to reconstruct the protected attribute with high accuracy.
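A common diagnostic for redundant encoding is to train an auxiliary model to predict the removed attribute from the remaining features; if it recovers $A$ well above chance, proxies are present. A minimal sketch, assuming a binary-coded sensitive attribute and numeric features (the column name and model choice are illustrative):

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

def proxy_audit(df: pd.DataFrame, sensitive: str = "race") -> float:
    """Estimate how recoverable the sensitive attribute is from the
    remaining features. A cross-validated AUC well above 0.5 signals
    a redundant encoding (assumes a binary-coded attribute and
    numeric features)."""
    X = df.drop(columns=[sensitive])   # the features an FTU model would see
    a = df[sensitive]                  # the attribute FTU removed
    auc = cross_val_score(GradientBoostingClassifier(), X, a,
                          cv=5, scoring="roc_auc")
    return auc.mean()
```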
Removing sensitive attributes can harm predictive accuracy. When the sensitive attribute carries legitimate predictive information (i.e., information that is relevant to the outcome for non-discriminatory reasons), excluding it forces the model to make predictions without useful signal.
Corbett-Davies and Goel (2018) demonstrated this with a recidivism prediction example. Because women have substantially lower base rates of reoffending than men, a gender-blind model systematically overestimates recidivism risk for women and underestimates it for men. Excluding gender in this context actually creates a disparate impact on women by assigning them unfairly high risk scores. This is why some criminal justice jurisdictions explicitly allow gender-specific risk assessment tools.
More generally, when base rates for the outcome of interest differ across demographic groups, removing the group attribute can cause the model to produce systematically biased predictions for one or more groups, a phenomenon sometimes referred to as infra-marginality.
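The effect is easy to reproduce on synthetic data: when two groups differ only in base rate, a group-blind predictor can at best learn the pooled rate, which overstates risk for the lower-base-rate group and understates it for the other. The numbers in the sketch below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Invented base rates: group 0 reoffends at 30%, group 1 at 15%.
group = rng.integers(0, 2, size=n)
y = rng.random(n) < np.where(group == 0, 0.30, 0.15)

# A group-blind predictor with no other informative features can at
# best learn the pooled rate, which overstates risk for group 1 and
# understates it for group 0.
print(f"pooled estimate:   {y.mean():.3f}")              # about 0.225
print(f"group 0 true rate: {y[group == 0].mean():.3f}")  # about 0.30
print(f"group 1 true rate: {y[group == 1].mean():.3f}")  # about 0.15
```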
In some domains, awareness of sensitive attributes is necessary for accurate and equitable outcomes: clinical risk models, for instance, often require sex and age to produce accurate estimates, and the gender-specific recidivism tools discussed above depend on explicit use of gender.
FTU provides no protection against bad actors. An agent who intends to discriminate can deliberately include proxy variables that are highly correlated with the protected attribute. The system will appear compliant with FTU (the sensitive attribute is not used) while effectively replicating discrimination through proxies. This has been described as "laundering bias through software."
In 2014, Amazon developed a machine learning system to automate resume screening for technical positions. The system was trained on resumes of existing employees, who were predominantly male in technical roles. Although the model did not explicitly use gender as a feature, it learned to penalize resumes that contained words associated with women, such as "women's" (as in "women's chess club captain") and the names of certain all-women's colleges. Verbs commonly found on male engineers' resumes, such as "executed" and "captured," were favored. Amazon abandoned the project in 2017 after determining that the bias could not be reliably eliminated.
The Correctional Offender Management Profiling for Alternative Sanctions (COMPAS) system, developed by Northpointe (now Equivant), was designed to assess the likelihood of criminal recidivism. Race was not used as an explicit input feature. However, a 2016 investigation by ProPublica found that Black defendants who did not go on to reoffend were nearly twice as likely as white defendants to be classified as high risk. Northpointe countered that the algorithm satisfied predictive parity (the probability of recidivating, given a high risk score, was similar across racial groups). Both claims were correct, illustrating how different fairness metrics can yield contradictory conclusions and how FTU did not prevent racially disparate outcomes.
Historically, financial institutions engaged in "redlining" by refusing to lend in predominantly Black neighborhoods. Modern algorithmic lending systems that exclude race as an input can still replicate this pattern through ZIP codes and other geographic features. A 2021 investigation by The Markup found that lenders were 80 percent more likely to reject Black applicants than comparable white applicants, even when race was not an explicit input to the lending algorithms. This phenomenon is sometimes called digital redlining or algorithmic redlining.
Facebook's advertising platform allowed advertisers to restrict ad visibility based on race, religion, and national origin. Even after removing explicit targeting options for protected characteristics, the platform's ad delivery optimization algorithms continued to produce discriminatory outcomes by using proxy signals embedded in user behavior and interest data.
Fairness through unawareness is one of several competing fairness criteria. The table below summarizes how it compares to other common definitions.
| Fairness definition | Core idea | Relationship to FTU |
|---|---|---|
| Fairness through unawareness | Do not use $A$ as input | The baseline approach |
| Demographic parity | Positive prediction rates must be equal across groups | Requires access to $A$ for auditing; can hold even when FTU is violated |
| Equalized odds | True positive and false positive rates must be equal across groups | Requires access to $A$ and $Y$; addresses limitations of FTU |
| Equality of opportunity | True positive rates must be equal across groups | A relaxation of equalized odds; requires $A$ |
| Predictive parity | Positive predictive values must be equal across groups | Can conflict with equalized odds; requires $A$ |
| Individual fairness | Similar individuals should receive similar predictions | Requires a distance metric on individuals; proposed as a direct replacement for FTU |
| Counterfactual fairness | Predictions should not change in a counterfactual world where $A$ is different | Uses causal models; generalizes FTU |
A key insight from the literature on the incompatibility of fairness metrics is that these definitions generally cannot all be satisfied simultaneously (except in degenerate cases), making the choice of fairness criterion context-dependent.
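Once predictions, true labels, and group membership are available, several of the group criteria in the table can be audited with a few lines of code; note that the audit itself requires access to $A$ even if the model never used it. A minimal sketch with binary labels and a binary group indicator (all names are illustrative):

```python
import numpy as np

def group_fairness_report(y_true, y_pred, a):
    """Per-group rates for criteria from the table above. y_true and
    y_pred are 0/1 NumPy arrays; a is a binary group indicator.
    Empty groups or empty cells would yield NaN in this simplified sketch."""
    report = {}
    for g in np.unique(a):
        m = a == g
        report[g] = {
            "positive_rate": y_pred[m].mean(),            # demographic parity
            "tpr": y_pred[m & (y_true == 1)].mean(),      # equality of opportunity
            "fpr": y_pred[m & (y_true == 0)].mean(),      # equalized odds (with TPR)
            "ppv": y_true[m & (y_pred == 1)].mean(),      # predictive parity
        }
    return report
```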
Because FTU is widely considered insufficient, researchers have developed more sophisticated approaches to achieving fairness. These can be grouped by where in the machine learning pipeline they intervene.
Preprocessing methods modify the training data before the model is trained, for example by reweighing or resampling examples, or by learning fair representations that strip information about the sensitive attribute (Zemel et al., 2013).
In-processing methods modify the learning algorithm itself, for example by adding fairness constraints or regularization terms to the training objective.
Post-processing methods adjust the outputs of a trained model, for example by selecting group-specific decision thresholds so that error rates are equalized across groups (Hardt et al., 2016); a sketch of this approach appears below.
Causal methods use causal inference to model the relationships between sensitive attributes, other features, and outcomes; counterfactual fairness (Kusner et al., 2017) is the most prominent example.
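As one concrete illustration of the post-processing family, the sketch below searches for a separate decision threshold per group so that group true positive rates roughly match, in the spirit of Hardt et al. (2016). The shared TPR target, the coarse threshold grid, and the omission of randomization between thresholds are all simplifications made for the example.

```python
import numpy as np

def equal_opportunity_thresholds(scores, y_true, a,
                                 grid=np.linspace(0.0, 1.0, 101)):
    """Pick one threshold per group so that group true positive rates
    land as close as possible to a shared target (here, the TPR that a
    single 0.5 threshold achieves overall). Hardt et al.'s method also
    randomizes between thresholds; this sketch omits that step."""
    target_tpr = (scores[y_true == 1] >= 0.5).mean()
    thresholds = {}
    for g in np.unique(a):
        pos = (a == g) & (y_true == 1)
        tprs = np.array([(scores[pos] >= t).mean() for t in grid])
        thresholds[g] = float(grid[np.argmin(np.abs(tprs - target_tpr))])
    return thresholds
```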
Fairness through unawareness has a complex relationship with anti-discrimination law.
| Legal concept | Description | Relationship to FTU |
|---|---|---|
| Disparate treatment | Intentional discrimination based on a protected characteristic | FTU prevents explicit use of protected attributes, aligning with disparate treatment doctrine |
| Disparate impact | Facially neutral practices that disproportionately affect a protected group | FTU does not prevent disparate impact; a model can satisfy FTU while producing discriminatory outcomes through proxies |
| Business necessity | A practice with disparate impact is lawful if justified by business necessity | May justify retaining some features correlated with protected attributes |
| Affirmative action | Policies that consider protected attributes to address historical disadvantage | Requires awareness of protected attributes, contradicting FTU |
The tension between disparate treatment and disparate impact doctrines is directly relevant to FTU. Satisfying FTU (no explicit use of protected attributes) protects against disparate treatment claims but provides no guarantee against disparate impact claims. This creates a practical dilemma: organizations may need to be aware of sensitive attributes to detect and mitigate disparate impact, even as they avoid using those attributes in decision-making.
The idea of achieving fairness by ignoring protected characteristics has a long history outside of machine learning. The concept is sometimes linked to Justice John Marshall Harlan's dissent in Plessy v. Ferguson (1896), in which he argued that "our constitution is color-blind." In the machine learning literature, the concept was formalized and critiqued in the following timeline:
| Year | Development |
|---|---|
| 2008 | Pedreschi, Ruggieri, and Turini publish "Discrimination-aware data mining," one of the first papers to formally study discrimination in algorithmic systems |
| 2010 | Calders and Verwer propose three modified Naive Bayes classifiers for discrimination-free classification |
| 2012 | Dwork, Hardt, Pitassi, Reingold, and Zemel publish "Fairness Through Awareness," critiquing unawareness and proposing individual fairness as an alternative |
| 2013 | Zemel, Wu, Swersky, Pitassi, and Dwork propose "Learning Fair Representations" to remove sensitive information from data representations |
| 2016 | Hardt, Price, and Srebro propose "Equality of Opportunity in Supervised Learning" with equalized odds as a group fairness criterion |
| 2016 | ProPublica publishes investigation of COMPAS, bringing public attention to algorithmic fairness |
| 2017 | Kusner, Loftus, Russell, and Silva propose counterfactual fairness using causal inference |
| 2018 | Corbett-Davies and Goel publish "The Measure and Mismeasure of Fairness," formally categorizing FTU as "anti-classification" |
| 2018 | Amazon's biased resume screening tool is publicly reported |
Several active research areas relate to fairness through unawareness, including automated proxy detection, fair representation learning, and causal approaches such as counterfactual fairness.