Fairness through unawareness (FTU), also called anti-classification or blindness, is a fairness criterion in machine learning that prescribes excluding sensitive attributes (such as race, gender, age, or religion) from the input features of a predictive model. The core assumption is that if a decision-making system never sees protected characteristics, it cannot discriminate on the basis of those characteristics. While the idea is intuitive, it is widely regarded by researchers as insufficient for achieving fairness in practice, because other features in the data often encode the same information that the removed attribute carried.
The term was popularized by Cynthia Dwork, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Richard Zemel in their 2012 paper "Fairness Through Awareness," which critiqued the unawareness approach and proposed an alternative framework based on individual fairness metrics.
Let $X$ denote the set of non-sensitive input features, $A$ denote the sensitive attribute (for example, race or gender), and $\hat{Y} = f(X, A)$ denote the output of a classifier. Fairness through unawareness requires that the classifier does not use $A$ as an input. Formally, a classifier $f$ satisfies FTU if and only if:
$$f(X, A) = f(X, A') \quad \text{for all } X \text{ and all values } A, A'$$
In other words, two individuals who are identical in all non-sensitive features $X$ but differ in their sensitive attribute $A$ must receive the same prediction. This is equivalent to saying that $A$ is not an argument to the function $f$, so the classifier can be written simply as $f(X)$.
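The definition translates directly into code: satisfying FTU amounts to never passing the sensitive column to the model. The sketch below is illustrative only; the column names ("gender", "label"), the use of scikit-learn logistic regression, and the assumption of numeric features are all choices made for the example, not part of any standard API.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Illustrative column names: "gender" plays the role of the sensitive
# attribute A, "label" is the target Y, every other (numeric) column is X.
SENSITIVE = "gender"
TARGET = "label"

def fit_ftu_classifier(df: pd.DataFrame) -> LogisticRegression:
    """Train a classifier that satisfies FTU by construction:
    the sensitive attribute is simply never passed to the model."""
    X = df.drop(columns=[SENSITIVE, TARGET])  # f is a function of X only
    y = df[TARGET]
    return LogisticRegression(max_iter=1000).fit(X, y)

def ftu_predict(model: LogisticRegression, df: pd.DataFrame):
    """Two rows that agree on X but differ in A necessarily receive
    the same prediction, because A is dropped before predicting."""
    return model.predict(df.drop(columns=[SENSITIVE, TARGET], errors="ignore"))
```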
Corbett-Davies and Goel (2018) formalized this under the name "anti-classification," defining it as the requirement that protected attributes and their proxies are not explicitly used to make decisions.
The appeal of fairness through unawareness stems from several practical and legal considerations: it is simple to implement, it is easy to audit (one need only verify that the protected attribute is absent from the feature set), and it aligns with the disparate treatment doctrine's prohibition on the explicit use of protected characteristics in decisions.
Implementing FTU typically involves two steps: identifying which attributes are legally or ethically protected in the deployment context, and excluding those attributes from the feature set used for training and prediction.
Some stricter implementations also attempt to identify and remove features that are highly correlated with the sensitive attribute (known as proxies), though this introduces its own challenges around information loss and feature selection.
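A minimal sketch of both variants follows, assuming a pandas DataFrame in which the sensitive attributes and candidate proxies are already numerically encoded; the 0.8 correlation threshold is an arbitrary illustrative choice, and correlation-based screening only catches linear, pairwise proxies.

```python
import pandas as pd

def drop_sensitive(df: pd.DataFrame, sensitive: list) -> pd.DataFrame:
    """Basic FTU: remove the protected attributes themselves."""
    return df.drop(columns=sensitive)

def drop_sensitive_and_proxies(df: pd.DataFrame, sensitive: list,
                               threshold: float = 0.8) -> pd.DataFrame:
    """Stricter variant: also remove numeric features whose absolute
    correlation with any sensitive attribute exceeds the threshold.
    This discards information and only catches linear, pairwise proxies."""
    corr = df.corr(numeric_only=True).abs()
    numeric_sensitive = [s for s in sensitive if s in corr.columns]
    proxies = [
        col for col in corr.columns
        if col not in sensitive
        and any(corr.loc[col, s] > threshold for s in numeric_sensitive)
    ]
    return df.drop(columns=list(sensitive) + proxies)
```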
Imagine you are a teacher picking teams for a game. You know that some of your students are wearing red shirts and some are wearing blue shirts. If you pick teams based on shirt color, that is not fair. So you close your eyes and pick without looking at shirts. That is "fairness through unawareness." You are being fair by ignoring shirt color.
But here is the problem: what if students wearing red shirts are all on the left side of the room and students wearing blue shirts are all on the right? Even with your eyes closed, if you pick everyone from the left side, you end up picking only red-shirt students anyway. The seating arrangement acts as a secret clue (a "proxy") for shirt color. This is why just closing your eyes does not always work.
Fairness through unawareness has been criticized extensively in the academic literature. The consensus is that, at best, it is a necessary but not sufficient condition for fair decision-making, and some researchers argue it is not even necessary, since accurate and equitable predictions can sometimes require explicit use of the sensitive attribute.
The most fundamental limitation of FTU is proxy discrimination. Sensitive attributes are rarely independent of other features. In real-world data, many non-sensitive features are correlated with protected characteristics, and a sufficiently powerful model can reconstruct the sensitive attribute from these correlated features.
Dwork et al. (2012) introduced the concept of redundant encoding to describe this phenomenon: even after removing a sensitive attribute, its information remains encoded across multiple other features. As stated in the "Fairness and Machine Learning" textbook by Barocas, Hardt, and Narayanan: "sensitive attributes are generally redundant given the other features. The classifier will then find a redundant encoding in terms of the other features."
For example, in the United States:
| Proxy feature | Correlated sensitive attribute | Mechanism |
|---|---|---|
| ZIP code / neighborhood | Race, ethnicity | Residential segregation means geographic location is a strong predictor of racial demographics |
| First name or surname | Race, ethnicity, gender | Naming patterns differ systematically across demographic groups |
| Occupation | Gender | Occupational segregation persists across industries |
| Alma mater | Race, socioeconomic status | College attendance patterns correlate with demographic background |
| Browsing history or purchasing patterns | Multiple attributes | Consumer behavior reflects demographic differences |
| Language or dialect features | Ethnicity, national origin | Linguistic patterns vary across demographic groups |
The proxy problem means that a model can effectively discriminate along protected dimensions without ever receiving the protected attribute as a direct input. Multiple weakly correlated features can combine to reconstruct the protected attribute with high accuracy.
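A common diagnostic for redundant encoding is to train an auxiliary model to predict the removed attribute from the remaining features; if it recovers $A$ well above chance, proxies are present. A minimal sketch, assuming a binary-coded sensitive attribute and numeric features (the column name and model choice are illustrative):

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

def proxy_audit(df: pd.DataFrame, sensitive: str = "race") -> float:
    """Estimate how recoverable the sensitive attribute is from the
    remaining features. A cross-validated AUC well above 0.5 signals
    a redundant encoding (assumes a binary-coded attribute and
    numeric features)."""
    X = df.drop(columns=[sensitive])   # the features an FTU model would see
    a = df[sensitive]                  # the attribute FTU removed
    auc = cross_val_score(GradientBoostingClassifier(), X, a,
                          cv=5, scoring="roc_auc")
    return auc.mean()
```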
Removing sensitive attributes can harm predictive accuracy. When the sensitive attribute carries legitimate predictive information (i.e., information that is relevant to the outcome for non-discriminatory reasons), excluding it forces the model to make predictions without useful signal.
Corbett-Davies and Goel (2018) demonstrated this with a recidivism prediction example. Because women have substantially lower base rates of reoffending than men, a gender-blind model systematically overestimates recidivism risk for women and underestimates it for men. Excluding gender in this context actually creates a disparate impact on women by assigning them unfairly high risk scores. This is why some criminal justice jurisdictions explicitly allow gender-specific risk assessment tools.
More generally, when base rates for the outcome of interest differ across demographic groups, removing the group attribute can cause the model to produce systematically biased predictions for one or more groups, a phenomenon sometimes referred to as infra-marginality.
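The effect is easy to reproduce on synthetic data: when two groups differ only in base rate, a group-blind predictor can at best learn the pooled rate, which overstates risk for the lower-base-rate group and understates it for the other. The numbers in the sketch below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Invented base rates: group 0 reoffends at 30%, group 1 at 15%.
group = rng.integers(0, 2, size=n)
y = rng.random(n) < np.where(group == 0, 0.30, 0.15)

# A group-blind predictor with no other informative features can at
# best learn the pooled rate, which overstates risk for group 1 and
# understates it for group 0.
print(f"pooled estimate:   {y.mean():.3f}")              # about 0.225
print(f"group 0 true rate: {y[group == 0].mean():.3f}")  # about 0.30
print(f"group 1 true rate: {y[group == 1].mean():.3f}")  # about 0.15
```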
In some domains, awareness of sensitive attributes is necessary for accurate and equitable outcomes: clinical risk models, for instance, often require sex and age to produce accurate estimates, and the gender-specific recidivism tools discussed above depend on explicit use of gender.
FTU provides no protection against bad actors. An agent who intends to discriminate can deliberately include proxy variables that are highly correlated with the protected attribute. The system will appear compliant with FTU (the sensitive attribute is not used) while effectively replicating discrimination through proxies. This has been described as "laundering bias through software."
In 2014, Amazon developed a machine learning system to automate resume screening for technical positions. The system was trained on resumes of existing employees, who were predominantly male in technical roles. Although the model did not explicitly use gender as a feature, it learned to penalize resumes that contained words associated with women, such as "women's" (as in "women's chess club captain") and the names of certain all-women's colleges. Verbs commonly found on male engineers' resumes, such as "executed" and "captured," were favored. Amazon abandoned the project in 2017 after determining that the bias could not be reliably eliminated.
The Correctional Offender Management Profiling for Alternative Sanctions (COMPAS) system, developed by Northpointe (now Equivant), was designed to assess the likelihood of criminal recidivism. Race was not used as an explicit input feature. However, a 2016 investigation by ProPublica found that Black defendants who did not go on to reoffend were nearly twice as likely as white defendants to be classified as high risk. Northpointe countered that the algorithm satisfied predictive parity (the probability of recidivating, given a high risk score, was similar across racial groups). Both claims were correct, illustrating how different fairness metrics can yield contradictory conclusions and how FTU did not prevent racially disparate outcomes.
Historically, financial institutions engaged in "redlining" by refusing to lend in predominantly Black neighborhoods. Modern algorithmic lending systems that exclude race as an input can still replicate this pattern through ZIP codes and other geographic features. A 2021 investigation by The Markup found that lenders were 80 percent more likely to reject Black applicants than comparable white applicants, even when race was not an explicit input to the lending algorithms. This phenomenon is sometimes called digital redlining or algorithmic redlining.
Facebook's advertising platform allowed advertisers to restrict ad visibility based on race, religion, and national origin. Even after removing explicit targeting options for protected characteristics, the platform's ad delivery optimization algorithms continued to produce discriminatory outcomes by using proxy signals embedded in user behavior and interest data.
Fairness through unawareness is one of several competing fairness criteria. The table below summarizes how it compares to other common definitions.
| Fairness definition | Core idea | Relationship to FTU |
|---|---|---|
| Fairness through unawareness | Do not use $A$ as input | The baseline approach |
| Demographic parity | Positive prediction rates must be equal across groups | Requires access to $A$ for auditing; can hold even when FTU is violated |
| Equalized odds | True positive and false positive rates must be equal across groups | Requires access to $A$ and $Y$; addresses limitations of FTU |
| Equality of opportunity | True positive rates must be equal across groups | A relaxation of equalized odds; requires $A$ |
| Predictive parity | Positive predictive values must be equal across groups | Can conflict with equalized odds; requires $A$ |
| Individual fairness | Similar individuals should receive similar predictions | Requires a distance metric on individuals; proposed as a direct replacement for FTU |
| Counterfactual fairness | Predictions should not change in a counterfactual world where $A$ is different | Uses causal models; generalizes FTU |
A key insight from the literature on the incompatibility of fairness metrics is that these definitions generally cannot all be satisfied simultaneously (except in degenerate cases), making the choice of fairness criterion context-dependent.
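Once predictions, true labels, and group membership are available, several of the group criteria in the table can be audited with a few lines of code; note that the audit itself requires access to $A$ even if the model never used it. A minimal sketch with binary labels and a binary group indicator (all names are illustrative):

```python
import numpy as np

def group_fairness_report(y_true, y_pred, a):
    """Per-group rates for criteria from the table above. y_true and
    y_pred are 0/1 NumPy arrays; a is a binary group indicator.
    Empty groups or empty cells would yield NaN in this simplified sketch."""
    report = {}
    for g in np.unique(a):
        m = a == g
        report[g] = {
            "positive_rate": y_pred[m].mean(),            # demographic parity
            "tpr": y_pred[m & (y_true == 1)].mean(),      # equality of opportunity
            "fpr": y_pred[m & (y_true == 0)].mean(),      # equalized odds (with TPR)
            "ppv": y_true[m & (y_pred == 1)].mean(),      # predictive parity
        }
    return report
```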
Because FTU is widely considered insufficient, researchers have developed more sophisticated approaches to achieving fairness. These can be grouped by where in the machine learning pipeline they intervene.
Preprocessing methods modify the training data before the model is trained, for example by reweighing or resampling examples, or by learning fair representations that strip information about the sensitive attribute (Zemel et al., 2013).
In-processing methods modify the learning algorithm itself, for example by adding fairness constraints or regularization terms to the training objective.
Post-processing methods adjust the outputs of a trained model, for example by selecting group-specific decision thresholds so that error rates are equalized across groups (Hardt et al., 2016); a sketch of this approach appears below.
Causal methods use causal inference to model the relationships between sensitive attributes, other features, and outcomes; counterfactual fairness (Kusner et al., 2017) is the most prominent example.
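As one concrete illustration of the post-processing family, the sketch below searches for a separate decision threshold per group so that group true positive rates roughly match, in the spirit of Hardt et al. (2016). The shared TPR target, the coarse threshold grid, and the omission of randomization between thresholds are all simplifications made for the example.

```python
import numpy as np

def equal_opportunity_thresholds(scores, y_true, a,
                                 grid=np.linspace(0.0, 1.0, 101)):
    """Pick one threshold per group so that group true positive rates
    land as close as possible to a shared target (here, the TPR that a
    single 0.5 threshold achieves overall). Hardt et al.'s method also
    randomizes between thresholds; this sketch omits that step."""
    target_tpr = (scores[y_true == 1] >= 0.5).mean()
    thresholds = {}
    for g in np.unique(a):
        pos = (a == g) & (y_true == 1)
        tprs = np.array([(scores[pos] >= t).mean() for t in grid])
        thresholds[g] = float(grid[np.argmin(np.abs(tprs - target_tpr))])
    return thresholds
```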
Fairness through unawareness has a complex relationship with anti-discrimination law.
| Legal concept | Description | Relationship to FTU |
|---|---|---|
| Disparate treatment | Intentional discrimination based on a protected characteristic | FTU prevents explicit use of protected attributes, aligning with disparate treatment doctrine |
| Disparate impact | Facially neutral practices that disproportionately affect a protected group | FTU does not prevent disparate impact; a model can satisfy FTU while producing discriminatory outcomes through proxies |
| Business necessity | A practice with disparate impact is lawful if justified by business necessity | May justify retaining some features correlated with protected attributes |
| Affirmative action | Policies that consider protected attributes to address historical disadvantage | Requires awareness of protected attributes, contradicting FTU |
The tension between disparate treatment and disparate impact doctrines is directly relevant to FTU. Satisfying FTU (no explicit use of protected attributes) protects against disparate treatment claims but provides no guarantee against disparate impact claims. This creates a practical dilemma: organizations may need to be aware of sensitive attributes to detect and mitigate disparate impact, even as they avoid using those attributes in decision-making.
The idea of achieving fairness by ignoring protected characteristics has a long history outside of machine learning. The concept is sometimes linked to Justice John Marshall Harlan's dissent in Plessy v. Ferguson (1896), in which he argued that "our constitution is color-blind." In the machine learning literature, the concept was formalized and critiqued in the following timeline:
| Year | Development |
|---|---|
| 2008 | Pedreschi, Ruggieri, and Turini publish "Discrimination-aware data mining," one of the first papers to formally study discrimination in algorithmic systems |
| 2010 | Calders and Verwer propose three modified Naive Bayes classifiers for discrimination-free classification |
| 2012 | Dwork, Hardt, Pitassi, Reingold, and Zemel publish "Fairness Through Awareness," critiquing unawareness and proposing individual fairness as an alternative |
| 2013 | Zemel, Wu, Swersky, Pitassi, and Dwork propose "Learning Fair Representations" to remove sensitive information from data representations |
| 2016 | Hardt, Price, and Srebro propose "Equality of Opportunity in Supervised Learning" with equalized odds as a group fairness criterion |
| 2016 | ProPublica publishes investigation of COMPAS, bringing public attention to algorithmic fairness |
| 2017 | Kusner, Loftus, Russell, and Silva propose counterfactual fairness using causal inference |
| 2018 | Corbett-Davies and Goel publish "The Measure and Mismeasure of Fairness," formally categorizing FTU as "anti-classification" |
| 2018 | Amazon's biased resume screening tool is publicly reported |
Several active research areas relate to fairness through unawareness, including automated proxy detection, fair representation learning, and causal approaches such as counterfactual fairness.