Individual fairness is a principle in machine learning and artificial intelligence that requires similar individuals to be treated similarly by an algorithm. Introduced formally by Cynthia Dwork, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Richard Zemel in their 2012 paper "Fairness Through Awareness," the concept is mathematically expressed as a Lipschitz condition on the classifier's mapping function. Unlike group fairness notions that focus on statistical parity across demographic groups, individual fairness operates at the level of pairs of individuals and demands that if two people are close according to a task-specific similarity metric, the outcomes they receive should also be close.
The idea originates from an intuitive ethical principle: a just decision-making process should not assign wildly different outcomes to people who are essentially alike in all relevant respects. While simple in concept, implementing individual fairness in practice has proved to be a persistent challenge, primarily because defining the "right" similarity metric for a given task requires normative judgments that are difficult to formalize.
Imagine you and your friend both draw very similar pictures for an art contest. Individual fairness says the judges should give you both about the same score, because your drawings are so similar. It would not be fair if the judges gave your friend a much higher score when your pictures look almost the same. In machine learning, individual fairness works the same way: if a computer is making decisions about people (like who gets a loan or who gets into a school), and two people look really similar on paper, the computer should treat them in a similar way.
The concept of individual fairness emerged from broader concerns about algorithmic discrimination in automated decision-making systems. Before Dwork et al.'s 2012 formalization, the most common approach to fairness in algorithms was "fairness through unawareness" (also called "blindness"), which simply removed protected attributes like race or gender from the input features. However, researchers recognized that this approach was fundamentally flawed because other features could serve as proxies for the removed attributes. For example, a lending model that ignores race but uses zip code as a feature may still discriminate, since residential patterns in many countries are strongly correlated with race.
Dwork et al. proposed "fairness through awareness" as an alternative, arguing that a principled approach to fairness must explicitly account for the ways in which individuals are similar or different with respect to the task at hand. Their framework was published at the 3rd Innovations in Theoretical Computer Science Conference (ITCS 2012) and has since become one of the most cited works in the algorithmic fairness literature.
The intellectual roots of individual fairness can be traced to Aristotle's principle of formal equality, often summarized as "treat like cases alike." In legal philosophy, this principle underlies anti-discrimination law and the concept of disparate treatment. The contribution of Dwork et al. was to translate this philosophical idea into a precise mathematical constraint that could be applied to classification models and other predictive systems.
The formal definition of individual fairness relies on two distance metrics and a Lipschitz-like condition that connects them.
Let X be the space of individuals and Y be the space of outcomes (for example, probability distributions over decisions). Let d: X × X -> R be a task-specific distance metric on individuals, and let D: Y × Y -> R be a distance metric on outcomes. A mapping (classifier) M: X -> Y satisfies individual fairness if for all pairs of individuals x, y in X:
D(M(x), M(y)) <= d(x, y)
In words, the distance between the outcomes assigned to two individuals must not exceed the distance between those individuals in the input space. This is a Lipschitz condition with Lipschitz constant L = 1.
More generally, a classifier satisfies (d, D)-Lipschitz fairness with constant L if:
D(M(x), M(y)) <= L * d(x, y)
The key components of this definition are:
| Component | Symbol | Description |
|---|---|---|
| Individual space | X | The set of all possible individuals, represented by their features |
| Outcome space | Y | The set of possible outcomes or decisions (often probability distributions over labels) |
| Task-specific metric | d(x, y) | A distance function measuring how similar two individuals are with respect to the classification task |
| Outcome metric | D(M(x), M(y)) | A distance function measuring how different the outcomes are for two individuals |
| Lipschitz constant | L | An upper bound on how much the classifier can amplify differences between individuals |
| Classifier | M: X -> Y | The mapping from individuals to outcomes that must satisfy the fairness constraint |
When the outcome space Y consists of probability distributions over binary decisions (accept or reject), D is typically taken to be the total variation distance (also called the statistical distance) between distributions.
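Once d, D, and M are fixed, the condition can be checked mechanically. The following sketch audits a classifier for violations, assuming a toy logistic scoring model, a scaled Euclidean metric, and total variation distance for D; `toy_model`, `toy_metric`, and the numerical tolerance are illustrative choices, not part of the Dwork et al. framework.

```python
import numpy as np

def total_variation(p, q):
    """Total variation (statistical) distance between two discrete distributions."""
    return 0.5 * np.abs(np.asarray(p) - np.asarray(q)).sum()

def lipschitz_violations(model, metric, individuals, L=1.0):
    """Return pairs (i, j) with D(M(x_i), M(x_j)) > L * d(x_i, x_j)."""
    violations = []
    n = len(individuals)
    for i in range(n):
        for j in range(i + 1, n):
            d_in = metric(individuals[i], individuals[j])
            d_out = total_variation(model(individuals[i]), model(individuals[j]))
            if d_out > L * d_in + 1e-9:          # tolerance for floating-point error
                violations.append((i, j, d_out, d_in))
    return violations

# Illustrative classifier: logistic score mapped to a distribution over {reject, accept}.
def toy_model(x):
    p_accept = 1.0 / (1.0 + np.exp(-(x @ np.array([0.8, -0.3]))))
    return np.array([1.0 - p_accept, p_accept])

# Illustrative task-specific metric: scaled Euclidean distance on the features.
def toy_metric(x, y):
    return 0.5 * np.linalg.norm(x - y)

people = np.random.default_rng(0).normal(size=(50, 2))
print(len(lipschitz_violations(toy_model, toy_metric, people)))
```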
The most distinctive and controversial element of the individual fairness framework is the task-specific similarity metric d(x, y). This metric is supposed to capture the idea that two individuals who are "equally qualified" for a given task should have a small distance between them, regardless of their values on protected attributes.
Dwork et al. acknowledged that specifying this metric is a significant challenge. They treated it as an external input to the framework, sometimes described as being provided by a "regulatory body" or a "domain expert" who can articulate which differences between individuals are relevant to the task and which are not. In a lending task, for example, such an expert might judge that differences in income and repayment history are relevant while differences in zip code are not.
The metric d is not just a technical detail; it encodes substantive normative judgments about what counts as relevant similarity for the decision at hand. Different stakeholders may disagree about the correct metric, and there is no algorithm that can derive the "right" metric from data alone without making value-laden choices.
Because manually specifying the similarity metric is often impractical, researchers have developed several approaches to learn or approximate it:
| Approach | Key idea | Representative work |
|---|---|---|
| Human judgment queries | An algorithm queries a human "fairness arbiter" about pairs of individuals and uses their responses to learn the metric | Ilvento (2020), "Metric Learning for Individual Fairness" |
| Data-driven learning | Learn the metric from labeled data indicating which pairs of individuals should be treated similarly | Mukherjee et al. (2020), "Two Simple Ways to Learn Individual Fairness Metrics from Data" |
| Online learning with unknown metric | Learn the metric through an interactive process where a regulator flags fairness violations | Gillen et al. (2018), "Online Learning with an Unknown Fairness Metric" |
| Causal graph-based | Derive similarity from a causal model of the data-generating process | Kusner et al. (2017), Counterfactual Fairness |
| Preference-informed | Allow individuals to express preferences over outcomes, relaxing strict metric requirements | Kim et al. (2020), "Preference-Informed Fairness" |
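As a concrete illustration of the data-driven family, the sketch below fits a logistic model to find the direction in feature space most predictive of the protected attribute, then measures distance only in the complementary subspace, so that two individuals who differ mainly along that direction count as similar. This is a simplification in the spirit of Mukherjee et al. (2020) rather than their exact algorithm, and `sensitive_subspace_metric` is a hypothetical helper name.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def sensitive_subspace_metric(X, protected):
    """Learn a simplified fair metric that discounts the direction in feature
    space most associated with the protected attribute."""
    clf = LogisticRegression().fit(X, protected)
    v = clf.coef_.ravel()
    v = v / np.linalg.norm(v)
    # Projection removing the component along the sensitive direction.
    P = np.eye(X.shape[1]) - np.outer(v, v)

    def d(x, y):
        return float(np.linalg.norm(P @ (np.asarray(x) - np.asarray(y))))
    return d

# Usage on synthetic data: feature 0 strongly tracks the protected attribute,
# so differences along it contribute little to the learned distance.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
protected = (X[:, 0] + 0.1 * rng.normal(size=200) > 0).astype(int)
d = sensitive_subspace_metric(X, protected)
print(d(X[0], X[1]))
```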
Individual fairness and group fairness represent two distinct families of fairness metrics with fundamentally different perspectives. Group fairness criteria, such as demographic parity, equalized odds, and calibration, require that some statistical quantity be equal across predefined demographic groups. Individual fairness, by contrast, does not reference groups at all; it operates entirely on pairs of individuals.
| Dimension | Individual fairness | Group fairness |
|---|---|---|
| Unit of analysis | Pairs of individuals | Demographic groups |
| Core principle | Similar individuals get similar outcomes | Protected groups get equal aggregate outcomes |
| Required input | Task-specific similarity metric | Definition of protected groups |
| Granularity | Fine-grained, per-individual | Coarse-grained, per-group average |
| Sensitivity to subgroups | Handles arbitrary subpopulations naturally | May miss unfairness within subgroups |
| Formal basis | Lipschitz condition on classifier | Statistical parity constraints |
| Example criteria | Metric fairness, Lipschitz fairness | Demographic parity, equalized odds, calibration |
A natural question is whether individual fairness and group fairness can be satisfied simultaneously. The answer depends on the specific definitions and the data distribution, but several results highlight tensions between them.
Dwork et al. (2012) showed that individual fairness can imply a form of group fairness under certain conditions. If the similarity metric is chosen so that members of different demographic groups who are equally qualified have a small distance, then treating them similarly (individual fairness) will naturally lead to similar aggregate outcomes for the groups (group fairness).
However, the reverse does not hold in general. Satisfying group fairness can require violating individual fairness. For instance, to achieve demographic parity when base rates differ between groups, a classifier may need to treat similarly qualified individuals from different groups differently, which directly violates the individual fairness constraint.
Binns (2020) argued that the apparent conflict between individual and group fairness is partly based on a misconception, and that the two approaches address different aspects of fairness that can be complementary rather than contradictory. More recently, Fleisher (2021) examined philosophical foundations and concluded that individual fairness is best understood not as a standalone definition of fairness, but as one tool among several for addressing algorithmic bias.
A 2024 paper in the Journal of Artificial Intelligence Research established a connection between individual fairness and a group-level criterion called "base rate tracking," showing that the Lipschitz condition underlying individual fairness is closely related to certain group fairness properties. This result suggests that the divide between the two families of fairness criteria may be less stark than commonly assumed.
Since the original formulation, researchers have proposed several variants and extensions of individual fairness to address its limitations or adapt it to new settings.
Rothblum and Yona (2018) observed that strict individual fairness (requiring the Lipschitz condition to hold for every pair of individuals) does not generalize well from a training set to the full population. They introduced a relaxed notion called "probably approximately metric-fair" learning, analogous to the PAC (Probably Approximately Correct) learning framework. In this formulation, the fairness constraint need only hold for a random pair of individuals sampled from the population, with all but a small probability of error. The authors showed that this relaxed definition is both learnable and computationally tractable for classes of linear and logistic predictors.
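In this relaxed setting, an audit only needs to estimate the probability that a randomly drawn pair violates the Lipschitz condition, rather than checking every pair. The sketch below assumes a model that returns outcome distributions and a task-specific metric, as in the earlier sketch; it is an illustrative sampled audit, not the learning algorithm from Rothblum and Yona (2018).

```python
import numpy as np

def sampled_violation_rate(model, metric, individuals, L=1.0, n_pairs=10_000, seed=0):
    """Estimate Pr[ D(M(x), M(y)) > L * d(x, y) ] over uniformly random pairs."""
    rng = np.random.default_rng(seed)
    n = len(individuals)
    violations = 0
    for _ in range(n_pairs):
        i, j = rng.integers(0, n, size=2)
        d_in = metric(individuals[i], individuals[j])
        p, q = model(individuals[i]), model(individuals[j])
        d_out = 0.5 * np.abs(p - q).sum()        # total variation distance
        if d_out > L * d_in + 1e-9:
            violations += 1
    return violations / n_pairs
```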
Kusner, Loftus, Russell, and Silva (2017) introduced counterfactual fairness, which can be seen as an individual-level fairness notion grounded in causal reasoning rather than a distance metric. A decision is counterfactually fair toward an individual if the outcome would remain the same in a counterfactual world where the individual belonged to a different demographic group. Formally, for a protected attribute A, features X, and predictor R:
P(R_{A<-a} = r | A = a, X = x) = P(R_{A<-b} = r | A = a, X = x)
This means that the probability of outcome r should be the same whether the individual's protected attribute takes value a (the actual value) or b (the counterfactual value), conditional on the individual's other characteristics. Counterfactual fairness requires specifying a causal model rather than a distance metric, which some researchers consider more interpretable.
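A toy example makes the definition concrete. Assume a simple linear structural causal model X := 2A + U with exogenous noise U; the counterfactual feature value is obtained by changing A while holding U fixed. The model, the two predictors, and `counterfactual_gap` are all illustrative assumptions; in practice, Kusner et al.'s framework requires a domain-specific causal model.

```python
def counterfactual_gap(predict, a, u, a_counterfactual):
    """Difference between a predictor's output under the observed protected
    attribute and under a counterfactual one, holding the noise U fixed."""
    x_factual = 2.0 * a + u                      # X := 2A + U (assumed toy SCM)
    x_counterfactual = 2.0 * a_counterfactual + u
    return abs(predict(x_factual, a) - predict(x_counterfactual, a_counterfactual))

# A predictor using only the inferred noise U = X - 2A is counterfactually fair
# in this model (gap 0); one that uses X directly is not.
unfair = lambda x, a: x
fair = lambda x, a: x - 2.0 * a
print(counterfactual_gap(unfair, a=1.0, u=0.3, a_counterfactual=0.0))  # 2.0
print(counterfactual_gap(fair, a=1.0, u=0.3, a_counterfactual=0.0))    # 0.0
```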
Kim, Reingold, and Rothblum (2018) proposed "fairness through computationally-bounded awareness," introducing the notion of metric multifairness. This relaxation does not require the full similarity metric to be known in advance. Instead, the learner can query the metric a bounded number of times. Metric multifairness guarantees that similar subpopulations (as identified by a rich collection of comparison sets) are treated similarly, while being computationally feasible.
Kim, Korolova, Rothblum, and Yona (2020) identified a subtle issue with standard individual fairness: when individuals have diverse preferences over outcomes, enforcing strict similarity of outcomes can actually harm the people the constraint is meant to protect. They proposed preference-informed individual fairness (PIIF), which allows deviations from strict metric fairness as long as those deviations align with individuals' own preferences. PIIF is a relaxation of both individual fairness and envy-freeness.
Zemel, Wu, Swersky, Pitassi, and Dwork (2013) proposed an approach to achieving individual fairness through learned representations. Their method formulates fairness as an optimization problem: find a representation of the data that encodes useful information for the prediction task while obfuscating membership in protected groups. The approach uses a semi-supervised clustering model where each cluster receives roughly equal proportions of data from each group. Later work by Louizos et al. (2015) extended this idea using variational autoencoders (the Variational Fair Autoencoder, or VFAE).
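In compressed form, the kind of objective involved combines a group-parity term on the representation, a reconstruction term, and a prediction term. The sketch below is a simplified stand-in for that three-part trade-off, not Zemel et al.'s prototype-based model; all names and weights are illustrative.

```python
import numpy as np

def fair_representation_objective(Z, group, x, x_hat, y, y_hat,
                                  lam_z=1.0, lam_x=0.1, lam_y=1.0):
    """Three-term objective in the spirit of learned fair representations:
    (1) group parity of the representation Z, (2) reconstruction of the
    original features x, (3) accuracy of the predictions y_hat."""
    parity = np.abs(Z[group == 0].mean(axis=0) - Z[group == 1].mean(axis=0)).sum()
    reconstruction = np.mean((x - x_hat) ** 2)
    eps = 1e-12
    prediction = -np.mean(y * np.log(y_hat + eps) + (1 - y) * np.log(1 - y_hat + eps))
    return lam_z * parity + lam_x * reconstruction + lam_y * prediction
```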
Implementing individual fairness in real-world systems presents several interrelated challenges that have limited its adoption compared to group fairness criteria.
The single largest obstacle to deploying individual fairness is the requirement to specify a task-specific similarity metric. This problem has several facets:
Normative disagreement. Reasonable people can disagree about what makes two individuals "similar" for a given decision. In hiring, should two candidates with different educational backgrounds but equal work experience be considered similar? The answer depends on value judgments that cannot be resolved by technical means alone.
High dimensionality. Real-world individuals are described by many features, and defining a meaningful distance in a high-dimensional feature space is non-trivial. Different distance functions (Euclidean, Mahalanobis, cosine similarity) can yield very different notions of who is "similar."
Context dependence. The appropriate metric varies by task. Two individuals who are similar for a lending decision may not be similar for a medical diagnosis or a hiring decision. There is no universal similarity metric.
Implicit bias risk. If the similarity metric is learned from human judgments, it may encode the implicit biases of the humans providing those judgments, potentially undermining the fairness guarantee.
Strict individual fairness requires checking the Lipschitz condition for all pairs of individuals, which scales quadratically with the number of data points. For large datasets, this can be computationally prohibitive. Approximate methods and sampling-based approaches (such as the probably approximately metric-fair relaxation described above) help, but they introduce a gap between the theoretical guarantee and what is achieved in practice.
Enforcing individual fairness as a constraint typically reduces the model's predictive accuracy, since the model must sacrifice some performance on its primary task to satisfy the fairness constraint. The magnitude of this trade-off depends on the tightness of the Lipschitz condition, the choice of metric, and the specific data distribution. In some cases, the accuracy cost is small; in others, it can be substantial.
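One common way to expose this trade-off is to add a fairness penalty to the training loss and tune its weight. Below is a minimal sketch, assuming a model that returns per-row acceptance probabilities and a task-specific metric; the hinge-style penalty on sampled pairs and the name `penalized_loss` are illustrative choices, not a standard formulation.

```python
import numpy as np

def penalized_loss(model, params, X, y, metric, lam=1.0, n_pairs=256, seed=0):
    """Cross-entropy task loss plus a penalty for individual-fairness violations
    on randomly sampled pairs; lam controls the accuracy/fairness trade-off."""
    rng = np.random.default_rng(seed)
    p = model(params, X)                         # predicted P(accept) for each row
    eps = 1e-12
    task_loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    i = rng.integers(0, len(X), size=n_pairs)
    j = rng.integers(0, len(X), size=n_pairs)
    d_out = np.abs(p[i] - p[j])                  # TV distance for binary outcomes
    d_in = np.array([metric(X[a], X[b]) for a, b in zip(i, j)])
    penalty = np.mean(np.maximum(0.0, d_out - d_in))   # hinge on Lipschitz violations
    return task_loss + lam * penalty
```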
Verifying that a deployed model satisfies individual fairness is more difficult than checking group fairness criteria. Group fairness can be audited using standard statistical tests on observable outcomes. Individual fairness, by contrast, requires access to the similarity metric and knowledge of how the model behaves on all nearby pairs of individuals, which is harder to assess externally.
Fleisher (2021) identified four in-principle problems with individual fairness as both a definition and a method for ensuring fairness:
Insufficiency. Counterexamples show that treating similar individuals similarly is not sufficient to guarantee fairness. A classifier could satisfy the Lipschitz condition while still producing systematically unfair outcomes, for instance if the metric itself encodes unjust criteria.
Bias encoding. Methods for learning the similarity metric from human judgments risk encoding implicit biases of the human annotators. If the "fairness arbiter" has biased views about which individuals are similar, the resulting metric will perpetuate those biases.
Prior moral judgments. Specifying the similarity metric requires making substantive moral judgments about what counts as relevant similarity. This means individual fairness does not provide an independent criterion for fairness; it presupposes answers to the very normative questions it aims to resolve.
Incommensurability. For many tasks, the relevant moral values are incommensurable, meaning they cannot be reduced to a single numerical distance. Combining factors like merit, need, and desert into a single metric may be inherently impossible for certain decision contexts.
These criticisms do not invalidate individual fairness as a useful tool, but they suggest it should be used alongside other fairness criteria rather than as a sole measure.
Individual fairness has been studied in several application domains where automated decision-making raises concerns about discrimination.
In credit scoring and lending decisions, individual fairness requires that two applicants with similar creditworthiness receive similar loan terms or approval decisions, regardless of their membership in protected groups. Research has explored how individual fairness interacts with regulatory requirements such as the Equal Credit Opportunity Act (ECOA) in the United States, which prohibits discrimination on the basis of race, color, religion, national origin, sex, marital status, or age. The challenge in this domain is defining a similarity metric that captures creditworthiness without relying on features correlated with protected attributes.
Automated hiring systems, including resume screening tools and automated interview assessment platforms, have drawn scrutiny for potential discriminatory effects. Individual fairness provides a framework for requiring that similarly qualified candidates receive similar consideration, but specifying what "similarly qualified" means requires agreement on the relevant qualifications and how to weight them.
Risk assessment instruments used in criminal sentencing, bail decisions, and parole hearings have been a major focus of fairness research. Individual fairness would require that defendants with similar risk profiles receive similar risk scores, regardless of race or other protected characteristics. The COMPAS recidivism prediction tool, which was the subject of a widely cited 2016 ProPublica investigation, highlighted tensions between different fairness criteria in this domain.
Clinical prediction models and triage algorithms that allocate scarce medical resources raise individual fairness concerns. Two patients with similar medical conditions and prognoses should receive similar treatment recommendations, regardless of their demographic characteristics. Research has examined how individual fairness applies to algorithmic decision-making in healthcare, including organ transplant allocation and emergency department triage.
Recent research has extended individual fairness to graph neural networks, where the non-IID (non-independent and identically distributed) nature of graph data creates unique challenges. Standard individual fairness methods assume each data point is independent, but in graph settings, an individual's representation depends on their neighbors through message-passing operations. Approaches like GUIDE (Song et al., 2022) and GFairHint (2023) have been developed to integrate both group and individual fairness in GNN frameworks. Testing individual fairness in GNNs requires specialized perturbation techniques that preserve graph structure, such as maintaining node degree and neighborhood consistency.
Several open-source tools support the evaluation and enforcement of fairness in machine learning systems, though direct support for individual fairness is less common than for group fairness metrics.
| Tool | Developer | Individual fairness support | Notes |
|---|---|---|---|
| AI Fairness 360 (AIF360) | IBM Research | Partial | Provides some individual fairness metrics and several bias mitigation algorithms; primarily focused on group fairness |
| Fairlearn | Microsoft | Limited | Focuses on group fairness metrics and mitigation; provides framework extensible to individual fairness |
| What-If Tool | Google | Limited | Visualization tool for exploring model behavior on individual data points; can be used for manual fairness inspection |
| Aequitas | University of Chicago | Limited | Bias audit toolkit with some support for individual-level fairness analysis |
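Where toolkit support is limited, one individual-fairness measurement that is easy to compute directly is a k-nearest-neighbour consistency score in the style of Zemel et al. (2013): one minus the average gap between each prediction and the mean prediction of that individual's nearest neighbours. A minimal sketch using plain scikit-learn rather than any particular toolkit's API:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def consistency(X, y_pred, n_neighbors=5):
    """k-NN consistency: values near 1 mean nearby individuals get similar predictions."""
    nn = NearestNeighbors(n_neighbors=n_neighbors + 1).fit(X)
    _, idx = nn.kneighbors(X)                    # first neighbour is the point itself
    neighbour_means = y_pred[idx[:, 1:]].mean(axis=1)
    return 1.0 - np.mean(np.abs(y_pred - neighbour_means))
```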
Dwork et al. (2012) noted a connection between individual fairness and differential privacy. Both concepts involve constraining how much a function's output can change in response to small changes in its input. In differential privacy, the constraint applies to changes in a single record in a database; in individual fairness, the constraint applies to changes in an individual's features. Formally, both can be expressed as Lipschitz conditions, though the distance metrics and the contexts differ.
This connection has practical implications. Techniques developed for achieving differential privacy, such as noise addition and output perturbation, can sometimes be adapted to achieve individual fairness. However, the two properties can also conflict: adding noise for privacy may violate fairness constraints, and enforcing fairness constraints may reduce the privacy guarantee.
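In the notation used above, the parallel can be written side by side, where D_inf denotes the max-divergence between output distributions and d_H the Hamming distance between databases (a standard way of stating epsilon-differential privacy as a Lipschitz condition):

D_inf(M(x), M(y)) <= epsilon * d_H(x, y)    (differential privacy; x, y are databases)

D(M(x), M(y)) <= L * d(x, y)    (individual fairness; x, y are individuals)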
Individual fairness has connections to legal doctrines of non-discrimination, though the relationship is not straightforward.
In United States law, disparate treatment (intentional discrimination) roughly corresponds to the idea that individuals should not be treated differently on the basis of protected characteristics. Individual fairness formalizes a related but distinct concept: similarly situated individuals should receive similar outcomes. The legal notion of disparate impact (unintentional discrimination with disproportionate effects on protected groups) is more closely aligned with group fairness criteria.
The EU AI Act (Regulation (EU) 2024/1689), which entered into force in August 2024, establishes a risk-based regulatory framework for AI systems. While the Act imposes non-discrimination requirements on high-risk AI systems, it does not employ "fairness" in the technical sense used in the algorithmic fairness literature. The Act primarily regulates the input side for high-risk systems (data quality, documentation, human oversight) rather than specifying particular fairness criteria for algorithmic outputs. This gap between the technical fairness community's vocabulary and the legal community's language remains an area of active discussion.