See also: Machine learning terms, Bias, Selection bias
Reporting bias is a type of data bias in machine learning that occurs when the frequency of events, properties, or outcomes captured in a dataset does not accurately reflect their real-world frequency. The term was formalized in the context of AI by Jonathan Gordon and Benjamin Van Durme in their 2013 paper "Reporting Bias and Knowledge Acquisition," which showed that the information people choose to record in text is systematically different from what actually happens in the world. People tend to document circumstances that are unusual, surprising, or especially memorable, while assuming that the ordinary does not need to be stated. This creates a persistent gap between the distribution of information in training data and the true distribution of phenomena in reality.
Reporting bias is distinct from other forms of data bias such as selection bias or sampling bias. While selection bias arises from non-representative sampling of a population, reporting bias arises from what people choose to say or write about the things they observe. Even a perfectly representative sample of text documents will contain reporting bias because the bias is embedded in human communication itself.
Imagine you have a diary where you write about your day. You probably write about exciting things, like going to a birthday party or seeing a rainbow. You probably do not write "I breathed air today" or "the sky was up" because those things are so obvious that they seem boring. But if someone only read your diary to learn about your life, they might think you go to parties every day and never just sit at home. That is reporting bias: when the information people choose to write down makes rare or interesting things seem common and makes common things seem invisible. Computers that learn from text have the same problem. They read millions of documents and assume that what people write about a lot must happen a lot, and what people rarely mention must be rare, even when the opposite is true.
The linguistic roots of reporting bias can be traced to the philosopher Paul Grice and his theory of conversational implicature, published in 1975. Grice proposed four maxims that govern cooperative communication.
| Maxim | Principle | Connection to reporting bias |
|---|---|---|
| Quantity | Make your contribution as informative as required, but not more informative than required | Speakers omit information the listener already knows, causing obvious facts to be absent from text |
| Quality | Do not say what you believe to be false or lack evidence for | Writers may omit uncertain but common facts, reporting only facts they feel confident about |
| Relation | Be relevant | Writers mention properties and events that are noteworthy or relevant to their communicative goal, not all true properties |
| Manner | Be clear, brief, and orderly | Brevity encourages omission of background knowledge that would make communication unnecessarily long |
The Maxim of Quantity is most directly responsible for reporting bias. Because speakers are expected not to state the obvious, common knowledge goes unrecorded. Gordon and Van Durme (2013) used this framework to explain why knowledge base construction from text systematically underrepresents trivial facts. Their paper received the Best Paper Award at the CIKM Workshop on Automated Knowledge Base Construction.
A central insight of reporting bias research is that the frequency with which something is mentioned in text does not correspond to its frequency in the world. This disconnect operates in two directions.
Under-reporting of common facts. People rarely state what is obvious or expected. The statement "bananas are yellow" appears frequently in text, but "bananas are curved" appears far less often, even though curvature is just as characteristic of bananas as their color. Similarly, the KNEXT knowledge extraction system learned almost a million times that a person may have eyes, but fewer than 1,600 times that a person may have a spleen. Both are equally true, but eyes are discussed frequently in everyday contexts while the spleen is not.
Over-reporting of rare events. Unusual, dramatic, or newsworthy events are disproportionately recorded. The sentence "a person was struck by lightning" appears in text corpora far more often than its actual probability would suggest, because lightning strikes are noteworthy. Conversely, routine actions like walking to work or eating breakfast receive comparatively little textual coverage despite being vastly more common.
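This disconnect is easy to demonstrate with a toy calculation. The sketch below uses an invented miniature corpus and invented ground-truth rates purely for illustration, counting how often two events are mentioned and contrasting that with how often they actually occur.

```python
from collections import Counter

# Toy corpus standing in for what people choose to write about (illustrative only).
corpus = [
    "a person was struck by lightning near the lake",
    "another hiker was struck by lightning last summer",
    "she walked to work",
    "breaking news: a golfer was struck by lightning twice",
    "he ate breakfast and walked to work",
]

# Hypothetical real-world rates (events per person per year), invented for illustration.
true_frequency = {"struck by lightning": 1e-6, "walked to work": 200.0}

mentions = Counter()
for phrase in true_frequency:
    mentions[phrase] = sum(sentence.count(phrase) for sentence in corpus)

for phrase, rate in true_frequency.items():
    print(f"{phrase!r}: mentioned {mentions[phrase]} times, "
          f"real-world rate ~{rate} per person-year")
# Mention counts track newsworthiness, not frequency: the rare, dramatic event
# is mentioned more often than the routine one.
```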
Reporting bias has a direct and well-documented impact on natural language processing systems. Language models trained on text corpora inherit the distributional patterns of those corpora, including all the distortions introduced by reporting bias.
Shwartz and Choi (2020) investigated whether neural network-based language models overcome reporting bias. Their findings, published at COLING 2020, revealed a mixed picture. On one hand, the generalization capacity of models like BERT allows them to better estimate the plausibility of frequent but rarely mentioned actions, outcomes, and properties. On the other hand, these models tend to overestimate the plausibility of very rare events, effectively amplifying the bias that already exists in their training corpus. In other words, neural language models partially compensate for the under-reporting of common facts but make the over-reporting of rare facts even worse.
This has practical consequences for several NLP tasks.
| NLP task | How reporting bias affects it |
|---|---|
| Sentiment analysis | Models trained on product reviews are skewed toward extreme opinions because people are more likely to write reviews when they feel strongly, either positively or negatively |
| Commonsense reasoning | Systems struggle to learn obvious facts (e.g., "ice is cold," "fire is hot") because these are rarely stated in text |
| Information extraction | Knowledge extraction systems overweight rare associations and underweight common ones |
| Text generation | Generated text may over-represent dramatic or unusual events relative to their actual likelihood |
| Question answering | Models may give confident but skewed answers about the frequency or typicality of events |
Reporting bias extends beyond text to affect computer vision and multimodal systems. When people write alt-text descriptions for images, compose captions, or annotate visual datasets, they make the same communicative choices that produce reporting bias in text. They describe what is interesting or unusual and omit what seems obvious.
Kamath et al. (2025) demonstrated this in a study titled "Scale Can't Overcome Pragmatics: The Impact of Reporting Bias on Vision-Language Reasoning." They analyzed the LAION dataset (containing billions of image-text pairs) and the COCO dataset (a widely used benchmark for object detection and captioning) and found that certain types of reasoning are severely underrepresented in captions.
| Reasoning type | LAION occurrence | COCO occurrence | With targeted prompting |
|---|---|---|---|
| Spatial reasoning | 0.1% | 3.7% | Not reported |
| Counting | 1.7% | 10.4% | Not reported |
| Temporal language | 0.2% | Restricted by design | 44% |
| Negations | 0.1% | Not reported | 52% |
The study found that open-source vision-language models fall approximately 54 points behind human performance on reasoning tasks affected by reporting bias. Scaling dataset size or model size does not resolve the problem, and adding multilingual diversity by translating non-English captions did not improve performance on reasoning tasks, indicating that reporting bias is not specific to English.
The authors demonstrated that targeted annotator instructions can make a significant difference. When annotators were specifically prompted to include negations and temporal reasoning in their captions, the occurrence rates jumped to 52% and 44%, respectively, compared to near-zero in standard captioning pipelines.
Word embeddings such as Word2Vec and GloVe learn vector representations of words from large text corpora. Because these corpora reflect reporting bias (along with other social biases), the resulting embeddings encode skewed associations.
Caliskan, Bryson, and Narayanan (2017) published a landmark study in Science showing that word embeddings trained on standard internet text contain the same implicit biases measured by the Implicit Association Test (IAT) in humans. Female terms were closer to family and arts concepts, while male terms were closer to career and mathematics concepts. These associations reflect not just what is true in the world but how people choose to write about men and women, which is itself shaped by reporting bias and cultural stereotypes.
Bolukbasi et al. (2016) further demonstrated this problem with a now-famous analogy: the Word2Vec model trained on Google News articles produced the analogy "man is to computer programmer as woman is to homemaker." Their debiasing algorithm reduced the proportion of stereotypical word associations from 19% to 6% while preserving the embeddings' utility for other tasks.
Large language models (LLMs) such as GPT and BERT are trained on massive text corpora scraped from the internet, making them particularly susceptible to reporting bias at scale. Research published in PNAS in 2025 found that even value-aligned models that appear unbiased on standard benchmarks still form biased associations. The study evaluated eight aligned models, including GPT-3.5-turbo, GPT-4, Claude-3-Sonnet, and Claude-3-Opus, and found pervasive stereotype biases across four social categories: race, gender, religion, and health.
The problem is compounded because LLMs not only absorb reporting bias from their training data but can also generate new text that perpetuates and amplifies these biases. When LLM-generated text is used to train future models or to synthesize captions for vision-language datasets, reporting bias can cascade through successive generations of AI systems.
Commonsense knowledge resources are particularly vulnerable to reporting bias because they are often constructed from text or curated by human annotators who carry the same communicative tendencies. Researchers at the USC Information Sciences Institute studied two major commonsense knowledge bases and found significant levels of bias.
| Knowledge base | Percentage of biased entries | Types of bias found |
|---|---|---|
| ConceptNet | 3.4% | Gender stereotypes, occupational stereotypes |
| GenericsKB | 38.6% | Gender, religious, racial, and occupational biases |
The researchers, led by Fred Morstatter and including Ninareh Mehrabi, Pei Zhou, and Jay Pujara, found that algorithms trained on these databases amplified the biases beyond what was present in the original data. Specific problematic associations included linking Muslims to terrorism, Mexicans to poverty, lawyers to dishonesty, and policemen to death. Mehrabi noted that some results were so extreme the team questioned whether to include them in the publication.
Reporting bias takes on particular significance in healthcare AI, where biased data can lead to diagnostic errors and the perpetuation of health disparities. Clinical datasets used to train machine learning models are shaped by reporting bias in several ways.
Diagnosis codes and clinical notes reflect what clinicians choose to record, which is influenced by their training, their biases, and time constraints. Conditions that are less well-understood or that affect marginalized populations may be systematically under-documented. Social determinants of health, such as housing status, food security, and exposure to environmental toxins, are rarely captured in electronic health records despite being strong predictors of health outcomes.
A study published in Nature Medicine in 2021 found that AI systems applied to chest radiographs exhibited underdiagnosis bias in historically underserved populations, including female patients, Black patients, and patients of low socioeconomic status. The algorithm was more likely to label a person with a disease as healthy when that person belonged to an underserved group. Similarly, research has shown that melanoma detection errors are more prevalent among dark-skinned patients due to training dataset imbalances, and that underrepresentation of rural populations in training datasets has been linked to a 23% higher false-negative rate for pneumonia detection.
Reporting bias is one of several types of bias that can affect AI systems. Understanding the distinctions between them helps in identifying the appropriate mitigation strategy.
| Bias type | Source | When it occurs | Example |
|---|---|---|---|
| Reporting bias | Human communication patterns | During data creation; people document the unusual and omit the obvious | Text rarely states "the sky is blue" because it is assumed knowledge |
| Selection bias | Non-representative data collection | During data sampling; certain groups or instances are over- or under-represented | A medical study that only recruits patients from urban hospitals |
| Sampling bias | Flawed sampling methodology | During data collection; the method of drawing samples introduces systematic error | An online survey that excludes people without internet access |
| Automation bias | Human trust in automated systems | During model deployment; users over-rely on algorithmic outputs | A doctor accepting an AI diagnosis without independent verification |
| Historical bias | Past societal inequities | Before data collection; the world being measured already reflects inequality | Hiring data from an era when women were excluded from certain professions |
| Confirmation bias | Human cognitive tendency | During data interpretation; people seek evidence that confirms preexisting beliefs | A researcher only testing their model on cases where they expect it to succeed |
| Group attribution bias | Cognitive generalization | During data labeling or model design; individual traits are assumed to apply to entire groups | Assuming all members of a demographic group share the same preferences |
| Measurement bias | Inconsistent measurement tools | During data collection; tools or proxies measure differently across groups | Using arrest records as a proxy for crime rates, which conflates policing patterns with criminal behavior |
Addressing reporting bias requires interventions at multiple stages of the machine learning pipeline. No single technique eliminates reporting bias entirely, but a combination of approaches can reduce its impact.
Diversified data sources. Relying on a single type of text (e.g., news articles or Wikipedia) amplifies the reporting biases specific to that genre. Combining data from multiple sources with different communicative norms, such as academic papers, conversational transcripts, instructional texts, and fiction, can partially offset genre-specific reporting biases.
Targeted annotation. The findings of Kamath et al. (2025) show that specific annotator instructions can dramatically increase the representation of underreported information. When image annotators were prompted to include spatial, temporal, and negation information in their captions, the occurrence of these reasoning types increased from near-zero to over 40%. This suggests that reporting bias in annotated datasets can be reduced by designing annotation protocols that explicitly ask for the information that people would normally omit.
Active data collection. Rather than passively relying on existing text corpora, researchers can actively collect data that fills known gaps. For commonsense knowledge, this might involve asking annotators to state obvious properties of objects (e.g., "snow is white," "water is wet") that would normally go unsaid.
Data augmentation. Synthetic data generation can supplement underrepresented categories or properties. Techniques such as SMOTE (Synthetic Minority Oversampling Technique) and generative adversarial networks (GANs) can create additional training examples for underrepresented scenarios. However, these techniques must be applied carefully because synthetic data that does not reflect true diversity can reinforce rather than reduce bias.
Debiasing embeddings. The approach of Bolukbasi et al. (2016) demonstrated that word embeddings can be mathematically modified to remove specific biases while preserving useful properties. Their method identifies a "bias direction" in the embedding space and projects gender-neutral words away from it. Subsequent work has extended this to contextual embeddings and transformer-based models.
Adversarial training. Models can be trained with an adversarial objective that penalizes the model for relying on spurious correlations introduced by reporting bias. This approach forces the model to learn more robust representations that are less sensitive to distributional artifacts in the training data.
Regularization for fairness. Fairness-aware regularization techniques add constraints or penalty terms to the training objective that discourage the model from producing outputs that differ systematically across demographic groups or that overweight rare events.
Bias audits. Systematic evaluation of model outputs across different subgroups, event types, and property categories can reveal where reporting bias is affecting predictions. Tools like REVISE (for visual datasets) use statistical methods to inspect datasets for potential biases along object-based, gender-based, and geography-based dimensions.
Cross-validation with representative test sets. Standard cross-validation on biased data will not reveal reporting bias because the test set shares the same distributional skew as the training set. Evaluation should include test sets that are specifically designed to represent the true distribution of events and properties, not the distribution found in text.
Counterfactual evaluation. Testing models with counterfactual inputs (e.g., swapping demographic attributes while keeping all other features constant) can reveal whether the model has learned spurious associations introduced by reporting bias.
The National Institute of Standards and Technology (NIST) published Special Publication 1270, "Towards a Standard for Identifying and Managing Bias in Artificial Intelligence," in March 2022. The publication identifies three broad categories of AI bias: computational and statistical bias, human bias, and systemic bias. Reporting bias falls primarily under the human bias category, as it originates from the way humans communicate and record information rather than from any algorithmic error.
The NIST framework emphasizes that bias in AI is not solely a technical problem. Reporting bias illustrates this point well: it cannot be fully addressed by improving algorithms alone because the root cause lies in human communicative behavior. Effective mitigation requires a combination of technical interventions, thoughtful data collection practices, and ongoing auditing.
The study of reporting bias in AI has evolved significantly over the past decade.
| Year | Milestone | Researchers |
|---|---|---|
| 2013 | Formal definition of reporting bias in the context of knowledge acquisition from text | Jonathan Gordon, Benjamin Van Durme |
| 2016 | Demonstration that word embeddings encode gender stereotypes; development of debiasing algorithms | Tolga Bolukbasi, Kai-Wei Chang, James Zou, Venkatesh Saligrama, Adam Kalai |
| 2017 | Proof that word embeddings replicate human implicit biases as measured by the IAT | Aylin Caliskan, Joanna Bryson, Arvind Narayanan |
| 2020 | Investigation of whether neural language models overcome reporting bias (finding: partially, but they also amplify rare events) | Vered Shwartz, Yejin Choi |
| 2021 | Comprehensive survey of bias and fairness in machine learning, including reporting bias taxonomy | Ninareh Mehrabi, Fred Morstatter, Nripsuta Saxena, Kristina Lerman, Aram Galstyan |
| 2022 | Discovery that 3.4% to 38.6% of entries in commonsense knowledge bases contain bias | Fred Morstatter, Ninareh Mehrabi, Pei Zhou, Jay Pujara (USC ISI) |
| 2022 | NIST SP 1270 published, establishing a framework for identifying and managing AI bias | NIST |
| 2025 | Demonstration that scaling data and models cannot overcome reporting bias in vision-language reasoning | Amita Kamath et al. |