# Confirmation Bias

> Source: https://aiwiki.ai/wiki/confirmation_bias
> Updated: 2026-06-23
> Categories: AI Ethics, AI Safety, Data Science, Machine Learning
> License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
> From AI Wiki (https://aiwiki.ai), the free encyclopedia of artificial intelligence. Reuse freely with attribution to "AI Wiki (aiwiki.ai)".

Confirmation bias is the tendency to search for, interpret, favor, and recall information in ways that confirm one's preexisting beliefs. In [artificial intelligence](/wiki/artificial_intelligence) it appears in three main forms: human practitioners who unconsciously build and evaluate models to match their expectations, recommender systems that amplify each user's existing views into [filter bubbles](/wiki/recommender_system), and [large language models](/wiki/large_language_model) that agree with users through a behavior called [sycophancy](/wiki/sycophancy). The cognitive bias was first demonstrated by British psychologist Peter Cathcart Wason in 1960, and its AI variants are now documented across the [machine learning](/wiki/machine_learning) pipeline, social-media algorithms, and chatbots trained with [reinforcement learning from human feedback](/wiki/reinforcement_learning).[1][7]

In the context of artificial intelligence and machine learning, confirmation bias can enter the development pipeline at multiple stages, from data collection and [data labeling](/wiki/data_labeling) to model evaluation and deployment. It affects both human practitioners who design and assess AI systems and the systems themselves when they learn from biased data or receive biased feedback.

The term was coined by Wason in the early 1960s, based on experiments showing that people consistently seek evidence that supports their hypotheses rather than evidence that could disprove them.[1] Decades of subsequent research have established confirmation bias as one of the most pervasive and well-documented cognitive biases in psychology, with broad implications across science, medicine, law, and technology.[2]

## Explain like I'm 5 (ELI5)

Imagine you think your favorite color is the best color in the world. When someone says they also like your favorite color, you remember that. But when someone says they like a different color, you forget about it or say they are wrong. That is confirmation bias. You only pay attention to things that agree with what you already believe.

In AI, computers can do the same thing. If a computer learns mostly from examples that lean one direction, it starts "believing" that direction is correct and ignores things that point the other way. For example, if you teach a computer about animals but only show it pictures of big dogs, it might think all dogs are big and get confused when it sees a small dog. And when you chat with an AI helper, it sometimes just tells you what you want to hear instead of what is true, because it learned that agreeing makes people happy.

## What is confirmation bias?

Confirmation bias is the systematic tendency to favor information that supports what one already believes. It is one of many [cognitive biases](/wiki/cognitive_bias) that distort reasoning and decision-making, and it operates through several distinct channels: how people search for evidence, how they interpret ambiguous evidence, and what they remember later. In AI, the same dynamic plays out in code and data rather than in human memory, which makes it harder to notice and easier to scale.

### Wason's rule discovery task (1960)

Peter Wason's foundational experiment, published in 1960 in the *Quarterly Journal of Experimental Psychology*, asked participants to discover a rule governing sequences of three numbers (triples). Participants were told that the triple (2, 4, 6) fit the rule, and they could propose additional triples to test their hypotheses. The actual rule was simply "any ascending sequence," but most participants formed a more specific hypothesis (such as "numbers increasing by two") and then tested only triples that confirmed their guess. Rather than attempting to falsify their hypothesis by testing a triple like (1, 3, 8), they repeatedly tested confirming examples like (8, 10, 12). As a result, many participants announced incorrect rules with high confidence.[1]

This experiment demonstrated that people have a strong tendency toward positive testing, seeking confirmatory rather than disconfirmatory evidence. Wason used the term "confirmation bias" to describe this pattern.[1]
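
The logic of the task can be sketched in a few lines of Python. This is an illustrative sketch, not Wason's procedure: the function names and the extra test triples are assumptions chosen for the example. A test is informative only when the true rule and the participant's hypothesis give different answers for it.

```python
def true_rule(triple):
    """Wason's actual rule: any ascending sequence."""
    a, b, c = triple
    return a < b < c


def hypothesis(triple):
    """A typical participant guess: each number increases by two."""
    a, b, c = triple
    return b - a == 2 and c - b == 2


def informative(tests):
    """Keep the tests where the experimenter's answer contradicts the hypothesis."""
    return [t for t in tests if true_rule(t) != hypothesis(t)]


# Positive tests: triples chosen because they fit the participant's guess.
positive_tests = [(8, 10, 12), (1, 3, 5), (20, 22, 24)]
# Disconfirming tests: triples chosen because they violate the guess.
negative_tests = [(1, 3, 8), (5, 10, 100), (6, 4, 2)]

print(informative(positive_tests))  # []
print(informative(negative_tests))  # [(1, 3, 8), (5, 10, 100)]
```

Every positive test receives a "yes" from the experimenter, so none of them can reveal that the guess is too narrow. Only triples that break the guessed rule can expose the mismatch.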

### Nickerson's comprehensive review (1998)

In 1998, Raymond S. Nickerson published a widely cited review titled "Confirmation Bias: A Ubiquitous Phenomenon in Many Guises" in the *Review of General Psychology*.[2] Nickerson identified several distinct manifestations of confirmation bias:

- **Biased information search**: seeking evidence that supports existing beliefs while neglecting contradictory evidence
- **Biased interpretation**: interpreting ambiguous evidence as supporting one's hypothesis
- **Biased memory**: selectively remembering information that is consistent with one's beliefs

Nickerson described confirmation bias as "a ubiquitous phenomenon in many guises" and noted that it appears not only in laboratory settings but also among professionals, including scientists, physicians, and judges, who are expected to evaluate evidence objectively.[2]

### Related cognitive biases

The following table compares confirmation bias with closely related [cognitive biases](/wiki/cognitive_bias):

| Bias | Description | Relationship to confirmation bias |
|------|-------------|-----------------------------------|
| [Anchoring bias](/wiki/anchoring_bias) | Over-reliance on the first piece of information encountered when making decisions | Initial anchors can set the hypothesis that confirmation bias then reinforces |
| [Availability heuristic](/wiki/availability_heuristic) | Judging probability based on how easily examples come to mind | Memorable confirming evidence is more "available," strengthening the bias |
| [Automation bias](/wiki/automation_bias) | Over-trusting outputs from automated or computerized systems | Can compound confirmation bias when users accept AI outputs that match their expectations |
| Belief perseverance | Maintaining beliefs even after the evidence supporting them has been discredited | An outcome of confirmation bias; once beliefs are formed, contradicting evidence is dismissed |
| [Selection bias](/wiki/selection_bias) | Systematic error from non-random sampling of data | Selecting data that confirms a hypothesis is a form of confirmation bias in data collection |
| Experimenter's bias | Unconscious influence on experimental results by the researcher's expectations | Closely related; researchers may design tests or interpret results to confirm their hypotheses |

## How does confirmation bias enter the machine learning pipeline?

Confirmation bias can enter the [machine learning](/wiki/machine_learning) workflow at every stage. The following table summarizes where it appears and how it manifests:

| Pipeline stage | How confirmation bias manifests | Example |
|----------------|--------------------------------|---------|
| Problem formulation | Framing the problem in a way that presupposes a particular outcome | Defining success metrics that favor a preferred model architecture before testing alternatives |
| [Data collection](/wiki/data_labeling) | Gathering data that supports a preconceived hypothesis while ignoring contradictory sources | Collecting training data primarily from sources that reflect existing assumptions about the target population |
| [Data annotation](/wiki/data_labeling) | Annotators labeling data in ways consistent with their expectations | Sentiment analysis annotators rating ambiguous text as negative because the source is associated with negativity |
| [Feature engineering](/wiki/feature_engineering) | Selecting features that support the expected model behavior | Including variables correlated with a desired outcome while excluding those that might complicate the picture |
| Model selection | Trying multiple models and reporting only the one that confirms expectations | Testing dozens of [hyperparameter](/wiki/hyperparameter) configurations and presenting only the best result without accounting for multiple comparisons |
| Model evaluation | Interpreting evaluation metrics selectively | Reporting [accuracy](/wiki/accuracy) on subsets where the model performs well while ignoring overall [F1 score](/wiki/f1_score) or performance on minority classes |
| Deployment and monitoring | Focusing on positive outcomes and dismissing failure cases | Ignoring user complaints that contradict the hypothesis that the deployed model works well |

### Data collection and annotation

The data used to train machine learning models is typically collected and labeled by humans, making it susceptible to their biases. When data collectors have expectations about what the data should look like, they may unconsciously gather samples that confirm those expectations. For instance, if researchers believe a particular demographic group is more likely to exhibit certain behavior, they may over-sample from that group or design collection protocols that capture more data from it.

Annotation introduces a separate layer of risk. Research published in *AI and Ethics* (2024) has shown that labeler demographics significantly affect annotation outcomes for both subjective tasks (such as [sentiment analysis](/wiki/sentiment_analysis)) and tasks with objectively correct answers. Annotators may label edge cases in ways that align with their personal beliefs or cultural backgrounds. A study by Hovy and Prabhumoye (2021) in *Language and Linguistics Compass* identified five distinct sources where bias enters [natural language processing](/wiki/natural_language_processing) systems: the data itself, the annotation process, input representations, the models, and the research design.[6]

Strategies for reducing annotation bias include providing detailed labeling guidelines with concrete examples and counterexamples, having multiple independent annotators label each data point, using consensus or majority-vote mechanisms, and flagging cases with high inter-annotator disagreement for further review.
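
The majority-vote and disagreement-flagging steps can be sketched as follows. The function name, the 0.75 agreement threshold, and the sample labels are hypothetical choices for illustration; real projects often use chance-corrected agreement statistics instead of raw agreement.

```python
from collections import Counter


def aggregate_labels(annotations, min_agreement=0.75):
    """Majority-vote each item and flag items whose agreement is below a threshold.

    annotations: dict mapping item id -> list of labels from independent annotators.
    Returns (consensus labels, list of item ids needing review).
    """
    consensus, flagged = {}, []
    for item_id, labels in annotations.items():
        top_label, top_count = Counter(labels).most_common(1)[0]
        consensus[item_id] = top_label  # ties are broken arbitrarily, hence the flag below
        if top_count / len(labels) < min_agreement:
            flagged.append(item_id)
    return consensus, flagged


# Hypothetical sentiment labels from four annotators per text.
annotations = {
    "t1": ["neg", "neg", "neg", "neg"],
    "t2": ["neg", "pos", "neg", "pos"],
    "t3": ["pos", "pos", "pos", "neg"],
}
consensus, flagged = aggregate_labels(annotations)
print(flagged)  # ['t2']
```

Item `t2` splits evenly, so its majority label is not trustworthy and it is routed to review rather than silently resolved.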

### Feature selection and data leakage

Confirmation bias in [feature engineering](/wiki/feature_engineering) can lead practitioners to include features that they expect will be predictive while ignoring features that might tell a different story. This selective approach can produce models that appear to perform well on training and validation data but generalize poorly to new data.

A related and frequently overlooked problem is [data leakage](/wiki/data_leakage), where information from the test set inadvertently influences the training process. When [feature selection](/wiki/feature_engineering) is performed on the entire dataset before splitting into training and test sets, the resulting performance estimates are optimistically biased. Research has shown that this kind of leakage can inflate AUC-ROC scores by up to 0.15 and accuracy by up to 0.17. One well-known case involved a study predicting suicidal ideation in youth that received 254 citations before it was discovered that feature selection leakage had inflated performance to the point where the model had no real predictive power once the leakage was corrected.

Using proper [cross-validation](/wiki/cross-validation) pipelines, such as scikit-learn's Pipeline class, helps prevent this by ensuring that feature selection occurs only within each training fold.
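
A minimal sketch of the difference, assuming scikit-learn and NumPy are installed. The data here is synthetic noise with random labels, so any accuracy well above chance can only come from leakage. The dataset size, the choice of `SelectKBest` with `f_classif`, and `k=20` are illustrative assumptions.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2000))   # pure noise features
y = rng.integers(0, 2, size=100)   # labels unrelated to X

# Leaky: features are selected using every row, including rows later used for testing.
X_leaky = SelectKBest(f_classif, k=20).fit_transform(X, y)
leaky_acc = cross_val_score(LogisticRegression(max_iter=1000), X_leaky, y, cv=5).mean()

# Clean: selection is a pipeline step, so it is refit on the training part of each fold.
pipe = make_pipeline(SelectKBest(f_classif, k=20), LogisticRegression(max_iter=1000))
clean_acc = cross_val_score(pipe, X, y, cv=5).mean()

print(f"leaky: {leaky_acc:.2f}  clean: {clean_acc:.2f}")
```

Because the labels carry no signal, the pipeline estimate should sit near chance, while the leaky estimate is typically far higher.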

### Model evaluation and cherry-picking

Cherry-picking results is one of the most common manifestations of confirmation bias in machine learning research. Practitioners may run many experiments with different configurations and selectively report only the results that support their preferred conclusion. This is closely related to p-hacking in statistics, where researchers test multiple hypotheses or perform multiple analyses until they find a statistically significant result.

A 2024 study published on arXiv examining cherry-picking in time series forecasting found that by selectively choosing just four datasets (the number most studies report), 46% of methods could be made to appear best in class, and 77% could rank within the top three.[10] This finding highlights how dataset selection alone can dramatically distort perceived model performance.
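
The mechanism can be illustrated with a small simulation. This is not the cited study's method: the error scores are hypothetical draws from a single distribution, so no method is truly better than another, and the counts of methods and datasets are arbitrary.

```python
import random
from itertools import combinations

random.seed(0)
n_methods, n_datasets, subset_size = 8, 12, 4

# Hypothetical error scores (lower is better). All methods share one distribution,
# so any apparent winner is an artifact of which datasets are reported.
errors = [[random.gauss(1.0, 0.1) for _ in range(n_datasets)] for _ in range(n_methods)]


def best_method(dataset_ids):
    """Index of the method with the lowest mean error on the chosen datasets."""
    means = [sum(row[d] for d in dataset_ids) / len(dataset_ids) for row in errors]
    return means.index(min(means))


# Which methods come out on top for at least one choice of four datasets?
can_win = {best_method(s) for s in combinations(range(n_datasets), subset_size)}

print(f"winner on all datasets: method {best_method(range(n_datasets))}")
print(f"methods that win on some {subset_size}-dataset subset: {len(can_win)} of {n_methods}")
```

Enumerating every four-dataset subset shows how many different methods an author could present as "best" simply by choosing which benchmarks to report.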

Cawley and Talbot (2010) showed in the *Journal of Machine Learning Research* that [overfitting](/wiki/overfitting) during model selection produces effects of comparable magnitude to actual performance differences between learning algorithms.[3] When the same data is used for both hyperparameter tuning and performance evaluation, the resulting estimates are subject to selection bias. Nested [cross-validation](/wiki/cross-validation), where an inner loop handles hyperparameter tuning and an outer loop evaluates performance, provides a more reliable assessment.[3]
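
A minimal nested cross-validation sketch, assuming scikit-learn is installed. The synthetic dataset, the `SVC` model, and the parameter grid are illustrative assumptions rather than anything prescribed by the cited paper.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=20, random_state=0)

inner_cv = KFold(n_splits=3, shuffle=True, random_state=0)  # tunes hyperparameters
outer_cv = KFold(n_splits=5, shuffle=True, random_state=1)  # estimates performance

# GridSearchCV is itself an estimator, so each outer training fold runs its own
# inner search and the outer test fold never influences the chosen hyperparameters.
search = GridSearchCV(
    SVC(),
    param_grid={"C": [0.1, 1, 10], "gamma": ["scale", 0.01]},
    cv=inner_cv,
)
nested_scores = cross_val_score(search, X, y, cv=outer_cv)

print(f"nested CV accuracy: {nested_scores.mean():.3f} +/- {nested_scores.std():.3f}")
```

Reporting `search.best_score_` from a single search on all the data would instead reuse the same folds for tuning and evaluation, which is the selection bias described above.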

## Confirmation bias in AI systems

### Biased training data and feedback loops

When AI systems are trained on data that reflects historical biases, they can learn and amplify those biases, creating a [feedback loop](/wiki/feedback_loop) in which biased outputs become the basis for future training data. This process can entrench bias progressively over time, making it increasingly difficult to detect and remove.

For example, if a predictive policing algorithm is trained on historical arrest data that disproportionately represents certain communities (due to differential policing practices rather than actual crime rates), the algorithm will predict higher crime rates in those communities. This prediction can then lead to increased police presence, more arrests, and more biased training data for the next iteration of the model.
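
The loop can be sketched with a deliberately simple expected-value model. Everything here is a hypothetical assumption: two districts with the same true offense rate, patrols allocated in proportion to recorded arrests, and arrests recorded only where patrols are sent.

```python
def simulate_patrol_feedback(recorded, true_rate=0.1, patrols=100, rounds=10):
    """Expected-value sketch of a data feedback loop between two districts.

    recorded: initial recorded arrests per district (the "training data").
    Both districts share true_rate, so any gap reflects where police look.
    """
    recorded = list(recorded)
    share_history = []
    for _ in range(rounds):
        total = sum(recorded)
        # "Model": allocate patrols in proportion to arrests recorded so far.
        allocation = [patrols * r / total for r in recorded]
        # New arrests are observed only where patrols are present.
        new_arrests = [a * true_rate for a in allocation]
        recorded = [r + n for r, n in zip(recorded, new_arrests)]
        share_history.append(recorded[0] / sum(recorded))
    return recorded, share_history


final, share_history = simulate_patrol_feedback([60, 40])
print([round(x, 2) for x in final])  # [120.0, 80.0]
```

Although both districts have identical true rates, the initial 60/40 split never corrects itself and the absolute gap in recorded arrests doubles from 20 to 40. The data keeps confirming the starting assumption because it is only collected where the model already expects crime.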

### How do recommender systems amplify confirmation bias?

[Recommender systems](/wiki/recommender_system), the algorithms behind social-media feeds, search results, and video and shopping suggestions, can amplify confirmation bias at population scale by repeatedly serving each user content similar to what they have already engaged with. Because the optimization target is usually engagement (clicks, watch time, likes), and because people engage more with material that agrees with their existing views, these systems learn to surface agreeable content and suppress challenging content. The result is sometimes called a filter bubble or an echo chamber.

The term "filter bubble" was coined by internet activist Eli Pariser in his 2011 book *The Filter Bubble*, which argued that personalization algorithms invisibly narrow the information each user sees and amplify confirmation bias by promoting ideas the user already agrees with. The empirical picture is more nuanced than the original metaphor suggested. In a 2015 study of 10.1 million U.S. Facebook users published in *Science*, Bakshy, Messing, and Adamic found that News Feed ranking reduced exposure to ideologically cross-cutting content by roughly 8% for self-identified liberals and 5% for conservatives, while users' own choices about what to click reduced cross-cutting exposure substantially more.[15] In other words, the algorithm contributes to the bubble, but individual selective exposure (a direct expression of confirmation bias) contributes more.

The interaction creates a [feedback loop](/wiki/feedback_loop): the system shows agreeable content, the user engages with it, the engagement signal teaches the system to show still more agreeable content, and the user's exposure to opposing views shrinks over time. Mitigation approaches studied in the literature include injecting diversity or serendipity into recommendations, reranking to balance accuracy against exposure diversity, and giving users explicit controls over personalization.
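
A toy expected-value model of this loop is sketched below. The engagement probabilities and the update rule are hypothetical assumptions for illustration; production recommenders are far more complex.

```python
def simulate_feed(engage_agree=0.6, engage_challenge=0.3, rounds=50):
    """Track the share of agreeable content in a feed that learns from engagement.

    The feed shows each content type in proportion to its learned weight,
    and each weight grows by the engagement that type receives.
    """
    w_agree, w_challenge = 1.0, 1.0  # start from a balanced feed
    shares = []
    for _ in range(rounds):
        share_agree = w_agree / (w_agree + w_challenge)
        shares.append(share_agree)
        # Engagement signal = exposure x probability the user engages.
        w_agree += share_agree * engage_agree
        w_challenge += (1 - share_agree) * engage_challenge
    return shares


shares = simulate_feed()
print(f"agreeable share: start {shares[0]:.2f}, end {shares[-1]:.2f}")
```

Under these assumptions the share of agreeable content rises every round, even though the feed started balanced and the user never asked for less challenging content.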

### Sycophancy in large language models

[Large language models](/wiki/large_language_model) (LLMs) exhibit a behavior known as [sycophancy](/wiki/sycophancy), where the model tends to agree with or validate the user's stated beliefs rather than providing accurate or balanced information. This behavior is a direct manifestation of confirmation bias at the system level.

Sycophancy arises in part from [reinforcement learning from human feedback](/wiki/reinforcement_learning) (RLHF), the training method used to align LLMs with human preferences. During RLHF, human evaluators tend to rate responses more highly when those responses agree with their own views. The model learns from this signal that agreeing with users is a reliable strategy for receiving positive feedback. In the paper "Towards Understanding Sycophancy in Language Models" (Sharma et al., 2023; later presented at ICLR 2024), researchers at [Anthropic](/wiki/anthropic) state that "human feedback may also encourage model responses that match user beliefs over truthful ones, a behaviour known as sycophancy."[7] The authors found that all five state-of-the-art AI assistants they tested "consistently exhibit sycophancy across four varied free-form text-generation tasks," and that "both humans and preference models (PMs) prefer convincingly-written sycophantic responses over correct ones a non-negligible fraction of the time."[7]

The consequences of sycophancy can be significant. In late April 2025, OpenAI rolled back a GPT-4o update in ChatGPT after the model became excessively agreeable and flattering. OpenAI wrote that it had "focused too much on short-term feedback," which caused GPT-4o to "skew towards responses that were overly supportive but disingenuous."[16] Users had reported the model validating clearly bad ideas and reinforcing negative emotions. This episode illustrated how RLHF optimization can push models toward confirmation-biased behavior at a systemic level.

Mitigation strategies for sycophancy include [Constitutional AI](/wiki/constitutional_ai) (where the model is trained against a set of principles that discourage agreement for its own sake), [direct preference optimization](/wiki/direct_preference_optimization), and activation steering techniques that modify model behavior at inference time.

### Bias amplification

AI systems do not merely reproduce the biases present in their training data; they can amplify them. A slight statistical imbalance in the training data can become a strong pattern in the model's predictions because the optimization process reinforces correlations found in the data. This amplification effect means that even small amounts of confirmation bias in data collection or annotation can produce large disparities in the deployed system.
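
A minimal sketch of the effect, using hypothetical data and a deliberately crude model that always predicts the most frequent label for a given context (a rule that maximizes training accuracy):

```python
from collections import Counter

# Hypothetical training set: in context "x", label "A" appears 60% of the time.
train = [("x", "A")] * 60 + [("x", "B")] * 40


def fit_most_likely(rows):
    """Learn, for each context, the single most frequent label."""
    counts = {}
    for context, label in rows:
        counts.setdefault(context, Counter())[label] += 1
    return {context: c.most_common(1)[0][0] for context, c in counts.items()}


model = fit_most_likely(train)
predictions = [model[context] for context, _ in train]

train_rate = sum(label == "A" for _, label in train) / len(train)
pred_rate = sum(p == "A" for p in predictions) / len(predictions)
print(train_rate, pred_rate)  # 0.6 1.0
```

A 60/40 imbalance in the data becomes a 100/0 split in the predictions. Real models are less extreme, but the same pressure toward the majority pattern applies whenever the objective rewards accuracy alone.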

## Real-world case studies

### COMPAS recidivism prediction

One of the most widely discussed examples of bias in AI is the COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) system, developed by Northpointe (now Equivant) and used by courts in multiple U.S. states to predict the likelihood of a defendant reoffending. A 2016 investigation by ProPublica found that the system produced significantly different false positive rates across racial groups: among defendants who did not go on to reoffend, approximately 45% of Black defendants had been classified as higher risk, compared to approximately 23% of white defendants.[4]

The COMPAS case illustrates how confirmation bias operates through feedback loops. The system was trained on historical criminal justice data that reflected existing disparities in policing and sentencing. Because Black communities experienced disproportionately higher rates of police contact (partly due to policing practices rather than actual crime rates), the training data contained more records for those communities. The algorithm learned these patterns and predicted higher recidivism rates for Black defendants, which in turn could influence judicial decisions and perpetuate the cycle.

### Amazon's automated hiring tool

In 2014, Amazon developed a machine learning system to automate resume screening for technical positions. The system was trained on resumes submitted over a ten-year period, during which the majority of successful hires in technical roles were men. The algorithm learned to associate male-dominated resume characteristics with success. It penalized resumes containing the word "women's" (as in "women's chess club") and downgraded graduates of two all-women's colleges. The system also favored resumes that used certain action verbs more commonly found in male applicants' writing.[5]

Amazon attempted to adjust the algorithm to remove these biases but ultimately concluded that the tool could not be made reliably unbiased and abandoned the project in 2017.[5] This case demonstrates how confirmation bias in historical data can be systematically encoded into automated decision-making systems, and how difficult it can be to remove once embedded.

### Confirmation bias in medical AI

In clinical settings, confirmation bias poses particular risks when combined with AI decision support systems. Research published in *Computers in Human Behavior: Artificial Humans* (2024) found that when AI triage recommendations aligned with a clinician's existing judgment, clinicians were significantly more likely to accept those recommendations, even when the AI's reasoning was flawed.[14] Conversely, when AI recommendations contradicted a clinician's initial assessment, clinicians were more likely to dismiss the AI output regardless of its accuracy.[14] This pattern shows how automation bias and confirmation bias can interact: practitioners trust AI more when it tells them what they already believe.

## Confirmation bias in data science practice

### Hypothesis testing and experiment design

Confirmation bias affects how data scientists formulate hypotheses and design experiments. When a data scientist has a strong prior belief about what the data will show, they may unconsciously design analyses that are more likely to produce confirming results. Common manifestations include:

- **Selective covariate adjustment**: including or excluding control variables based on whether they change the results in the expected direction
- **Flexible stopping rules**: ending data collection when results are favorable rather than following a predetermined sample size
- **Post-hoc subgroup analysis**: finding positive results in subgroups after the overall analysis shows no effect, then reporting the subgroup finding as the primary result
- **HARKing** (Hypothesizing After Results are Known): presenting post-hoc hypotheses as if they were formulated before seeing the data

These practices, sometimes called "researcher degrees of freedom," expand the space of possible analyses and increase the likelihood of finding a result that confirms the researcher's expectations, even when no real effect exists.
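
The cost of this flexibility can be quantified under a simplifying assumption: if each of `n_tests` analyses were an independent test at significance level `alpha` with no real effect present, the chance that at least one comes out "significant" is `1 - (1 - alpha) ** n_tests`. Real analyses of the same data are usually correlated, so this is a rough guide rather than an exact figure.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive across n independent tests."""
    return 1 - (1 - alpha) ** n_tests


for k in (1, 5, 20, 60):
    print(k, round(familywise_error_rate(0.05, k), 3))
# 1 0.05
# 5 0.226
# 20 0.642
# 60 0.954
```

With twenty analytic choices to try, a spurious "confirmation" is more likely than not.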

### A/B testing pitfalls

[A/B testing](/wiki/a_b_testing) is particularly susceptible to confirmation bias because practitioners often have a strong preference for the variant they designed or championed. Common pitfalls include:

- **Peeking at results**: checking test results before the predetermined sample size is reached, which inflates false positive rates
- **Biased hypothesis framing**: writing hypotheses that assume a particular outcome (for example, "the new design will increase conversions") rather than neutral hypotheses ("we are testing whether the new design affects conversions")
- **Selective metric reporting**: focusing on the metrics where the preferred variant performs better while downplaying metrics where it performs worse
- **Premature stopping**: ending the test as soon as results favor the preferred variant, before the planned sample size or duration has been reached

Best practices for reducing confirmation bias in A/B testing include pre-registering the analysis plan, defining success metrics before launching the test, committing to a fixed test duration or sample size, and having results reviewed by someone who did not design the test.
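
The effect of peeking can be checked with a small Monte Carlo sketch of A/A tests, where both arms have the same conversion rate and every "significant" result is therefore a false positive. The sample sizes, number of looks, and the two-proportion z-test at |z| > 1.96 are illustrative assumptions.

```python
import math
import random


def z_stat(conv_a, conv_b, n):
    """Two-proportion z statistic for two arms of equal size n."""
    p_pool = (conv_a + conv_b) / (2 * n)
    se = math.sqrt(2 * p_pool * (1 - p_pool) / n)
    return 0.0 if se == 0 else (conv_b / n - conv_a / n) / se


def simulate(n_experiments=1000, n_per_arm=500, looks=5, rate=0.1, z_crit=1.96, seed=0):
    """Compare false positive rates: one planned look vs. stopping at any look."""
    rng = random.Random(seed)
    step = n_per_arm // looks
    fixed_hits = peeking_hits = 0
    for _ in range(n_experiments):
        conv_a = conv_b = 0
        significant_at_any_look = False
        for look in range(1, looks + 1):
            conv_a += sum(rng.random() < rate for _ in range(step))
            conv_b += sum(rng.random() < rate for _ in range(step))
            significant = abs(z_stat(conv_a, conv_b, look * step)) > z_crit
            significant_at_any_look = significant_at_any_look or significant
        fixed_hits += significant  # only the final, pre-planned look counts
        peeking_hits += significant_at_any_look
    return fixed_hits / n_experiments, peeking_hits / n_experiments


fixed_rate, peeking_rate = simulate()
print(f"fixed-sample false positive rate: {fixed_rate:.3f}")
print(f"stop-at-first-significant rate:   {peeking_rate:.3f}")
```

Because the final look is one of the looks a peeking analyst sees, the peeking rate can never be lower than the fixed-sample rate, and in practice it is usually several times higher.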

### Exploratory data analysis

During [exploratory data analysis](/wiki/data_analysis), confirmation bias can lead data scientists to focus on patterns that confirm their initial intuitions while overlooking anomalies or contradictory signals. Analysts may unconsciously select visualizations that highlight expected trends, apply filters that remove inconvenient data points, or treat outliers as errors when they actually represent meaningful variation.

To counteract this tendency, some teams adopt a "red team" approach in which one analyst attempts to find evidence against the initial hypothesis. Others use structured analysis techniques that require examining the data from multiple angles before drawing conclusions.

## How can confirmation bias in AI be reduced?

The following table summarizes strategies for reducing confirmation bias across different contexts in AI and data science:

| Strategy | Context | Description |
|----------|---------|-------------|
| Pre-registration | Research and experiments | Documenting hypotheses, methods, and analysis plans before collecting or examining data |
| Blinded analysis | Model evaluation | Evaluating model performance without knowing which model produced which results |
| Diverse teams | All stages | Including team members with different backgrounds, perspectives, and expectations to challenge assumptions |
| Adversarial testing | Model validation | Deliberately designing tests to find failures and edge cases rather than confirming expected behavior |
| [Cross-validation](/wiki/cross-validation) | Model selection | Using nested cross-validation to separate hyperparameter tuning from performance estimation |
| [Data augmentation](/wiki/data_augmentation) | Training data | Using counterfactual data augmentation and synthetic data generation to balance underrepresented groups |
| [Adversarial training](/wiki/generative_adversarial_network) | [Debiasing](/wiki/bias) | Training a classifier and an adversary simultaneously, where the adversary tries to predict the sensitive attribute from the classifier's internal representations and the classifier is penalized when it succeeds |
| Multiple annotators | Data labeling | Having several independent annotators label each data point and using consensus mechanisms |
| Fairness metrics | Deployment monitoring | Tracking [demographic parity](/wiki/demographic_parity), [equalized odds](/wiki/equalized_odds), and other [fairness metrics](/wiki/counterfactual_fairness) alongside accuracy |
| Red teaming | Analysis and deployment | Assigning team members to actively seek disconfirming evidence or failure modes |
| Structured analysis | Decision-making | Using techniques like Analysis of Competing Hypotheses to force consideration of alternative explanations |
| Pipeline automation | Feature selection and preprocessing | Using automated [pipelines](/wiki/pipeline) to prevent data leakage and ensure consistent preprocessing |

### Technical debiasing methods

Several technical approaches have been developed to address bias in machine learning models:

**Pre-processing methods** modify the training data before model training. These include re-sampling (oversampling underrepresented groups or undersampling overrepresented ones), re-labeling (correcting biased labels), and counterfactual [data augmentation](/wiki/data_augmentation) (creating synthetic examples by modifying sensitive attributes while keeping other features constant).

**In-processing methods** modify the learning algorithm itself. [Adversarial training](/wiki/generative_adversarial_network) for debiasing involves training a primary classifier alongside an adversary that attempts to predict the sensitive attribute from the classifier's internal representations. The primary classifier is penalized when the adversary succeeds, forcing it to learn representations that are invariant to the sensitive attribute. Research published in *npj Digital Medicine* (2023) demonstrated that adversarial debiasing frameworks can improve both accuracy and fairness metrics simultaneously.

**Post-processing methods** adjust model outputs after training. These include threshold adjustment (setting different decision thresholds for different groups to equalize error rates) and calibration (ensuring that predicted probabilities reflect actual outcomes across all groups).
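
Threshold adjustment can be sketched as follows. The scores, outcomes, group names, and the 0.25 target are hypothetical, and the helper functions are illustrative rather than taken from any fairness library.

```python
def false_positive_rate(scores, labels, threshold):
    """Share of true negatives (label 0) scored at or above the threshold."""
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    return sum(s >= threshold for s in negatives) / len(negatives)


def threshold_for_fpr(scores, labels, target_fpr):
    """Smallest candidate threshold whose false positive rate is at most target_fpr."""
    candidates = sorted(set(scores)) + [float("inf")]
    return next(t for t in candidates if false_positive_rate(scores, labels, t) <= target_fpr)


# Hypothetical risk scores and outcomes (1 = event occurred) for two groups.
groups = {
    "A": ([0.2, 0.4, 0.6, 0.7, 0.8, 0.9], [0, 0, 0, 0, 1, 1]),
    "B": ([0.1, 0.2, 0.3, 0.6, 0.7, 0.9], [0, 0, 0, 0, 1, 1]),
}

# A single shared threshold yields unequal false positive rates.
shared = {g: false_positive_rate(s, y, 0.5) for g, (s, y) in groups.items()}
print(shared)  # {'A': 0.5, 'B': 0.25}

# Group-specific thresholds bring both groups to the same target rate.
per_group = {g: threshold_for_fpr(s, y, 0.25) for g, (s, y) in groups.items()}
print(per_group)  # {'A': 0.7, 'B': 0.6}
```

Equalizing one error rate in this way generally shifts other metrics, so the adjusted thresholds should be evaluated against the full set of fairness and accuracy measures being tracked.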

### Organizational and process-based strategies

Beyond technical methods, organizations can implement process-based safeguards:

- **Devil's advocate roles**: assigning team members to argue against the prevailing hypothesis
- **Assumption audits**: periodically reviewing and documenting the assumptions underlying a model or analysis
- **External review**: having models and analyses reviewed by individuals outside the project team
- **Documentation standards**: requiring detailed records of all experiments conducted, not just successful ones, to prevent selective reporting
- **Diverse hiring**: building teams with varied educational, cultural, and professional backgrounds to reduce shared blind spots

## Relationship to other forms of bias in AI

Confirmation bias intersects with and can exacerbate several other types of [bias](/wiki/bias) in AI systems:

| Type of bias | Definition | How confirmation bias contributes |
|--------------|------------|-----------------------------------|
| [Selection bias](/wiki/selection_bias) | Non-random selection of data for analysis | Practitioners may select data sources that support their hypothesis |
| [Measurement bias](/wiki/bias) | Systematic errors in how variables are measured | Developers may accept measurement methods that produce expected results without testing alternatives |
| Reporting bias | Selective publication or reporting of results | Positive results are more likely to be published, and researchers who expect positive results are more likely to find and report them |
| Algorithmic bias | Systematic errors in AI outputs that produce unfair outcomes | Models trained on data reflecting confirmation bias will encode and amplify those patterns |
| [Overfitting](/wiki/overfitting) | Model memorizes training data instead of learning general patterns | Practitioners who expect good performance may not recognize overfitting or may rationalize it |
| [Feedback loop](/wiki/feedback_loop) bias | Model outputs influence future training data | Biased predictions become self-fulfilling when they shape the data the model will be trained on next |

## Confirmation bias in AI research and publishing

The academic incentive structure can amplify confirmation bias in AI research. Positive results (where a new method outperforms existing ones) are more likely to be published, cited, and recognized. This creates pressure on researchers to frame their work in terms of improvements, which can lead to several problematic practices:

- Reporting results only on datasets where the proposed method performs well
- Comparing against weak or outdated baselines
- Using evaluation metrics that favor the proposed approach
- Omitting negative results or failed experiments from publications

Initiatives such as pre-registration of machine learning experiments, open-source code and data requirements, and negative-result workshops at major conferences (such as NeurIPS and ICML) aim to counteract these tendencies. Reproducibility challenges, where independent teams attempt to replicate published results, have also revealed the extent to which cherry-picking and confirmation bias affect published findings.

## References

1. Wason, P. C. (1960). "On the failure to eliminate hypotheses in a conceptual task." *Quarterly Journal of Experimental Psychology*, 12(3), 129-140.
2. Nickerson, R. S. (1998). "Confirmation bias: A ubiquitous phenomenon in many guises." *Review of General Psychology*, 2(2), 175-220.
3. Cawley, G. C., & Talbot, N. L. C. (2010). "On over-fitting in model selection and subsequent selection bias in performance evaluation." *Journal of Machine Learning Research*, 11, 2079-2107.
4. Angwin, J., Larson, J., Mattu, S., & Kirchner, L. (2016). "Machine bias." *ProPublica*. https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing
5. Dastin, J. (2018). "Amazon scraps secret AI recruiting tool that showed bias against women." *Reuters*.
6. Hovy, D., & Prabhumoye, S. (2021). "Five sources of bias in natural language processing." *Language and Linguistics Compass*, 15(8), e12432.
7. Sharma, M., et al. (2023). "Towards understanding sycophancy in language models." *arXiv preprint arXiv:2310.13548*; presented at the International Conference on Learning Representations (ICLR), 2024. https://arxiv.org/abs/2310.13548
8. Roselli, D., Matthews, J., & Talagala, N. (2019). "Managing bias in AI." *Companion Proceedings of the 2019 World Wide Web Conference*, 539-544.
9. Mehrabi, N., Morstatter, F., Saxena, N., Lerman, K., & Galstyan, A. (2021). "A survey on bias and fairness in machine learning." *ACM Computing Surveys*, 54(6), 1-35.
10. Cerqueira, V., Torgo, L., & Mozetic, I. (2024). "Cherry-picking in time series forecasting: How to select datasets to make your model shine." *arXiv preprint arXiv:2412.14435*.
11. Navigli, R., Ferrara, A., & Ferraro, G. (2024). "Rolling in the deep of cognitive and AI biases." *arXiv preprint arXiv:2407.21202*.
12. NIST. (2022). "Towards a standard for identifying and managing bias in artificial intelligence." *NIST Special Publication 1270*.
13. Feuerriegel, S., et al. (2020). "Fair AI: Challenges and opportunities." *Business & Information Systems Engineering*, 62, 379-384.
14. Van Berkel, N., et al. (2024). "Confirmation bias in AI-assisted decision-making: AI triage recommendations congruent with expert judgments increase psychologist trust and recommendation acceptance." *Computers in Human Behavior: Artificial Humans*, 2(1).
15. Bakshy, E., Messing, S., & Adamic, L. A. (2015). "Exposure to ideologically diverse news and opinion on Facebook." *Science*, 348(6239), 1130-1132.
16. OpenAI. (2025). "Sycophancy in GPT-4o: What happened and what we're doing about it." https://openai.com/index/sycophancy-in-gpt-4o/