Hallucination in artificial intelligence (AI) refers to outputs produced by generative models that are fluent and plausible but not supported by the source, input, or external reality. In large language model (LLM) systems and multimodal models, hallucinations include fabricated facts, incorrect citations, and contradictions of provided context. Research distinguishes errors of faithfulness to a given input from errors of factuality with respect to world knowledge, and documents that hallucination is prevalent across tasks such as abstractive summarization, question answering, dialogue, and vision–language reasoning.[1][2][3][4]
While rates depend on model, data, and task, a common view is that hallucinations remain persistent and difficult to fully eliminate in current probabilistic next-token prediction systems. They can be mitigated via grounding (for example RAG), training and decoding choices, post-hoc detection, and better evaluation protocols.[5][6][7][8][1]
Terminology and definitions
Hallucination (general): generation of content that is not supported by the source or reality, despite surface-level coherence. The term draws on psychology but in AI denotes a technical failure mode rather than a human-like perceptual phenomenon.[1]
Faithfulness vs. factuality: Faithfulness evaluates consistency with the given input (for example a source article), while factuality evaluates agreement with established external facts; a model can be faithful yet factually wrong if the input itself is wrong, or unfaithful (adding unsupported details) while remaining factually plausible.[2][1]
Intrinsic vs. extrinsic hallucinations: Intrinsic (input-contradicting) errors conflict with the source; extrinsic (unsupported) errors introduce unverifiable or new content not grounded in the source.[1]
Taxonomy
Surveys and position papers propose overlapping taxonomies organized along two axes: what goes wrong (the type of error) and why it goes wrong (the underlying cause).[1][9][4]
Evaluation and detection
Researchers evaluate hallucination with specialized benchmarks and metrics:
TruthfulQA: evaluates whether models avoid widely held misconceptions; many models mimic human falsehoods without explicit grounding.[3]
Human faithfulness annotation: in summarization, human studies reveal substantial hallucinated content across neural systems, highlighting the gap between ROUGE and factuality.[2]
Surveys and taxonomies compile intrinsic/extrinsic error rates and categorize metrics (for example entailment-based, QA-based, citation-/evidence-based).[1][9]
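The entailment-based metrics mentioned above can be made concrete with a minimal sketch. It is an illustration rather than a published metric: `entails(premise, hypothesis)` is a hypothetical callable standing in for an NLI model's entailment probability, and the sentence-level claim splitter and lexical demo scorer are likewise assumptions made here for brevity.

```python
from typing import Callable, List

def split_into_claims(text: str) -> List[str]:
    """Naive claim extraction: one claim per sentence.
    Published metrics use learned claim extraction or generated QA pairs."""
    return [s.strip() for s in text.split(".") if s.strip()]

def faithfulness_score(source: str, summary: str,
                       entails: Callable[[str, str], float],
                       threshold: float = 0.5) -> float:
    """Fraction of summary claims the source entails, according to the supplied scorer."""
    claims = split_into_claims(summary)
    if not claims:
        return 1.0
    supported = sum(1 for claim in claims if entails(source, claim) >= threshold)
    return supported / len(claims)

if __name__ == "__main__":
    # Toy stand-in for an NLI model: plain word overlap. Replace with a real
    # entailment classifier in any serious use.
    def lexical_entails(premise: str, hypothesis: str) -> float:
        hyp = set(hypothesis.lower().split())
        return len(hyp & set(premise.lower().split())) / max(len(hyp), 1)

    src = "The 2020 report covers air quality in Paris."
    out = "The report covers air quality in Paris. It was written by the UN."
    print(faithfulness_score(src, out, lexical_entails))  # 0.5: second claim unsupported
```

A score of 1.0 indicates that every claim in the output is supported by the source; unsupported (extrinsic) or contradicted (intrinsic) claims lower it.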
Automatic detection methods include:

| Family | Example/Representative work | Evidence required | Notes |
|---|---|---|---|
| Consistency-based | SelfCheckGPT | No (uses multiple generations) | Flags sentences that vary across samples as likely hallucinated.[7] |
| Retrieval-verification | RAG + verifier | Yes (documents) | Cross-checks output against retrieved passages; supports provenance.[5] |
| Semantic-uncertainty | Semantic entropy | No (uses distributional signals) | Estimates uncertainty in meaning space to detect confabulations.[8] |
| NLI/entailment scoring | Claim vs. source | Optional | Scores faithfulness to context; common in summarization evaluation.[2] |
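The consistency-based family in the table can be sketched briefly. This is not the published SelfCheckGPT implementation, which scores sentence-to-sample agreement with BERTScore, question answering, n-gram models, or an NLI/LLM prompt; here `sample` and `agree` are injected callables standing in for a stochastic LLM and a sentence-level agreement scorer, both assumptions of this sketch.

```python
from typing import Callable, List

def consistency_scores(prompt: str,
                       answer_sentences: List[str],
                       sample: Callable[[str], str],
                       agree: Callable[[str, str], float],
                       n_samples: int = 5) -> List[float]:
    """Score each sentence of the main answer by its mean agreement with
    independently re-sampled answers to the same prompt."""
    samples = [sample(prompt) for _ in range(n_samples)]
    return [sum(agree(sent, s) for s in samples) / n_samples
            for sent in answer_sentences]

def flag_low_consistency(prompt: str,
                         answer_sentences: List[str],
                         sample: Callable[[str], str],
                         agree: Callable[[str, str], float],
                         threshold: float = 0.5) -> List[str]:
    """Return sentences whose consistency score falls below the threshold;
    these are treated as likely hallucinations."""
    scores = consistency_scores(prompt, answer_sentences, sample, agree)
    return [sent for sent, score in zip(answer_sentences, scores) if score < threshold]
```

The intuition from [7] is that content the model has actually learned tends to recur across independent samples, while confabulated details vary from sample to sample and therefore score low.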
Mitigation strategies
Multiple, complementary strategies are used in production systems:
Grounding via Retrieval-augmented generation (RAG): Retrieve relevant documents and condition generation on them; improves factuality and enables citation to sources when implemented with evidence tracing (a retrieve-then-generate sketch follows this list).[5]
Instruction tuning and Reinforcement learning from human feedback (RLHF): Aligns models to prefer helpful, harmless, and, importantly, more accurate outputs relative to base models on instruction-following tasks, reducing some classes of hallucinations though not eliminating them.[6]
Constrained generation and safer decoding: for example conservative nucleus sampling, beam search with reranking, and citation-required prompts to bias toward verifiable content.[9]
Detection-and-edit pipelines: Post-generation verifiers (consistency checks, entailment, retrieval-backed fact checkers) to edit or block ungrounded claims (a verify-and-edit sketch follows this list).[7][8]
Task and UI design: Encourage models to indicate uncertainty, request clarification, or provide sources, and route high-stakes queries to information retrieval or tools (calculators, code execution) instead of free-form text generation.[9][5]
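A minimal retrieve-then-generate sketch of the grounding strategy above. The `retrieve` and `generate` callables are hypothetical stand-ins for any retriever (for example BM25 or a dense index) and any language model, and the prompt wording is illustrative rather than taken from a specific system.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Passage:
    doc_id: str
    text: str

def answer_with_citations(question: str,
                          retrieve: Callable[[str, int], List[Passage]],
                          generate: Callable[[str], str],
                          k: int = 4) -> str:
    """Retrieve-then-generate: condition the model on retrieved passages and
    ask it to cite passage ids, so answers can be traced back to evidence."""
    passages = retrieve(question, k)
    context = "\n".join(f"[{p.doc_id}] {p.text}" for p in passages)
    prompt = ("Answer using only the passages below and cite passage ids in brackets. "
              "If the passages do not contain the answer, say so.\n\n"
              f"{context}\n\nQuestion: {question}\nAnswer:")
    return generate(prompt)
```

Keeping the retrieved passages alongside the answer is what enables the evidence tracing and provenance mentioned above.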
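A correspondingly minimal verify-and-edit sketch of the detection-and-edit strategy: sentences not entailed by the available evidence are marked rather than silently asserted. The `entails` callable is again a hypothetical stand-in for an NLI model or retrieval-backed fact checker; a production pipeline would typically rewrite the sentence or retrieve more evidence instead of only tagging it.

```python
from typing import Callable, List

def verify_and_edit(answer_sentences: List[str],
                    evidence: str,
                    entails: Callable[[str, str], float],
                    threshold: float = 0.5) -> str:
    """Keep sentences supported by the evidence and mark the rest as unverified."""
    edited = []
    for sent in answer_sentences:
        if entails(evidence, sent) >= threshold:
            edited.append(sent)
        else:
            edited.append(f"[unverified] {sent}")
    return " ".join(edited)
```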
Limitations and open problems
Surveys emphasize that (1) likelihood-based training does not directly optimize truth, (2) benchmarks incompletely cover real-world claims, (3) detection methods can miss subtle errors or over-flag creative content, and (4) multimodal models have additional failure modes from visual prior bias and imperfect perception.[1][9][4]
Etymology and public reception
The term hallucination had positive/technical uses in early computer vision (for example “face hallucination”) but shifted by the late 2010s to a negative connotation for factually incorrect outputs (for example in neural machine translation and vision under adversarial perturbations).[10][11][9][4] Reflecting widespread concern, Cambridge Dictionary selected “hallucinate” (the AI sense) as its 2023 Word of the Year.[12]
Notable incidents
| Year | System/Domain | Description | Source |
|---|---|---|---|
| 2023 | Google Bard (now Gemini) | In a promotional demo, Bard gave an inaccurate claim about the James Webb Space Telescope; Alphabet shares fell sharply following coverage. | |
| 2024 | Air Canada website chatbot | B.C. Civil Resolution Tribunal held the airline liable for negligent misrepresentation after its website chatbot provided incorrect refund advice; tribunal ordered compensation (C$650.88 plus interest and fees). | |
Multimodal systems
Multimodal and vision–language systems exhibit additional failure modes, including object hallucination and caption–image mismatch. A dedicated survey catalogs causes (language-prior dominance, weak grounding), evaluations, and mitigations in multimodal LLMs.[4]
References
Ji, Z., Lee, N., Frieske, R., Yu, T., Su, D., Xu, Y., Ishii, E., Bang, Y., Madotto, A., & Fung, P. (2023). "Survey of Hallucination in Natural Language Generation." *ACM Computing Surveys*, 55(12), 1-38.
Maynez, J., Narayan, S., Bohnet, B., & McDonald, R. (2020). "On Faithfulness and Factuality in Abstractive Summarization." *Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics*, 1906-1919.
Lin, S., Hilton, J., & Evans, O. (2022). "TruthfulQA: Measuring How Models Mimic Human Falsehoods." *Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics*, 3214-3252.
Li, J., Cheng, X., Zhao, W. X., Nie, J.-Y., & Wen, J.-R. (2023). "HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models." *Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing*.
Min, S., Krishna, K., Lyu, X., Lewis, M., Yih, W., Koh, P. W., Iyyer, M., Zettlemoyer, L., & Hajishirzi, H. (2023). "FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation." *Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing*.
Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., & Zhou, D. (2022). "Self-Consistency Improves Chain of Thought Reasoning in Language Models." *arXiv preprint arXiv:2203.11171*.
Lewis, P., Perez, E., Piktus, A., Petroni, F., Karpukhin, V., Goyal, N., Küttler, H., Lewis, M., Yih, W., Rocktäschel, T., Riedel, S., & Kiela, D. (2020). "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks." *Advances in Neural Information Processing Systems*, 33, 9459-9474.
Magesh, V., Surani, F., Dahl, M., Suzgun, M., Manning, C. D., & Ho, D. E. (2024). "Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools." *Journal of Legal Analysis*, 16, 64-93.
Cossio, M. (2025). "A Comprehensive Taxonomy of Hallucinations in Large Language Models." *arXiv preprint arXiv:2508.01781*.
Huang, L., Yu, W., Ma, W., Zhong, W., Feng, Z., Wang, H., Chen, Q., Peng, W., Feng, X., Qin, B., & Liu, T. (2025). "A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions." *arXiv preprint arXiv:2311.05232*.
Rawte, V., Sheth, A., & Das, A. (2023). "A Survey of Hallucination in Large Foundation Models." *arXiv preprint arXiv:2309.05922*.
Tonmoy, S. M., Zaman, S. M., Jain, V., Rani, A., Rawte, V., Chadha, A., & Das, A. (2024). "A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models." *arXiv preprint arXiv:2401.01313*.