AI hallucinations in court filings
Last reviewed
Jun 7, 2026
Sources
22 citations
Review status
Source-backed
Revision
v1 · 2,600 words
AI hallucinations in court filings are fabricated legal citations, quotations, case names, and facts that generative AI tools invent and that lawyers or self-represented litigants then submit to a court as if they were real. The problem became a recognized category of attorney misconduct after the 2023 case Mata v. Avianca, in which two New York lawyers used ChatGPT to write a brief that cited six nonexistent decisions and were sanctioned $5,000. Since then the phenomenon has spread to courts on multiple continents. By April 2026 a widely cited public database had logged more than 1,200 such cases worldwide, and judges had escalated penalties from token fines into the tens and even hundreds of thousands of dollars. The conduct has caught not only solo practitioners but large national firms and, in at least one instance, an expert working for an AI company in litigation about that company's own model. The core lesson courts keep repeating is simple and tool-agnostic: a lawyer who signs a filing must personally read and verify every authority in it, however that authority was found.
A large language model generates text by predicting plausible sequences of words, not by retrieving verified facts from a database. When asked for legal authority, a general-purpose chatbot will often produce output that looks exactly like a real citation: a case name, a reporter volume and page, a court, a year, and even quoted holdings. This behavior is a hallucination, the term of art for confident, fluent output that is partly or wholly fabricated. Because the fake citations follow the correct formatting conventions, they pass a casual glance and are difficult to spot without checking each one against a traditional reporter or a legal-research database such as Westlaw or LexisNexis.
The risk is structural rather than a bug in any single product. Studies of leading consumer models have found that they invent or misstate case law at meaningful rates when used for legal research without grounding. Specialized legal-research tools that retrieve real documents reduce, but do not eliminate, the problem. The duty to verify therefore falls on the human filer. Under Rule 11 of the United States Federal Rules of Civil Procedure, an attorney who signs a court paper certifies that its legal contentions are warranted by existing law after a reasonable inquiry. Submitting AI-fabricated authority violates that certification, and most reported sanctions rest on Rule 11, on a court's inherent authority, or on the equivalent rules of state courts and other countries.
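The verification duty described above begins with knowing exactly which authorities a draft cites. The following is a minimal Python sketch, not a tool any court or vendor prescribes, that pulls reporter-style citations out of a draft and turns them into a checklist for a human to work through. The regular expression covers only a small, illustrative subset of reporters, and the names `CITATION_RE`, `extract_citations`, and `verification_checklist` are hypothetical. Extracting a citation does not verify it; each item still has to be located and read in a trusted source such as Westlaw or LexisNexis.

```python
import re

# Simplified pattern for citations such as "550 U.S. 544" or "12 F. Supp. 3d 345".
# ASSUMPTION: the reporter list is an illustrative subset, not a complete one.
CITATION_RE = re.compile(
    r"\b(\d{1,4})\s+"                                                   # volume
    r"(U\.S\.|S\. ?Ct\.|F\. ?Supp\. ?(?:2d|3d)?|F\.(?:2d|3d|4th)?)\s+"  # reporter
    r"(\d{1,5})\b"                                                      # first page
)


def extract_citations(text):
    """Return unique (volume, reporter, page) tuples in order of first appearance."""
    seen = []
    for match in CITATION_RE.finditer(text):
        cite = match.groups()
        if cite not in seen:
            seen.append(cite)
    return seen


def verification_checklist(text):
    """Build one unchecked item per citation for a human to confirm by hand."""
    return [
        f"[ ] {vol} {rep} {page}: located and read in a trusted source"
        for vol, rep, page in extract_citations(text)
    ]


if __name__ == "__main__":
    draft = (
        "See Bell Atlantic Corp. v. Twombly, 550 U.S. 544 (2007); "
        "Ashcroft v. Iqbal, 556 U.S. 662 (2009); see also 550 U.S. 544."
    )
    for item in verification_checklist(draft):
        print(item)
```

The checklist deliberately has no "verified" state that software can set. A fabricated citation is formatted exactly like a real one, so pattern matching can only enumerate what needs checking, and only a person reading the actual decision can confirm it.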
The seminal incident arose from an ordinary personal-injury suit. Roberto Mata sued the airline Avianca over a knee injury he said a serving cart caused on an international flight. When Avianca moved to dismiss, Mata's lawyers at the New York firm Levidow, Levidow & Oberman filed an opposition brief citing six decisions as precedent. None of the six existed. Attorney Steven Schwartz had used ChatGPT to do the research and had even asked the chatbot whether the cases were real; it falsely assured him they were and produced fabricated excerpts of the opinions.
Avianca's counsel and the court were unable to locate the cited cases. Rather than withdraw the citations, the lawyers initially defended them, and Schwartz's colleague Peter LoDuca submitted an affidavit attaching purported copies of the decisions, which were themselves invented. United States District Judge P. Kevin Castel of the Southern District of New York found that the attorneys had acted in bad faith. In an order issued June 22, 2023, he sanctioned Schwartz, LoDuca, and their firm $5,000 jointly and severally under Rule 11, and directed them to notify each real judge to whom ChatGPT had falsely attributed a fabricated opinion. The case drew enormous press attention and became shorthand for the dangers of using a chatbot as a substitute for legal research.
A second high-profile episode that year reinforced the lesson with a different outcome. Michael Cohen, the former lawyer for Donald Trump, gave his attorney case citations he had generated with Google's Bard, believing it to be a kind of supercharged search engine. The citations were fake, and the problem came to light in December 2023. In a March 2024 ruling, United States District Judge Jesse Furman declined to sanction Cohen himself, accepting that he had not understood that Bard was a generative tool, while making clear that the lawyer who filed the citations bore responsibility for verifying them.
After Mata v. Avianca, similar incidents multiplied and then accelerated sharply through 2025 and into 2026 as cheap, capable chatbots reached a mass audience. Reported cases came from federal trial and appellate courts, state courts, and tribunals abroad, and they involved plaintiffs, defendants, expert witnesses, and occasionally judicial staff. Two patterns stood out: the spread to large, well-resourced firms, and the steady climb in penalty size as judges concluded that small fines were not deterring the conduct.
In February 2025, in Wadsworth v. Walmart, United States District Judge Kelly Rankin of the District of Wyoming sanctioned three lawyers, including attorneys from the large plaintiffs' firm Morgan & Morgan, after a motion cited nine cases, eight of them fabricated by the firm's internal AI tool. The lead attorney's pro hac vice admission was revoked and he was fined $3,000; two others were fined $1,000 each. In May 2025, in Lacey v. State Farm, Special Master Michael Wilner, a retired magistrate judge, held the firms Ellis George and K&L Gates jointly liable for $31,100 after a brief built from an AI-generated outline contained fabricated authority that no one at either firm had checked. In July 2025, in the defamation case Coomer v. Lindell, United States District Judge Nina Wang sanctioned two lawyers for MyPillow founder Mike Lindell a total of $6,000 ($3,000 each) over a brief with nearly thirty defective or nonexistent citations.
The escalation continued at the appellate level. In Whiting v. City of Athens, decided in early 2026, a panel of the United States Court of Appeals for the Sixth Circuit imposed $30,000 in fines, $15,000 per attorney paid to the court, on two Tennessee lawyers whose briefs across consolidated appeals contained more than two dozen fake or misrepresented citations; the court also ordered them to reimburse the opposing side's appellate fees and pay double costs under 28 U.S.C. section 1927. Reporting described it as the steepest federal appellate sanction tied to fabricated citations on record. At the state level, a California court fined attorney Amir Mostafavi $10,000 in 2025 over an appeal with fabricated AI citations, among the largest penalties a California court had imposed for the conduct.
The current record came from Oregon. In an opinion dated December 2025, United States Magistrate Judge Mark D. Clarke of the District of Oregon ordered roughly $110,000 in combined fines and attorney fees against two lawyers, San Diego attorney Stephen Brigandi and Portland attorney Tim Murphy, in a family dispute over control of a winery. The briefs contained about fifteen citations to nonexistent cases and eight fabricated quotations. Judge Clarke wrote that the case was "a notorious outlier in both degree and volume" in the expanding universe of AI-misuse sanctions, dismissed the underlying claims with prejudice, and assigned the bulk of the penalty as attorney fees, with Brigandi bearing roughly $80,000 in fees plus about $15,000 in fines. As of mid-2026 it stood as the largest reported monetary penalty for AI hallucinations in a court filing.
Even AI companies have been touched. In the copyright suit Concord Music Group and others v. Anthropic, litigated in the Northern District of California in 2025, a declaration supporting Anthropic's defense relied on an expert report by an Anthropic data scientist that cited an article in a statistics journal which did not exist as described. An associate at Anthropic's law firm, Latham & Watkins, explained that she had used Anthropic's own model, Claude, to format the citation and that the model had supplied a false author and title while keeping the real link, volume, and year, an error her manual check failed to catch. United States Magistrate Judge Susan van Keulen called it "a very serious and grave issue," noting "a world of a difference between a missed citation and hallucination generated by AI," and the affected portion of the report was struck. The episode underscored that the problem is about process and verification, not the sophistication of any particular user.
| Case | Court | Year | Sanction |
|---|---|---|---|
| Mata v. Avianca | S.D.N.Y. (federal trial) | 2023 | $5,000 jointly; notice to misattributed judges |
| Cohen (Bard citations) | S.D.N.Y. (federal trial) | 2023 | No sanction on Cohen; conduct condemned |
| Wadsworth v. Walmart | D. Wyo. (federal trial) | 2025 | $5,000 total; lead counsel's pro hac vice revoked |
| Lacey v. State Farm | C.D. Cal. (special master) | 2025 | $31,100 against two firms, jointly |
| Coomer v. Lindell | D. Colo. (federal trial) | 2025 | $6,000 ($3,000 per lawyer) |
| Mostafavi (state appeal) | California state court | 2025 | $10,000 |
| Whiting v. City of Athens | 6th Cir. (federal appellate) | 2026 | $30,000 plus fees and double costs |
| Brigandi / Murphy (winery dispute) | D. Or. (federal trial) | 2025 to 2026 | About $110,000 in fines and fees (record) |
The growth of the phenomenon is documented most prominently by Damien Charlotin, a legal data analyst and research fellow associated with the Smart Law Hub at HEC Paris, who maintains a public, regularly updated AI Hallucination Cases Database. It catalogs court decisions worldwide in which a litigant or lawyer submitted AI-generated hallucinated content, and lets users filter by country, party, the AI tool involved, and the outcome. Because the count rises continuously, it is best stated as a dated snapshot. The database recorded about 87 cases in mid-May 2025 and roughly 486 by late October 2025. By early April 2026 it tracked about 1,227 cases globally, with the United States accounting for the large majority (reported around 811), at a pace of roughly five to six new cases per day. Other compilations, including the AI Incident Database, track individual episodes as well, and law firms and legal-technology vendors publish their own running tallies. The figures should be read as lower bounds, since they capture only incidents that produced a written, public decision; many filings are corrected or withdrawn before any opinion issues.
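The snapshot counts above imply an accelerating rate, which a short calculation makes concrete. The sketch below is illustrative only. The case counts come from the text, but the exact calendar days are assumptions, because the text gives only "mid-May", "late October", and "early April". The function name `average_daily_rates` is hypothetical.

```python
from datetime import date

# Counts are taken from the text. The specific days are ASSUMED stand-ins
# for "mid-May 2025", "late October 2025", and "early April 2026".
snapshots = [
    (date(2025, 5, 15), 87),
    (date(2025, 10, 28), 486),
    (date(2026, 4, 5), 1227),
]


def average_daily_rates(points):
    """Average new cases per day between each pair of consecutive snapshots."""
    rates = []
    for (d0, n0), (d1, n1) in zip(points, points[1:]):
        days = (d1 - d0).days
        rates.append((d0, d1, (n1 - n0) / days))
    return rates


for start, end, rate in average_daily_rates(snapshots):
    print(f"{start} to {end}: {rate:.1f} new cases per day on average")
```

Under these assumed dates the average works out to roughly 2.4 new cases per day across the first interval and roughly 4.7 across the second. Those are averages over whole intervals, so they are consistent with a higher pace, such as the reported five to six per day, at the most recent end of the series. Shifting the assumed dates by a week changes the figures only slightly.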
Judges and rulemakers have responded along several tracks. The earliest reaction was the standing order. On May 30, 2023, United States District Judge Brantley Starr of the Northern District of Texas issued what is often called the first such order, requiring every attorney before him to certify either that no part of a filing was drafted by generative AI or that any AI-drafted language was checked for accuracy by a human using traditional sources. Many individual judges across the country adopted comparable disclosure or certification requirements, though practice is far from uniform. Some courts and commentators have argued that existing Rule 11 duties already cover the conduct and that special AI rules are unnecessary.
Formal rulemaking followed. Some districts adopted local rules emphasizing that litigants remain fully responsible for the accuracy of AI-assisted filings, such as a provision in the Eastern District of Texas tying generative-AI use to Rule 11. At the national level, the Advisory Committee on Evidence Rules advanced a proposed Federal Rule of Evidence 707, approved for public comment in 2025, which would subject machine-generated evidence offered without a sponsoring expert to the same reliability standards that govern expert testimony under Rule 702. Rule 707 addresses AI evidence rather than fabricated citations directly, but it is part of the same effort by the courts to adapt to generative tools. Across these responses runs a consistent theme that several courts, including the Sixth Circuit, stated explicitly: the standard is tool-neutral, and the obligation to read and verify every cited authority is nondelegable regardless of whether a citation came from a chatbot, a junior associate, or a paralegal.
The rise of AI hallucinations in court filings is one of the first places where the limits of generative AI have produced concrete, repeated, and well-documented professional consequences. It illustrates a gap between how capable these systems appear and how reliable they actually are for high-stakes factual work, and it has become a standard example in debates over professional responsibility, legal ethics, and broader AI regulation. For the legal profession specifically, the episode has accelerated the adoption of verification protocols, AI-use policies at firms, and continuing-education requirements, while leaving unresolved how much disclosure courts should demand. More broadly, the pattern is a cautionary case study cited well beyond law: a powerful tool, used without verification by people who trusted its fluent output, generated confident falsehoods that slipped into the official record until an adversary or a judge checked the work.
Related: hallucination, large language model, generative AI, AI regulation, legal.