Perplexity AI copyright lawsuits
Last reviewed
Jun 8, 2026
Sources
17 citations
Review status
Source-backed
Revision
v1 · 2,151 words
The Perplexity AI copyright lawsuits are a cluster of lawsuits, cease-and-desist demands, and public disputes brought against Perplexity AI by news publishers, reference publishers, and online platforms over how its retrieval-augmented generation "answer engine" gathers and reuses their content. Plaintiffs allege that Perplexity scrapes their websites, in some cases bypassing paywalls and ignoring the Robots Exclusion Protocol (robots.txt), and then reproduces or closely paraphrases their material in AI-generated answers, sometimes attaching false or "hallucinated" attributions to the publishers. The disputes form part of the broader wave of AI copyright litigation that also includes New York Times v. OpenAI and Authors Guild v. OpenAI, but they are distinguished by their focus on real-time search and answer generation rather than on the training of large foundation models. [1][2]
The earliest public conflicts began in June 2024, when Forbes and Condé Nast accused Perplexity of plagiarism. The first U.S. lawsuit, filed by News Corp's Dow Jones and New York Post in October 2024, was followed by suits from Japan's Nikkei and Asahi Shimbun (August 2025), Encyclopaedia Britannica and Merriam-Webster (September 2025), Reddit (October 2025), The New York Times and the Chicago Tribune (December 2025), and CNN (May 2026), alongside cease-and-desist letters from the BBC and others. [2][3]
Perplexity AI, founded in 2022 and led by chief executive Aravind Srinivas, markets a conversational "answer engine" that responds to user questions with synthesized summaries and inline citations. Rather than training its own frontier large language model, the company says it operates as an interface that retrieves current web pages and feeds them to underlying models from OpenAI, Anthropic, and Google, supplemented by an internal system the company has described as based on Meta's Llama. [4]
Publishers' core grievance is that this generative AI workflow substitutes for their own websites. When Perplexity answers a query by summarizing an article, the argument goes, users get the substance of the reporting without visiting the publisher's page, depriving the publisher of subscription, advertising, and licensing revenue. Critics further allege that Perplexity's crawlers ignore robots.txt directives and access paywalled material, and that its summaries sometimes misstate facts while attributing the errors to named outlets, damaging the publishers' credibility. [1][3]
The technical dimension of these complaints sharpened in August 2025, when the infrastructure company Cloudflare published research alleging that Perplexity used "stealth" undeclared crawlers, including a generic user agent designed to impersonate Google Chrome on macOS, to retrieve content from sites that had blocked its declared bot. Cloudflare said it created brand-new, unpublished test domains, blocked all crawlers via robots.txt, and still found that Perplexity could return their contents; it subsequently removed Perplexity from its list of verified bots. Perplexity called the analysis "technically flawed" and argued that Cloudflare misunderstood the difference between bulk crawling and "user-driven agents" that fetch a page in real time to answer one person's question. [5]
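The mechanics behind this dispute can be illustrated with a minimal sketch using Python's standard `urllib.robotparser`. It is not Cloudflare's test or Perplexity's code: the two robots.txt files, the `is_allowed` helper, and the example URL are assumptions made for illustration. The sketch shows that robots.txt is a voluntary check performed by the client, and that a rule naming only a declared bot such as "PerplexityBot" says nothing about a request sent under a generic browser user agent.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt files; the real files used in any test are not
# reproduced in the article, so these contents are assumptions.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

BLOCK_DECLARED_BOT = """\
User-agent: PerplexityBot
Disallow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    This is the check a compliant crawler runs before fetching a page.
    Nothing on the server enforces it.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


URL = "https://example.test/article"  # placeholder URL
BROWSER_UA = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)"  # generic browser-style agent

print(is_allowed("PerplexityBot", URL, BLOCK_ALL))           # False
print(is_allowed("PerplexityBot", URL, BLOCK_DECLARED_BOT))  # False
print(is_allowed(BROWSER_UA, URL, BLOCK_DECLARED_BOT))       # True
```

Under the block-all file, every agent that consults robots.txt is refused, which is why retrieval of a page from such a site is treated as evidence that the check was skipped or that the request was not identified as a crawler.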
In June 2024, the business outlet Forbes accused Perplexity of republishing the substance of a proprietary Forbes investigation with little original reporting and inadequate attribution, an episode widely treated as the first major public controversy over the company's content practices. [3]
That same month, Wired and the independent researcher Robb Knight reported evidence that Perplexity disregarded robots.txt by using IP addresses not listed in its published ranges. Wired's parent company, Condé Nast, which also owns The New Yorker and Vogue, sent Perplexity a cease-and-desist letter demanding that it stop reproducing Condé Nast content. These accusations did not become lawsuits at the time, but they established the central factual themes that later complaints would echo: scraping despite no-crawl signals, verbatim or near-verbatim reuse, and unreliable attribution. [3][6]
The disputes escalated into a series of formal legal actions, most filed in the U.S. District Court for the Southern District of New York. The table below summarizes the principal matters; details and citations follow.
| Plaintiff(s) | Date filed | Venue | Principal claims |
|---|---|---|---|
| Dow Jones and NYP Holdings (News Corp) | Oct. 21, 2024 | S.D.N.Y. | Copyright infringement; false attribution |
| Nikkei and The Asahi Shimbun | Aug. 26, 2025 | Tokyo District Court | Copyright infringement (Japan) |
| Encyclopaedia Britannica and Merriam-Webster (Britannica Group) | Sept. 10, 2025 | S.D.N.Y. | Copyright and trademark infringement |
| Reddit | Oct. 22, 2025 | S.D.N.Y. | DMCA anti-circumvention; unjust enrichment; unfair competition |
| The New York Times and Chicago Tribune (separate suits) | Dec. 5, 2025 | S.D.N.Y. | Copyright infringement; false attribution |
| Cable News Network (Warner Bros. Discovery) | May 28, 2026 | S.D.N.Y. | Copyright and trademark infringement |
On October 21, 2024, News Corp units Dow Jones and Company (publisher of The Wall Street Journal) and NYP Holdings (publisher of the New York Post) sued Perplexity in the Southern District of New York. The complaint alleged that Perplexity reproduced their copyrighted content, sometimes verbatim and without crediting or linking to the source, and that it generated "hallucinated" passages falsely attributed to the publishers. The plaintiffs said they had sent Perplexity a letter in July 2024 raising the issues and offering licensing talks, and that Perplexity did not respond. They sought an injunction and statutory damages of up to $150,000 per infringement. Perplexity replied in a blog post that it was "disappointed and surprised," characterizing the suit as reflecting a "shortsighted, unnecessary and self-defeating" adversarial posture between media and technology companies, and it later accused News Corp of trying to "entrap" its chatbot to manufacture a case. [1][7]
On September 10, 2025, the Britannica Group, publisher of Encyclopaedia Britannica and Merriam-Webster, filed a copyright and trademark suit in the Southern District of New York. It alleged that Perplexity used a crawler identified as "PerplexityBot" to scrape its sites and that the answer engine reproduced its human-verified articles, often verbatim, while diverting traffic to AI summaries. The trademark claims contended that Perplexity linked the Britannica and Merriam-Webster brands to inaccurate AI-generated results, harming the publishers' reputation for reliability. [2][8]
On August 26, 2025, the Japanese newspapers Nikkei and The Asahi Shimbun jointly sued Perplexity in the Tokyo District Court. They alleged that since at least June 2024 Perplexity copied and stored article content from their servers, circumvented a technical measure intended to prevent this, and produced inaccurate information attributed to their reporting. Each newspaper sought an injunction and damages of 2.2 billion yen (roughly $15 million). [9]
On October 22, 2025, Reddit sued Perplexity together with three data-scraping intermediaries, Oxylabs, AWMProxy, and SerpApi, in the Southern District of New York. Rather than asserting ordinary copyright claims, Reddit alleged violations of the anti-circumvention provisions of the Digital Millennium Copyright Act (DMCA), together with unjust enrichment and unfair competition. The complaint described a test in which Reddit created a post indexable by Google but not otherwise accessible; according to the suit, Perplexity's answer engine surfaced material portions of that post within hours, which Reddit cited as evidence that the content had been scraped from search results and resold. Reddit sought damages, a permanent injunction, and a ban on the use or sale of previously scraped data. [10][11]
The New York Times escalated a long-running dispute into litigation. It had sent Perplexity cease-and-desist notices in October 2024 and again in 2025 demanding that the company stop accessing its content, and it said the two sides negotiated for more than 18 months without reaching a license. On December 5, 2025, the Times sued Perplexity in the Southern District of New York, alleging "large-scale, unlawful copying and distribution" of its journalism, real-time scraping, verbatim reproduction in outputs, and false attribution. The Chicago Tribune filed a separate but parallel suit in the same court the same day, alleging that Perplexity scraped and reproduced its paywalled articles nearly word for word within its retrieval-augmented system. [12][13]
On May 28, 2026, Cable News Network, owned by Warner Bros. Discovery, sued Perplexity in the Southern District of New York (Cable News Network, Inc. v. Perplexity AI, Inc., No. 1:26-cv-04427). The complaint alleged that Perplexity unlawfully copied more than 17,000 CNN stories, videos, and images, asserting direct, contributory, and vicarious copyright infringement under 17 U.S.C. Sections 106(1) and 106(2), plus trademark infringement, false designation of origin, and trademark dilution. CNN said the parties had negotiated a "Comet Plus" partnership term sheet effective October 1, 2025, but terminated it on November 24, 2025, over disputes about content-usage limits. As specific evidence, CNN alleged that Perplexity's Comet browser assistant reproduced verbatim text from a paywalled CNN article by Stephen Collinson published February 9, 2026, despite the page displaying a paywall, and that Perplexity did not respond to a cease-and-desist letter sent December 10, 2025. A Perplexity spokesperson responded that "you can't copyright facts." [14][15]
Perplexity has consistently denied wrongdoing and advanced a recurring set of defenses. It argues that it does not train foundation models on scraped data but instead indexes web pages and surfaces factual content with citations when a user asks a question, and that facts themselves are not protected by copyright. When the BBC threatened legal action in June 2025 over verbatim reproduction and inaccurate summaries, Perplexity called the broadcaster's claims "manipulative and opportunistic" and accused it of a "fundamental misunderstanding" of technology and intellectual-property law, stressing that it serves as an interface to third-party models rather than a model trainer. [4][16]
Alongside these legal arguments, Perplexity has pursued commercial accommodation with publishers. It operates a publisher partnership program and in August 2025 launched Comet Plus, a $5-per-month subscription tied to its Comet browser that pools subscription revenue and pays out 80 percent to participating publishers (Perplexity retaining 20 percent), with payments allocated across direct traffic, citations in answers, and agent-driven use. Early participants and revenue-share partners have included Time, Fortune, Der Spiegel, Gannett, and The Independent. The company frames such programs as a constructive alternative to litigation, while critics note that the largest plaintiffs declined the terms. [16][17]
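The 80/20 split described above reduces to simple arithmetic, sketched here under stated assumptions. The price and percentage come from the paragraph; the subscriber count and the `split_pool` function are hypothetical. The sketch does not model how the publisher share is divided among direct traffic, citations, and agent-driven use, because those weights are not given.

```python
from decimal import Decimal
from typing import Tuple

SUBSCRIPTION_PRICE = Decimal("5.00")  # monthly Comet Plus price in USD, as reported
PUBLISHER_SHARE = Decimal("0.80")     # portion of the pool paid to publishers


def split_pool(subscribers: int) -> Tuple[Decimal, Decimal]:
    """Return (publisher_payout, perplexity_retained) for one month.

    `subscribers` is a hypothetical input; no actual subscriber
    figures appear in the article.
    """
    pool = SUBSCRIPTION_PRICE * subscribers
    to_publishers = pool * PUBLISHER_SHARE
    return to_publishers, pool - to_publishers


publishers, retained = split_pool(1000)  # assumed subscriber count
print(publishers, retained)
```

With an assumed 1,000 subscribers, the monthly pool would be $5,000, of which $4,000 would go to participating publishers and $1,000 would be retained by Perplexity.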
The Perplexity cases sharpen a question that earlier AI-copyright suits left partly open: whether real-time retrieval and answer generation, as opposed to model training, can itself constitute infringement. Because Perplexity often reproduces or closely paraphrases material in the moment a user reads it, plaintiffs argue the harm to their businesses is more direct and immediate than the diffuse harm alleged from training. The Reddit suit further extends the legal frontier by relying on the DMCA's anti-circumvention rules and by naming scraping intermediaries, signaling that platforms may pursue the data-supply chain rather than only the AI company at its end. [10][11]
Taken together, the disputes test how the doctrines of copyright, trademark, unjust enrichment, and computer-access law apply to AI search, and how the long-standing principle that facts are not copyrightable interacts with the wholesale reuse of the expression that conveys those facts. Their outcomes, alongside parallel cases such as New York Times v. OpenAI, are likely to shape the economics of generative AI search and the terms on which AI companies must license or compensate the publishers whose work they summarize. [2][12]