Authors Guild v. OpenAI
Last reviewed
May 19, 2026
Sources
No citations yet
Review status
Needs citations
Revision
v1 · 4,439 words
Authors Guild et al. v. OpenAI, Inc., et al. is a putative class action copyright lawsuit filed in the United States District Court for the Southern District of New York (SDNY) on September 19, 2023, by The Authors Guild and seventeen prominent fiction writers against OpenAI and several of its affiliated corporate entities.[^1][^2] The case, docketed as No. 1:23-cv-08292, is one of the foundational author-led legal challenges to generative artificial intelligence and alleges that OpenAI engaged in "mass-scale copyright infringement" by copying the plaintiffs' books, including ebooks acquired from pirated "shadow libraries," to train the large language models that power ChatGPT and the GPT series.[^1][^3]
The complaint became part of a much larger wave of litigation against OpenAI by authors, news organizations, and other rightsholders. In April 2025, the U.S. Judicial Panel on Multidistrict Litigation centralized twelve actions before U.S. District Judge Sidney H. Stein in the SDNY as MDL No. 3143, In re: OpenAI, Inc., Copyright Infringement Litigation.[^4][^5] On October 27, 2025, Judge Stein issued a high-profile opinion denying OpenAI's motion to dismiss the consolidated authors' direct copyright infringement claim, with a discussion comparing ChatGPT-generated summaries of George R.R. Martin's A Game of Thrones to the underlying copyrighted novel.[^6][^7]
The case is closely watched as a defining test of whether training generative AI models on unlicensed copyrighted books, and producing outputs derived from those books, constitutes copyright infringement or qualifies as fair use under U.S. law. It is also frequently compared with parallel actions, including New York Times v. OpenAI, Bartz v. Anthropic (which settled in 2025 for approximately $1.5 billion), and the Northern District of California cases initially filed by Paul Tremblay, Sarah Silverman, and Richard Kadrey.[^8][^9][^10]
The Authors Guild is the United States' oldest and largest professional organization for writers. It was founded in 1912 as the Authors League of America, an alliance of book authors, magazine writers, and dramatists, before splitting from the Dramatists Guild in 1921 to focus specifically on book and magazine writers.[^11] Headquartered in New York City, the organization reports a membership of more than 9,000 writers and provides services including legal advice, contract review, advocacy on copyright and free expression, and educational resources.[^11][^12]
The Guild's CEO is Mary Rasenberger, and Maya Shanbhag Lang served as its president during the period in which the OpenAI complaint was filed; Lang is herself one of the seventeen named plaintiffs.[^11][^1] The Authors Guild has previously been at the center of major copyright litigation, including its long-running suits against Google over the Google Books scanning project and against HathiTrust, both of which resulted in fair use rulings adverse to the Guild.[^11] In its public statements on the OpenAI complaint, the Guild characterizes unauthorized use of copyrighted books for AI training as a threat that "will ultimately result in a shrinking of the profession as fewer human authors will be able to sustain a living."[^12]
In addition to the Authors Guild itself, the September 2023 complaint identifies seventeen individual authors as named plaintiffs, all of whom are bestselling or critically acclaimed writers of fiction holding registered U.S. copyrights in their works:[^1][^13]
The plaintiffs represent a putative class of professional fiction authors whose copyrighted works were allegedly used to train OpenAI's GPT models.[^1][^3]
The complaint, and the December 2023 amended complaint, name multiple OpenAI corporate entities reflecting the company's capped-profit structure. The defendants identified in subsequent filings include OpenAI Inc., OpenAI LP, OpenAI LLC, OpenAI GP LLC, OpenAI OpCo LLC, OpenAI Global LLC, OAI Corporation LLC, OpenAI Holdings LLC, OpenAI Startup Fund I LP, OpenAI Startup Fund GP I LLC, and OpenAI Startup Fund Management LLC.[^3] On December 4, 2023, the Authors Guild filed an amended complaint adding Microsoft Corporation as a defendant, reflecting Microsoft's role as OpenAI's principal investor, infrastructure provider, and commercial deployer of GPT models through Bing Chat (later rebranded Copilot) and other products.[^1][^14]
The amended complaint allocates the various counts across the corporate entities. According to legal analyses summarizing the amended complaint, direct infringement is alleged against OpenAI OpCo LLC; vicarious infringement is alleged against OpenAI Inc. and OpenAI GP LLC; and contributory infringement is alleged against OpenAI LLC, OpenAI Global LLC, OAI Corporation LLC, OpenAI Holdings LLC, OpenAI Startup Fund I LP, OpenAI Startup Fund GP I LLC, OpenAI Startup Fund Management LLC, and Microsoft.[^3]
The 47-page complaint, filed September 19, 2023, alleges that OpenAI engaged in "systematic theft on a mass scale" by copying the plaintiffs' books wholesale and using them as training material for its GPT models, specifically GPT-4 and the family of models that power ChatGPT.[^15][^16] The plaintiffs frame the conduct as a deliberate decision by OpenAI to build a multibillion-dollar commercial product using authors' copyrighted works without permission, credit, or compensation.[^1][^15]
The complaint identifies two main vectors by which the plaintiffs' books allegedly entered OpenAI's training corpus:
The complaint seeks to demonstrate "actual copying" by, among other things, prompting ChatGPT to produce detailed plot summaries, character lists, and proposed sequel outlines for plaintiffs' novels. The Guild argues that the specificity of these outputs reveals that the underlying works were ingested verbatim during training, and that the outputs themselves are unauthorized derivative works.[^1][^7]
The original Authors Guild complaint, and the December 2023 amended complaint adding Microsoft, asserted multiple federal copyright claims. Public summaries by counsel and by court filings identify the principal causes of action as:[^3][^13]
The plaintiffs seek statutory damages under the Copyright Act, actual damages, disgorgement of OpenAI's profits, and a permanent injunction barring OpenAI from continued unauthorized use of the plaintiffs' works, including potential destruction of infringing datasets and model weights derived from them.[^15][^16] Rachel Geman, a partner at Lieff Cabraser Heimann & Bernstein and counsel for the Authors Guild plaintiffs, framed the stakes by stating that "without Plaintiffs' and the proposed class' copyrighted works, Defendants would have a vastly different commercial product."[^1]
OpenAI has consistently taken the position that training generative AI models on copyrighted text, including books, qualifies as fair use under 17 U.S.C. § 107. The company argues that model training is a transformative, non-expressive analytical use of the underlying corpus, that the model does not store or "republish" individual works, and that the resulting outputs do not substitute for the originals in the marketplace.[^18][^19]
In its public statements about parallel litigation, including the New York Times v. OpenAI action, OpenAI has emphasized that "training AI models using publicly available internet materials is fair use" and characterized examples of allegedly verbatim output as the result of adversarial prompting rather than ordinary use.[^18]
In motion practice in the consolidated MDL, OpenAI also raised arguments grounded in pleading deficiencies and statutes of limitation. It contended that plaintiffs had not adequately pleaded substantial similarity between specific ChatGPT outputs and the plaintiffs' protected expression and, in companion cases, attacked DMCA section 1202 claims premised on the alleged removal of copyright management information.[^20][^7]
Judge Stein's October 2025 opinion explicitly bracketed the fair use question, observing that "nothing in this opinion is intended to suggest a view on whether the allegedly infringing outputs are protected as fair uses of the original works."[^6] Fair use is therefore expected to be litigated, if at all, at summary judgment or trial, in line with the way it has been treated in Bartz v. Anthropic and Kadrey v. Meta.[^9][^10]
By early 2024, the Authors Guild action was one of more than a dozen copyright actions pending against OpenAI in the SDNY and the Northern District of California, including the Tremblay, Silverman, Kadrey, Chabon, and New York Times cases.[^4][^22]
The parties entered into stipulations governing the early phases of the SDNY litigation. According to docket entries, on January 19, 2024, the defendants stipulated not to bring a Rule 12(b) motion to dismiss the then-pleaded claims and not to invoke the first-to-file rule, allowing the case to proceed past the initial pleading stage without preliminary motion practice. The parties also stipulated to consolidate the Authors Guild action with Alter v. OpenAI (No. 1:23-cv-10211) for pretrial purposes.[^23]
Plaintiffs in the California cases sought to intervene in the SDNY actions to argue for transfer, stay, or dismissal under the first-to-file rule. On April 1, 2024, Judge Stein denied those motions to intervene, keeping the SDNY actions on track in New York.[^23]
In a key passage of the October 27, 2025 opinion denying OpenAI's motion to dismiss, Judge Stein analyzed an example in which ChatGPT generated a detailed summary of George R.R. Martin's A Game of Thrones and characterized the output as one that "a discerning observer could easily conclude . . . is substantially similar to Martin's original work because the summary conveys the overall tone and feel of the original work by parroting the plot, characters, and themes of the original."[^6][^7] The court distinguished its prior treatment of news-article summaries in The New York Times v. Microsoft, where the summaries had been found to track non-copyrightable facts rather than protected expression.[^7]
Following the motion to dismiss ruling, the consolidated MDL moved into a contentious discovery phase. A major dispute concerned plaintiffs' request that OpenAI produce a sample of ChatGPT user logs to identify potentially infringing outputs.
As of mid-May 2026, Authors Guild v. OpenAI, as part of MDL No. 3143, remains an active case in the SDNY before Judge Stein. The consolidated authors' complaint has survived OpenAI's motion to dismiss. The litigation is in a discovery and pre-class-certification phase, with the parties having stipulated that discovery on class issues may proceed in parallel with discovery on the merits, and summary judgment expected to be briefed in advance of any motion for class certification.[^23][^26] No class has been certified, and no merits judgment has been entered. Fair use, the central substantive defense, has been preserved for later stages.[^6]
Authors Guild v. OpenAI is one of several closely tracked copyright actions against generative AI companies that together are shaping U.S. law on training-data infringement. Several of the most important comparators are summarized below.
New York Times v. OpenAI, filed in the SDNY on December 27, 2023, is the most prominent news-publisher action in the consolidated MDL. The Times alleged that OpenAI and Microsoft trained GPT models on millions of its articles and that ChatGPT can be induced to output portions of Times articles verbatim or to summarize copyrighted investigative reporting.[^18][^19] In his April 4, 2025 opinion, Judge Stein allowed direct infringement and contributory infringement claims to proceed against OpenAI and Microsoft, while dismissing certain DMCA and unfair competition claims.[^20] In his October 2025 authors' opinion, Judge Stein expressly distinguished how he had treated news-article summaries in the Times case (as recitations of non-copyrightable facts) from how he treated novel summaries in the authors' case (as substantially similar to protected expression).[^7]
Bartz v. Anthropic is a parallel author class action filed in the Northern District of California against Anthropic, brought by authors Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson and represented in part by Susman Godfrey. In June 2025, Judge William Alsup ruled on summary judgment that Anthropic's use of legitimately acquired books to train its Claude models was a transformative fair use, but separately held that Anthropic's downloading and retention of pirated books from shadow libraries was not fair use.[^9][^10]
In September 2025, Anthropic agreed to settle the Bartz class action for approximately $1.5 billion, widely reported as the largest copyright settlement in U.S. history. The settlement provides for payments of roughly $3,000 per covered work, the destruction of pirated copies of the books and any direct derivatives, and a final fairness hearing scheduled for 2026.[^8][^27] As of May 2026, the Bartz settlement remains pending final court approval.[^8]
The contrast between the Bartz outcome and the still-pending Authors Guild case is one of the principal frames used by commentators discussing the latter: Anthropic chose to settle rather than litigate damages after Judge Alsup distinguished training from piracy, while OpenAI has so far chosen to litigate, with Judge Stein concluding that the plaintiffs have at least pleaded a viable infringement case and reserving the fair use question for later.[^6][^9]
In June and July 2023, before the Authors Guild complaint, novelist Paul Tremblay and comedian and author Sarah Silverman filed companion class actions against OpenAI in the Northern District of California (the Tremblay and Silverman cases), later joined by Michael Chabon and other literary authors in a Chabon action. These complaints raised similar allegations about training on pirated books. In February 2024, Judge Araceli Martínez-Olguín dismissed several of the ancillary claims (including vicarious infringement, DMCA section 1202, negligence, and unjust enrichment counts) while allowing direct infringement to proceed and granting leave to amend.[^22] In 2025, these California author actions were transferred to and consolidated with the SDNY MDL before Judge Stein, joining the Authors Guild and Alter cases.[^4][^5]
A separate California suit against Anthropic brought by music publishers, Concord Music Group v. Anthropic, addresses song lyrics rather than books; it is summarized separately in Concord Music Group v. Anthropic.
Kadrey v. Meta Platforms (Northern District of California) involves similar allegations against Meta's LLaMA models and the use of pirated book datasets including Books3. In June 2025, Judge Vince Chhabria granted summary judgment to Meta on the specific record before him, finding that the plaintiffs had not developed sufficient evidence that Meta's use of their books harmed the market for those works, while declining to bless training on pirated data as fair use in general terms.[^28]
Although Authors Guild v. OpenAI has not yet produced a merits ruling, its procedural milestones already carry significant implications.
Judge Stein's October 2025 ruling concluded that the consolidated authors' complaint adequately alleged both that OpenAI engaged in actual copying of the plaintiffs' works during training and that specific ChatGPT outputs are substantially similar to protected expression in those works.[^6][^7] These conclusions were reached on a motion to dismiss, where the complaint's allegations are taken as true, rather than on an evidentiary record. They are nonetheless notable because they reject OpenAI's argument that LLM outputs are necessarily non-infringing reformulations and confirm that infringing-output theories of liability against LLM providers can survive the pleading stage in the Second Circuit.[^7][^25]
The Authors Guild complaint, like Bartz v. Anthropic and Kadrey v. Meta, foregrounds the use of pirated book repositories such as LibGen, Z-Library, Bibliotik, and Books3 in AI training data.[^17][^9] The combination of Judge Alsup's Bartz summary judgment ruling (which separated training from piracy) and Anthropic's subsequent $1.5 billion settlement has reinforced the centrality of piracy-based theories in author litigation.[^8][^9] OpenAI's decision to keep contesting the same theory in Authors Guild is therefore widely viewed as a test of whether comparable exposure attaches when, as the Authors Guild plaintiffs allege, an AI developer trained on similar datasets.[^15][^25]
The proposed classes in Authors Guild and Alter cover, respectively, fiction and nonfiction authors whose registered, copyrighted books were used in training OpenAI's GPT models. If certified, the classes could number tens of thousands of authors and far more works.[^21][^23] U.S. copyright law's statutory damages, which range from $750 to $30,000 per infringed work and up to $150,000 per work for willful infringement, create, as in Bartz, the theoretical potential for very large aggregate exposure, a fact that has been highlighted by both plaintiffs' counsel and independent commentators.[^8][^25]
The November 2025 and January 2026 rulings ordering OpenAI to turn over 20 million de-identified ChatGPT user logs introduced a novel privacy and discovery dynamic into AI copyright cases. The logs are intended to enable plaintiffs to identify outputs that reproduce or closely paraphrase their copyrighted works.[^26] The order has provoked broader commentary about the discoverability of AI chat histories in litigation and the design of de-identification regimes to protect end-user privacy while still permitting infringement detection.[^26]
Throughout the litigation, the Authors Guild has argued that absent legal accountability, AI training on unlicensed books will erode authors' livelihoods. Guild leadership has urged the development of voluntary or compulsory licensing frameworks for AI training and has held up the Bartz v. Anthropic settlement and the survival of Authors Guild v. OpenAI as evidence that the legal landscape is moving toward such frameworks.[^12][^8]
Authors Guild v. OpenAI is one of the principal vehicles through which U.S. courts are confronting the copyright implications of training large language models on copyrighted works. The case is significant for several intersecting reasons:
The ultimate disposition of Authors Guild v. OpenAI, whether by trial, summary judgment, or settlement, will be a major data point for AI policy and copyright law worldwide.