AI in journalism
Last reviewed
May 30, 2026
Sources
No citations yet
Review status
Needs citations
Revision
v2 · 2,900 words
AI in journalism refers to the use of artificial intelligence, and especially machine learning and generative AI, in the gathering, production, distribution and verification of news. The field predates the recent wave of large language model tools by more than a decade: news organisations began publishing computer-written articles from structured data in the early 2010s, using rule-based templates to turn corporate earnings figures, sports results and seismic readings into short reports. The arrival of capable text-generating models from 2020 onward broadened the range of tasks open to automation, from drafting and summarising to translation and headline testing, while also introducing new risks around accuracy, attribution, copyright and public trust.
Surveys of the news industry consistently find two patterns. First, newsrooms have adopted AI fastest for back-end and routine work such as transcription, copy editing, data analysis and the automated coverage of repetitive, structured events, where human editors remain in control of what is published. Second, audiences are markedly less comfortable with news produced wholly or mostly by machines than with news produced by people who use AI as an aid. The Reuters Institute for the Study of Journalism reported in 2025 that, across six countries, only 12 percent of people said they were comfortable with news made entirely by AI, compared with 62 percent for news made entirely by a human journalist.[1]
Before generative models, automated journalism relied on natural language generation systems that mapped structured data onto pre-written sentence templates. These systems did not understand the underlying events; they selected phrasing according to rules and inserted figures, names and outcomes drawn from a database. The approach worked best for domains with clean, frequent, structured data and a predictable narrative shape, such as financial results, sports box scores and weather.
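The template approach described above can be illustrated with a minimal sketch. It is not the code of any named vendor or newsroom: the `EarningsRecord` fields, the `choose_verb` rule and the sentence template are all hypothetical, chosen only to show how a rule selects phrasing and how figures from a record are slotted into pre-written text.

```python
from dataclasses import dataclass


@dataclass
class EarningsRecord:
    """One row of structured input data (all field names are hypothetical)."""
    company: str
    quarter: str
    eps: float           # reported earnings per share, in dollars
    eps_expected: float  # analysts' consensus estimate, in dollars


def choose_verb(record: EarningsRecord) -> str:
    """Pick phrasing by rule: the system only compares two numbers."""
    if record.eps > record.eps_expected:
        return "beat"
    if record.eps < record.eps_expected:
        return "missed"
    return "matched"


def render_report(record: EarningsRecord) -> str:
    """Insert names and figures from the record into a pre-written template."""
    template = (
        "{company} reported {quarter} earnings of ${eps:.2f} per share, "
        "which {verb} analyst expectations of ${expected:.2f}."
    )
    return template.format(
        company=record.company,
        quarter=record.quarter,
        eps=record.eps,
        verb=choose_verb(record),
        expected=record.eps_expected,
    )


if __name__ == "__main__":
    # Illustrative values only, not real company data.
    example = EarningsRecord("Example Corp", "third-quarter", 1.25, 1.10)
    print(render_report(example))
```

Because the output is fully determined by the input record and the rules, a sketch like this can only be wrong if the data or the template is wrong, which is the property that made such systems suitable for clean, structured domains.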
Two deployments from 2014 are widely cited as the start of automation at scale in major newsrooms. In March 2014, the Los Angeles Times published a short report on a local earthquake within minutes of the event, written by an in-house program called Quakebot. Quakebot, created by Times journalist and programmer Ken Schwencke, drew on alerts from the United States Geological Survey's Earthquake Notification Service and automatically drafted a brief story whenever a quake above a chosen magnitude threshold was reported, leaving a human editor to review and publish it.[2]
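The workflow attributed to Quakebot, in which an alert over a threshold triggers a draft that a human then reviews, can be sketched roughly as follows. This is an illustration under stated assumptions, not the Los Angeles Times's code: the `QuakeAlert` fields, the `MAGNITUDE_THRESHOLD` value, the wording of the draft and the choice to treat a quake exactly at the threshold as reportable are all hypothetical, and the sketch does not call any real USGS service.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical cut-off; the real threshold is not stated in the source.
MAGNITUDE_THRESHOLD = 3.0


@dataclass
class QuakeAlert:
    """A simplified alert record; field names are assumptions for illustration."""
    magnitude: float
    place: str
    time_local: str


@dataclass
class Draft:
    headline: str
    body: str
    # Drafts start in a review queue and are never published automatically.
    status: str = "pending_review"


def draft_from_alert(alert: QuakeAlert) -> Optional[Draft]:
    """Return a draft for an editor to review, or None below the threshold."""
    if alert.magnitude < MAGNITUDE_THRESHOLD:
        return None
    headline = f"Magnitude {alert.magnitude:.1f} earthquake reported {alert.place}"
    body = (
        f"A magnitude {alert.magnitude:.1f} earthquake was reported "
        f"{alert.place} at {alert.time_local}, according to preliminary data."
    )
    return Draft(headline=headline, body=body)


if __name__ == "__main__":
    # Illustrative alert, not a real event.
    draft = draft_from_alert(QuakeAlert(4.4, "near Example Town", "6:25 a.m."))
    if draft is not None:
        print(draft.status, "|", draft.headline)
```

The design point the sketch tries to capture is that the program stops at producing a draft; the decision to publish stays with an editor.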
Also in 2014, the Associated Press began using software from the firm Automated Insights, branded Wordsmith, to generate corporate earnings reports from data supplied by partners such as Zacks Investment Research. AP later said the system increased the volume of earnings stories it produced from a few hundred per quarter to roughly 3,700, an increase of about twelvefold, and that automation freed up an estimated 20 percent of the time its journalists had spent on earnings coverage, allowing them to cover more companies and to focus on higher-value reporting.[3][4] Competing vendors of the same era included Narrative Science, whose Quill platform produced narratives for clients in finance and elsewhere, and later European firms such as United Robots that supplied automated local-sports and real-estate text to regional publishers.
These early systems were narrow by design. They could not answer questions, write analysis, or handle unstructured material, and their output was confined to domains where a template could be trusted to be correct as long as the input data were correct. That reliability, which followed from their rule-based nature, stands in contrast to the more capable but error-prone generative systems that came later.
Contemporary newsrooms apply AI across a wide range of editorial and operational tasks. The table below summarises the main areas and how AI is typically used in each.
| Application area | Typical AI role |
|---|---|
| Automated reporting on structured data | Generating short articles on earnings, sports, elections and similar events directly from databases |
| Transcription | Converting interview and press-conference audio to text, often as a first draft for reporters |
| Translation | Producing draft translations of articles and source material for cross-border coverage |
| Summarisation | Condensing long documents, live blogs or article bodies into briefs, bullet points or push notifications |
| Headline and SEO optimisation | Suggesting and A/B testing headlines, and tailoring metadata for search and social distribution |
| Comment moderation | Flagging abusive or off-topic reader comments for review or removal |
| Personalisation and recommendation | Selecting and ordering stories for individual readers and newsletters |
| Investigative data analysis | Searching, classifying and clustering large leaked or public datasets and documents |
| Fact-checking and verification | Detecting check-worthy claims, matching them to prior fact-checks, and helping spot manipulated media |
In practice these uses cluster at the back end of the newsroom. A survey of news executives reported that the tasks most often prioritised were back-end automation such as transcription and copy editing, followed by recommender systems and, less commonly, content creation under human oversight.[5] Reuters Institute research likewise found that, while journalists increasingly used generative AI for research, idea generation, drafting parts of articles and verification, automated end-to-end story generation remained a minority practice and a source of audience scepticism.[6]
Several application areas draw directly on natural language processing techniques. Transcription depends on automatic speech recognition; summarisation, translation and drafting depend on generative language models; and comment moderation and claim detection rely on text classification. Investigative teams have used machine learning to triage very large document sets, a notable early example being the International Consortium of Investigative Journalists' use of machine classification to sift millions of files in leak-based projects.
The table below lists some of the better-documented deployments. Figures are as reported by the organisations or by the press, and are attributed to their sources.
| Organisation | Tool or system | Reported use |
|---|---|---|
| Associated Press | Wordsmith (Automated Insights) | Automated corporate earnings reports from 2014; raised earnings output to about 3,700 stories per quarter[3] |
| Los Angeles Times | Quakebot | Automated short earthquake reports from USGS data, used since 2014[2] |
| Washington Post | Heliograf | In-house system launched in 2016; used for Rio Olympics results, election results and high-school football coverage[7] |
| Bloomberg | Cyborg | Automation that the company has said helps produce a large share of its corporate-results coverage[8] |
| Reuters | Lynx Insight | Data-analysis assistant that flags anomalies and trends in datasets to suggest stories to reporters[9] |
| Press Association (UK) | RADAR | Local-news service combining reporters with automation to produce data-driven regional stories |
The Washington Post introduced Heliograf during the 2016 Rio Olympics to publish short updates such as medal results, and later extended it to election results and high-school football. The Post reported that Heliograf produced on the order of 850 articles in its first year of operation.[7] Bloomberg has described an automation system known as Cyborg that assists with financial reporting; the company has said that a substantial portion of its corporate-results stories are produced with such automation, with journalists adding context.[8] Reuters built Lynx Insight, which the agency has described as a tool to surface anomalies and trends in data for reporters to investigate rather than a system that publishes on its own, reflecting a broadly held principle that automation augments rather than replaces editorial judgment.[9]
The expansion of generative AI into news has been accompanied by a series of documented failures and disputes. Because language models generate fluent text without a built-in model of truth, they can produce confident but false statements, a behaviour commonly described as hallucination. In a news context, where accuracy and trust are central, such errors and the disclosure practices around AI have repeatedly become news in their own right.
In January 2023 the technology site CNET, then owned by Red Ventures, was reported by the outlet Futurism to have quietly published dozens of personal-finance explainer articles produced with an AI tool under the byline "CNET Money Staff," with the use of automation disclosed only in fine print. After errors were identified, including a basic mistake about compound interest, CNET's editor-in-chief said the site had reviewed the articles and issued corrections; CNET stated that corrections were made to 41 of the 77 affected stories.[10][11] The episode prompted wider scrutiny of undisclosed AI content and contributed to CNET's removal from Wikipedia's list of generally reliable sources.
In November 2023, Futurism reported that Sports Illustrated had published articles attributed to apparently fabricated authors whose biographies and headshots matched images sold on an AI-headshot marketplace. The magazine's then-publisher, The Arena Group, said the content had come from a third-party provider, AdVon Commerce, that the bylines were pen names, and that it was removing the articles and ending the relationship; The Arena Group disputed aspects of the report.[12][13] The case became a prominent example of concerns about AI-generated authors and the absence of clear disclosure.
In December 2024 and January 2025, the Apple Intelligence notification-summary feature, which condenses grouped notifications including news alerts on recent iPhones, generated several false summaries of news headlines. Documented errors included a summary that wrongly suggested a murder suspect had taken his own life, a summary stating a darts player had won a championship before the final was played, and a summary that misrepresented a story so as to make a false claim about the tennis player Rafael Nadal.[14] The BBC complained that summaries carrying its branding were inaccurate, and the press-freedom group Reporters Without Borders called for the feature to be withdrawn. In January 2025, Apple said it would temporarily disable notification summaries for news and entertainment apps in its iOS 18.3, iPadOS 18.3 and macOS 15.3 updates while it worked on improvements, and that summaries would be more clearly labelled as AI-generated.[15]
The training of large language models on news archives has produced significant litigation and a parallel wave of licensing. The most closely watched case is New York Times v. OpenAI, filed by The New York Times Company against OpenAI and Microsoft on 27 December 2023, which alleges that the companies copied millions of Times articles to train models and that those models can reproduce Times content and attribute fabricated statements to the paper. In April 2025, the court denied most of the defendants' motions to dismiss, allowing core copyright claims to proceed toward trial.[16] OpenAI has maintained that training on publicly available material is fair use.
Alongside the lawsuits, many publishers have signed content-licensing agreements with AI developers. OpenAI's first such deal, with the Associated Press, was announced in July 2023, and was followed by agreements with Axel Springer (publisher of Politico, Business Insider, Bild and Welt), the Financial Times, Time, Condé Nast and News Corp, among others; several deals reportedly fold publisher content into chatbot answers with attribution and links.[17] These arrangements remain contested within the industry, with critics arguing they trade long-term audience relationships for one-off payments and supporters arguing they provide revenue and attribution that scraping does not.
Beyond specific incidents, three broader concerns recur. The first is the impact on journalism jobs and the risk that cost pressures lead publishers to substitute automated or lightly edited AI content for reporting, a concern sharpened by cases such as Sports Illustrated. The second is the use of generative tools to manufacture disinformation at scale, including text, fabricated images and audio, and deepfake video, which complicates both verification and public confidence in genuine reporting. The third is erosion of trust: Reuters Institute research found that only about a third of people believed journalists "always" or "often" check AI outputs before publishing, and that this perception was strongly correlated with overall trust in news.[1] Together these concerns explain why most newsroom guidelines emphasise human oversight and disclosure.
In response to these risks, many news organisations have published internal guidelines on AI use, particularly from mid-2023 onward. A study comparing policies across dozens of news organisations found broad agreement on principles such as human oversight, transparency and caution with generative output, but considerable variation in specifics, including how, or whether, AI involvement should be disclosed to readers.[18]
The Associated Press, which updated its influential stylebook to address AI, told staff that generative tools should not be used to create publishable content or images for its news report and that any output from a generative AI tool should be treated as unvetted source material.[19] The Guardian said it would use generative AI only where there was clear evidence of a specific benefit, with human oversight and the explicit permission of a senior editor, describing the technology as exciting but unreliable.[20] Wired published a policy stating that it would not publish stories with text generated by AI, while allowing limited uses such as suggesting headlines or social copy and generating story ideas.[21] A common thread across such policies is a commitment to disclose AI's role in published work, although the precise form of disclosure, from bylines and editor's notes to general statements of practice, is rarely standardised.
Fact-checking organisations have adopted AI in a more targeted way. Tools such as those built by the UK charity Full Fact and the academic ClaimBuster project detect check-worthy claims in speeches and text and match them against prior fact-checks, helping human checkers prioritise. Full Fact has said its AI tools are used by dozens of fact-checking organisations across many countries.[22] At the same time, fact-checkers describe AI as both ally and adversary, since the same generative capabilities that assist verification also make convincing falsehoods cheaper to produce.
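The idea of matching a new claim against prior fact-checks can be illustrated with a deliberately simple sketch. It uses Jaccard similarity over word tokens, which is a stand-in chosen for brevity and is not the method used by Full Fact or ClaimBuster; production systems would more plausibly rely on trained language models. The function names, the `min_score` threshold and the example sentences are all hypothetical.

```python
import re
from typing import List, Optional, Set, Tuple


def tokenize(text: str) -> Set[str]:
    """Lower-case the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity: shared tokens divided by all distinct tokens."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_match(
    claim: str, fact_checks: List[str], min_score: float = 0.5
) -> Optional[Tuple[str, float]]:
    """Return the most similar prior fact-check and its score.

    Returns None if the list is empty or nothing reaches min_score.
    """
    claim_tokens = tokenize(claim)
    scored = [(fc, jaccard(claim_tokens, tokenize(fc))) for fc in fact_checks]
    if not scored:
        return None
    top = max(scored, key=lambda pair: pair[1])
    return top if top[1] >= min_score else None


if __name__ == "__main__":
    # Invented example sentences, not real fact-checks.
    archive = [
        "Unemployment fell to 4 percent last year, minister says",
        "Crime rose by 10 percent in the city",
    ]
    print(best_match("Unemployment fell to 4 percent last year", archive))
```

A match found this way is only a suggestion for a human checker to examine; token overlap says nothing about whether the claim is true, and it will miss paraphrases that share few words.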
The trajectory of AI in journalism points toward deeper integration in back-end and assistive roles, paired with continued caution about fully automated publishing. Industry surveys show rising prioritisation of automation for transcription, editing and recommendation, and growing experimentation with generative tools for research and drafting, even as executives and editors stress human responsibility for accuracy.[5][6] On the audience side, the persistent gap between comfort with human-led and machine-led news suggests that visible human authorship and clear disclosure will remain important to maintaining trust.[1]
Several open questions will shape the field. The outcome of copyright litigation such as New York Times v. OpenAI, and the spread of licensing deals, will influence both the economics of news and the legality of training on journalism. The reliability of generative systems, and newsrooms' ability to catch errors before publication, will determine whether incidents like the CNET corrections and the Apple Intelligence summaries remain isolated or recur. And the same tools that help journalists will continue to empower those who produce disinformation, keeping verification and provenance at the centre of the profession's response to AI.
See also: News, ChatGPT Plugins