# AI Summary Generators

> Source: https://aiwiki.ai/wiki/ai_summary_generators
> Updated: 2026-07-16
> Categories: AI Tools & Products, Natural Language Processing
> License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
> From AI Wiki (https://aiwiki.ai), the free encyclopedia of artificial intelligence. Reuse freely with attribution to "AI Wiki (aiwiki.ai)".

*See also: [AI Summary Generators ChatGPT Plugins](/wiki/ai_summary_generators_chatgpt_plugins)*

**AI summary generators** are software tools that use [natural language processing](/wiki/natural_language_processing) and, increasingly, [large language models](/wiki/large_language_model) to condense documents, articles, meetings, videos, emails, and other source material into shorter forms. They span a wide product surface: standalone web apps such as QuillBot and Scholarcy, browser extensions that summarize the current page, dedicated meeting note-takers such as [Otter.ai](/wiki/otter_ai), [Fireflies.ai](/wiki/fireflies_ai), and [Granola](/wiki/granola_ai), academic research assistants, video and podcast summarizers, and system-level features such as Apple's notification summaries in [Apple Intelligence](/wiki/apple_intelligence). At their core, all of these tools draw on a long tradition of automatic [text summarization](/wiki/text_summarization) research that runs from Hans Peter Luhn's 1958 paper through today's [generative AI](/wiki/generative_ai) systems.[1]

The term has become a popular consumer label since 2023, when [ChatGPT](/wiki/chatgpt), [Claude](/wiki/claude), and [Gemini](/wiki/gemini) made high-quality abstractive summarization a generic capability of any chat assistant. A second wave of dedicated products followed: meeting recorders that join Zoom and Microsoft Teams calls, notebook-style workspaces such as [NotebookLM](/wiki/google_notebooklm), and notification-level summaries baked into operating systems. Quality varies widely. The same models that produce smooth, readable summaries can also hallucinate facts, conflate sources, or misrepresent headlines, as the [BBC](https://www.bbc.co.uk/news/articles/cge93de21n0o) found when Apple's notification summaries began publishing inaccurate headlines under the BBC's name in late 2024.[22]

## Overview

A modern summary generator turns an input of length N into a much shorter output that aims to preserve the most important information. The input can be anything: a news article, a research paper, a meeting transcript, a YouTube video, an email thread, a Slack channel, or an entire book. The output usually takes one of three forms: a single paragraph, a bulleted list of key points, or a structured digest with headings and action items.

Under the hood, today's tools mostly use one of two architectures. Older standalone products such as Scholarcy, TLDR This, and some of the original browser extensions still rely on extractive techniques that pick high-salience sentences, optionally rephrased with a smaller neural model. Newer products and the summarization features in general chat assistants use abstractive [large language models](/wiki/large_language_model) with long [context windows](/wiki/context_window), often combined with [retrieval-augmented generation](/wiki/retrieval_augmented_generation) for very long inputs.

A few features distinguish summary generators from each other in practice:

- The maximum input size, governed either by the model's context window (currently 200,000 to 2 million tokens for frontier models) or by an application-level chunking strategy.
- Whether the system runs in the cloud, on the user's device, or in a hybrid arrangement (as with Apple Intelligence's Private Cloud Compute).
- The level of grounding to the source, which ranges from strict citation-only output (NotebookLM) to free-form generation that may invent details (most chat assistants without guardrails).
- Whether summaries are produced on demand or generated continuously, as with always-on meeting bots and lock-screen notification summaries.
- The output style, which varies from terse one-line digests to structured documents with sections, tables, and follow-up questions.
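
The application-level chunking strategy mentioned in the list above can be sketched as a simple map-reduce pass: split the input, summarize each piece, then summarize the partial summaries. This is a minimal illustration, not any product's actual pipeline. The `summarize` callable is a hypothetical stand-in for whatever model call an application uses, and word counts are used as a rough proxy for tokens.

```python
from typing import Callable, List


def chunk_words(text: str, max_words: int, overlap: int = 0) -> List[str]:
    """Split text into word-based chunks of at most max_words, with optional overlap."""
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def map_reduce_summary(
    text: str,
    summarize: Callable[[str], str],  # hypothetical model call supplied by the caller
    max_words: int = 500,
) -> str:
    """Summarize each chunk (map), then summarize the joined partial summaries (reduce)."""
    chunks = chunk_words(text, max_words)
    if len(chunks) <= 1:
        return summarize(text)
    partials = [summarize(chunk) for chunk in chunks]
    return summarize("\n".join(partials))
```

A real system would likely count tokens with the model's own tokenizer and repeat the reduce step if the joined partial summaries still exceed the window.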

## History

### Extractive era (1958 to 2014)

Automatic summarization is one of the oldest tasks in [natural language processing](/wiki/natural_language_processing). The field's founding paper is Hans Peter Luhn's 1958 IBM Journal article "The Automatic Creation of Literature Abstracts," which proposed scoring sentences by the frequency of significant words and selecting the highest-scoring sentences for an abstract.[1] Luhn's heuristic, which discarded very common and very rare words and ranked the remainder, is recognizable as a precursor to modern term-frequency weighting.
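
The frequency heuristic described above can be illustrated with a short sketch. It is a simplification in the spirit of Luhn's approach rather than a reproduction of the paper's exact scoring: the stopword list, the `min_count` threshold, and the scoring formula (squared count of significant words divided by sentence length) are all assumptions made for illustration.

```python
import re
from collections import Counter
from typing import List

# Tiny illustrative stopword list (an assumption; real systems use larger lists).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "it",
             "that", "on", "for", "was", "by"}


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after ., ! or ? when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]


def luhn_style_summary(text: str, k: int = 2, min_count: int = 2) -> List[str]:
    """Return the k highest-scoring sentences, in their original order."""
    sentences = split_sentences(text)
    tokenized = [re.findall(r"[a-z']+", s.lower()) for s in sentences]

    # Drop very common words (stopwords) and very rare words (below min_count).
    counts = Counter(w for toks in tokenized for w in toks if w not in STOPWORDS)
    significant = {w for w, c in counts.items() if c >= min_count}

    def score(tokens: List[str]) -> float:
        if not tokens:
            return 0.0
        hits = sum(1 for w in tokens if w in significant)
        return hits * hits / len(tokens)

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(tokenized[i]), reverse=True)[:k]
    return [sentences[i] for i in sorted(ranked)]
```

On a toy input such as `"Cats chase mice. Cats sleep a lot. The weather was mild. Mice fear cats."`, the words "cats" and "mice" are the only ones that pass both filters, so the first and last sentences are selected.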

Harold P. Edmundson's 1969 ACM paper "New Methods in Automatic Extracting" extended Luhn's approach with cue words (positive and negative markers such as "significant" or "hardly"), title words, and sentence position, combined into a linear scoring function.[2] His basic framework (score every sentence, then pick the top k) dominated the field for decades.

Later extractive systems added graph-based ranking. TextRank (Mihalcea and Tarau, 2004) and LexRank (Erkan and Radev, 2004) treated each sentence as a node in a similarity graph and applied PageRank-style centrality, powering many open-source summarizers well into the 2010s.[3][4] The DUC and TAC shared tasks run by NIST between 2001 and 2014 turned summarization into a competitive research field, with [ROUGE](/wiki/rouge_score) becoming the dominant automatic metric after Chin-Yew Lin's 2004 paper.[5]
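
The graph-based idea can be sketched in a few lines: build a sentence similarity graph, then run a PageRank-style power iteration over it. This is an illustrative approximation only. It uses Jaccard word overlap as the edge weight, which is an assumption and not the similarity function of either published system, and the function names are hypothetical.

```python
import re
from typing import List


def jaccard(a: set, b: set) -> float:
    """Word-overlap similarity between two bags of words."""
    return len(a & b) / len(a | b) if a and b else 0.0


def rank_sentences(sentences: List[str], damping: float = 0.85,
                   iterations: int = 50) -> List[float]:
    """Score sentences by weighted PageRank-style centrality."""
    bags = [set(re.findall(r"[a-z']+", s.lower())) for s in sentences]
    n = len(bags)
    if n == 0:
        return []
    # Symmetric edge weights; no self-loops.
    weights = [[jaccard(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
               for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            # Each neighbor j passes on a share of its score proportional to the edge weight.
            incoming = sum(weights[j][i] / out_sum[j] * scores[j]
                           for j in range(n) if out_sum[j] > 0)
            updated.append((1 - damping) / n + damping * incoming)
        scores = updated
    return scores


def centrality_summary(sentences: List[str], k: int = 2) -> List[str]:
    """Return the k most central sentences, in their original order."""
    scores = rank_sentences(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

A sentence that shares no words with the rest of the document receives only the baseline `(1 - damping) / n` score, which is why off-topic sentences fall to the bottom of the ranking.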

### Neural era (2014 to 2019)

Sequence-to-sequence neural networks shifted summarization toward abstractive methods. Rush, Chopra, and Weston's 2015 paper showed that an encoder-decoder with attention could generate headlines from news articles.[6] Pointer-generator networks (See, Liu, and Manning, 2017) added a copy mechanism that improved grounding on the CNN/DailyMail dataset.[7] Pretrained encoder-decoder [transformers](/wiki/transformer), particularly [BART](/wiki/bart) (Lewis et al., 2019) and [T5](/wiki/t5) (Raffel et al., 2020), raised state-of-the-art ROUGE scores on CNN/DailyMail, XSum, and other benchmarks.[8][9] PEGASUS (Zhang et al., 2020) introduced a gap-sentence pretraining objective designed specifically for summarization.[10]

### Generative era (2020 to present)

The release of GPT-3 in 2020 and [ChatGPT](/wiki/chatgpt) in November 2022 turned high-quality abstractive summarization into a default capability of any general-purpose chat assistant. By 2023, most consumer summarization products were either thin wrappers around [OpenAI](/wiki/openai), [Anthropic](/wiki/anthropic), or [Google](/wiki/google) APIs, or used those APIs as one engine among several.

Long-context models accelerated the shift. Anthropic released Claude 2 with a 100,000-token window in July 2023,[18] expanded [Claude](/wiki/claude) to 200,000 tokens in November 2023,[19] and previewed a 1 million-token version of Sonnet for enterprise in 2025. [Gemini](/wiki/gemini) 1.5 Pro introduced a 1 million-token window in February 2024[20] and a 2 million-token preview later that year. OpenAI's GPT-4 Turbo reached 128,000 tokens in late 2023. These windows reshaped academic, legal, and enterprise summarization by removing the chunking step that had previously dominated the engineering work.

## General-purpose AI summarizers

The most common way to generate an AI summary from 2024 to 2026 has been to paste text into a general chat assistant. The leading consumer assistants all treat summarization as a first-class use case.

### ChatGPT

[ChatGPT](/wiki/chatgpt) from [OpenAI](/wiki/openai) was launched on November 30, 2022. Summarization is one of the most common tasks users perform with it. The product supports file uploads (PDFs, Word documents, spreadsheets) and a Browse feature that fetches and condenses URLs. With GPT-4 Turbo and GPT-4o, context windows reached 128,000 tokens. ChatGPT is also the backend for many third-party summary tools that wrap its API.

### Claude

Anthropic's [Claude](/wiki/claude) is known in summarization workflows for its long context and conservative behavior on factual grounding. Claude 2 launched with a 100,000-token window in July 2023,[18] Claude 2.1 expanded to 200,000 tokens in November 2023,[19] and a 1 million-token version of Sonnet entered enterprise preview in 2025. Claude is commonly used to summarize legal contracts, financial filings, and long-form research that must be processed in one pass.

### Gemini

[Gemini](/wiki/gemini) is Google's general assistant. Gemini Advanced launched in February 2024, and Gemini 1.5 introduced 1 million- and 2 million-token context windows useful for very long documents, codebases, and video.[20] Gemini powers summarization across Google Workspace (the Gemini side panel in Gmail, Docs, and Meet) and drives Google Search AI Overviews.

### Microsoft Copilot

[Microsoft Copilot](/wiki/microsoft_copilot) wraps OpenAI's models and Microsoft's own Phi and Prometheus models. The consumer Copilot app and Copilot in Microsoft 365 summarize Word documents, PowerPoint decks, Outlook threads, and Teams recordings. Copilot Pages, launched in late 2024, is a collaborative canvas for editing and sharing summary output.

### Perplexity

[Perplexity AI](/wiki/perplexity_ai), launched in December 2022, sits between a chat assistant and a search engine. Each answer is grounded in a small set of retrieved sources and presented as a summary with inline citations. It is commonly used to summarize current news where source grounding matters more than raw fluency. Perplexity offers Sonar, Sonar Pro, and partner models from OpenAI and Anthropic as backend choices.

## Standalone summarization apps and extensions

A second category of summary generators consists of standalone products that focus narrowly on text summarization, often through a web app and a browser extension.

| Product | First released | Headquarters | Notes |
|---|---|---|---|
| [QuillBot Summarizer](/wiki/quillbot) | 2017 (parent QuillBot); summarizer added 2020 | Chicago, Illinois | Part of QuillBot's broader writing suite, acquired by Course Hero (Learneo) in August 2021. Free tier supports up to a few thousand words; paid tier extends limits. |
| TLDR This | 2020 | Sydney, Australia | Generates short and detailed summaries from URLs or pasted text. Used widely by students and news readers. Offers a browser extension. |
| Resoomer | 2014 (web), AI relaunch 2023 | Paris, France | Long-running French summarization tool. Modes include automatic, manual highlight, and chat-style summarization. Browser extensions for Chrome and Firefox. |
| Smodin Summarizer | 2018 | United States | Web-based summarizer with adjustable length and tone. Part of Smodin's broader writer suite. |
| SummarizeBot | 2017 | Riga, Latvia | API-first summarizer that supports text, audio, video, and images. Integrations include Slack and Telegram bots. |
| Scholarcy | 2018 | London, United Kingdom | Academic summarizer that converts research papers into structured "flashcards" with figures, tables, references, and synthesis. Browser extension and library service. |
| Jasper Summarize | 2021 (Jasper AI), template added 2022 | Austin, Texas | A summarization template inside the Jasper AI marketing writing platform, geared toward repurposing long content into blog excerpts and social posts. |
| INK Summarize | 2018 (INK), AI rewrite added 2022 | Wilmington, Delaware | Summarization tool in INK's SEO writing platform, focused on shortening competitor research and source material. |
| Glasp | 2021 | Mountain View, California | Highlight and summarize browser extension. Generates ChatGPT-style summaries of articles and YouTube videos, with social highlight sharing. |
| MaxAI.me | 2023 | San Francisco, California | Browser sidebar that summarizes the current page, email, or YouTube video and offers other LLM actions. Supports multiple backends. |
| Wiseone | 2022 | Paris, France | Reading assistant browser extension that surfaces context and summaries while reading long-form articles. |

Many of these products route requests to OpenAI, Anthropic, or Google APIs. Their differentiation is in user experience: keyboard shortcuts, integration into the reading flow, support for non-text content, and the ability to organize past summaries into a personal library.

## NotebookLM and document workspace summarizers

[NotebookLM](/wiki/google_notebooklm) is Google's source-grounded research and summarization workspace. It was first shown at Google I/O in May 2023 under the codename Project Tailwind and launched as NotebookLM in July 2023; it is powered by Google's [Gemini](/wiki/gemini) models.[24] Users upload up to 50 sources per notebook (PDFs, Google Docs, websites, YouTube transcripts, pasted text) and receive answers, briefings, and digests that are grounded in those sources with inline citations.

NotebookLM gained mainstream attention in September 2024 with Audio Overviews, which generate a roughly nine-minute podcast-style conversation between two AI hosts about the uploaded sources.[25] The feature went viral on social media. Google added interactive audio hosts in late 2024, mind maps and study guides in early 2025, and Video Overviews later that year. By late 2025, Google reported NotebookLM had reached around 17 million monthly active users.

Document workspace summarizers in the same general category include **Mem.ai** and **Reflect**, which use LLMs to summarize a user's own notes; **[Notion AI](/wiki/notion_ai)**, launched in November 2022, which adds summarize, rewrite, and Q&A actions inside Notion pages; **Coda AI**, which provides summarization blocks inside Coda docs; and **Microsoft Loop**, which integrates Copilot summaries into shared workspaces. These products differ from generic chat assistants in that summaries are anchored to a defined set of user-controlled documents rather than the open web, which reduces the surface area for hallucination.

## Meeting and conversation summarizers

Meeting summarization is the largest commercial subcategory of AI summary generators by revenue and one of the fastest-growing. A meeting summarizer typically joins a video call or captures the local microphone, transcribes the conversation using [automatic speech recognition](/wiki/speech_recognition), and produces a structured summary that includes key points, decisions, and action items.

| Product | Founded | Headquarters | Approach |
|---|---|---|---|
| [Otter.ai](/wiki/otter_ai) | 2016 (as AISense) | Mountain View, California | Real-time transcription with AI summaries and OtterPilot, an agent that joins Zoom, Google Meet, and Microsoft Teams calls. Reported over 25 million users and more than $100 million in ARR by March 2025.[26] |
| [Fireflies.ai](/wiki/fireflies_ai) | 2016 (incorporated 2017) | San Francisco, California | Bot-based meeting assistant with AskFred conversational search across meeting history. Crossed a $1 billion valuation in June 2025 with a reported 20 million users.[27] |
| Read.ai | 2021 | Seattle, Washington | Meeting copilot that tracks engagement metrics in addition to summaries. Expanded into email, chat, and an always-on personal AI in 2024 and 2025. |
| [Granola](/wiki/granola_ai) | March 2023 | London, United Kingdom | Captures audio locally so no bot joins the meeting. The user types short notes during the call; Granola enriches them with the full transcript afterward. Raised $67 million in total by early 2026 at a $250 million valuation.[28] |
| Fathom | 2020 | San Francisco, California | Free meeting recorder for Zoom and Google Meet, with a paid plan for teams. Generates structured summaries, action items, and CRM integrations. |
| [Krisp Notes](/wiki/krisp_ai) | 2017 (Krisp); Notes added 2023 | Berkeley, California | Adds meeting transcription and summarization to Krisp's existing noise-cancellation platform; bot-less capture from any conferencing tool. |
| Tactiq | 2020 | Sydney, Australia | Chrome extension that captures Google Meet, Zoom, and Microsoft Teams transcripts and produces AI summaries and action items. |
| Sembly AI | 2019 | New York, New York | Bot-based meeting assistant with multi-language support and team analytics. |
| Avoma | 2017 | Palo Alto, California | Meeting assistant focused on revenue teams, with conversation intelligence and CRM updates. |

The major conferencing platforms have built equivalent features in house:

- **Zoom AI Companion** launched in September 2023 as a rebrand of Zoom IQ, whose meeting summary feature had been introduced earlier that year. It generates meeting summaries, chat summaries, and email drafts at no additional cost on paid Zoom plans. A controversy in August 2023 over Zoom's terms of service, which appeared to permit training on customer data without explicit consent, forced the company to revise its policy and add explicit opt-in language.[35]
- **Microsoft Teams Intelligent Recap**, available with Teams Premium and Copilot for Microsoft 365 since late 2023, produces meeting recaps with auto-generated chapters, mentions, and follow-up tasks.
- **Google Meet "Take notes for me"** was announced at Google Cloud Next in April 2024 and rolled out broadly later that year.[36] It generates a Google Doc with key points and decisions and emails the doc to the meeting organizer.

Meeting summarizers are the area where users have raised the most pointed privacy concerns; many products now require explicit notice that a bot is recording, and all-party-consent U.S. states such as California require consent from every participant.

## Email summarizers

Email is one of the highest-volume summarization targets and one of the first places consumers encountered AI summaries by default.

- **Superhuman AI** added AI Insights and AI-generated thread summaries in 2023 and expanded the feature set through 2024 and 2025. The client also offers AI reply drafts trained on the user's writing style.
- **Shortwave**, founded by ex-Google engineers in 2020, built a Gmail front end with AI search, thread summarization, and reply drafts. It uses both [OpenAI](/wiki/openai) and [Anthropic](/wiki/anthropic) models.
- **HEY** introduced AI summaries for long threads in 2024, presented as an optional companion view rather than an interruption.
- **Gmail**'s Gemini side panel generates thread summaries and reply drafts on Workspace plans that include Gemini.
- **Outlook with Copilot** summarizes long email chains and meeting threads on Microsoft 365 plans that include Copilot.
- **Apple Mail summaries**, part of [Apple Intelligence](/wiki/apple_intelligence), replaced the standard preview line in the iOS Mail app starting with iOS 18.1 in October 2024.[21]

Quality is uneven. Apple's first version of mail summaries was widely panned for collapsing two-sentence emails into nearly identical two-sentence summaries. Superhuman's summaries were better received because they appear only on genuinely long threads.

## Apple Intelligence notification summaries

Notification summaries are the most widely deployed AI summary feature in the world by user count, simply because they ship by default on every Apple Intelligence-capable iPhone, iPad, and Mac. The feature condenses multiple notifications from the same app, or a single long notification, into a one- or two-line summary displayed on the lock screen or in the notification stack.

The feature launched on October 28, 2024, with iOS 18.1, iPadOS 18.1, and macOS Sequoia 15.1.[21] It runs primarily on device: Apple's roughly three-billion-parameter on-device foundation language model handles most summary generation.

### BBC News and the December 2024 to January 2025 incident

Within weeks of launch, screenshots circulated of inaccurate notification summaries attributed to news outlets. The most prominent involved [BBC News](https://www.bbc.co.uk/news). In mid-December 2024, a notification summary stated that Luigi Mangione, the suspect in the killing of UnitedHealthcare CEO Brian Thompson, had "shot himself," which was not true.[22] The summary was attributed to the BBC even though the BBC had reported no such thing. The BBC filed a formal complaint on December 13, 2024.[22]

More examples surfaced. A notification incorrectly summarized a New York Times alert as saying Israeli Prime Minister Benjamin Netanyahu had been arrested.[22] A summary about Rafael Nadal incorrectly stated the tennis player had come out as gay.[29] A summary of a darts story attributed a championship win to Luke Littler before the final had been played. Reporters Without Borders called on Apple to remove the feature.[29]

Apple's response came in stages. On January 16, 2025, the company released iOS 18.3 beta with notification summaries disabled by default for News and Entertainment apps and a clear italicized indicator that any summary was AI-generated.[23] The production release of iOS 18.3 kept the same restrictions. As of mid-2026, notification summaries for News and Entertainment remained off pending accuracy improvements, and a class action lawsuit over Apple Intelligence marketing reached a $250 million settlement in May 2026.

The incident is now a standard case study. The technical problem is straightforward: the on-device model was being asked to compress already tightly worded news headlines, where further compression almost guarantees information loss, with no mechanism to verify claims against the underlying article. The reputational problem was broader; summaries appeared under the BBC's banner even though Apple had generated them.

## Academic and research summarizers

Summarizing scientific papers is one of the oldest application areas for automatic summarization. Several dedicated products have built workflows around this domain.

- **Scholarcy** generates structured "flashcard" summaries of papers, with sections for purpose, methods, findings, comparisons, limitations, and references. The London company licenses to libraries and universities.
- **SciSpace** (formerly Typeset), based in Bengaluru, India, combines paper search with a Copilot that summarizes selected text, equations, and figures. SciSpace acquired **Explainpaper** in 2023.
- **Semantic Scholar TLDR** is a feature of the [Allen Institute for AI](/wiki/allen_institute_for_ai)'s academic search engine. Each qualifying paper receives a one- or two-sentence TLDR produced by SciTLDR, introduced in Cachola et al.'s EMNLP 2020 paper "TLDR: Extreme Summarization of Scientific Documents."[11]
- **PaperQA** and the follow-up **PaperQA2**, developed by FutureHouse and collaborators, are open-source retrieval-augmented systems that produce literature summaries with full citations. A 2024 evaluation showed PaperQA2 producing literature reviews comparable to those written by domain experts, with much lower hallucination than direct LLM use.[32]
- **Elicit**, originally an Ought research project and now a standalone company, summarizes research questions across thousands of papers and is widely used for evidence reviews.
- **Consensus** is a similar product focused on yes/no research questions, with synthesized answers across the relevant literature.

Long-context [LLMs](/wiki/large_language_model) reshaped this category. By 2024, it was feasible to feed an entire 60-page review article into [Claude](/wiki/claude) or [Gemini](/wiki/gemini) and get a section-by-section summary in one call. Specialized products still win on workflow, citation hygiene, and integration with reference managers like Zotero and Mendeley.

## Video and audio summarizers

Video and audio summarizers transcribe and condense long-form spoken content into text. They are especially common for YouTube, podcasts, and recorded lectures.

- **YouTube Summary with ChatGPT & Claude**, originally a Chrome extension by Glasp launched in early 2023, adds a summary panel to YouTube watch pages. The extension pulls the video transcript and sends it to a chosen model.
- **Eightify**, based in Tbilisi, Georgia, is a similar YouTube extension that produces bullet-point summaries with chapter timestamps and a key-quotes view.
- **Mindgrasp**, based in the United States, summarizes YouTube videos, PDFs, audio files, and live lectures into notes, flashcards, and quiz questions. It targets students.
- **Summarize.tech** is a lighter web tool that produces a paragraph summary from a YouTube URL.
- **Snipd**, a podcast app, generates AI summaries and "snip" highlight clips from podcast episodes.
- **YouTube's own AI summary** feature, rolled out gradually in 2024 and 2025, shows a topic summary beneath supported videos when the user opens the description.

These products generally rely on YouTube's auto-generated captions where available, or on an [automatic speech recognition](/wiki/speech_recognition) model such as Whisper for podcasts and uploaded audio. Quality depends heavily on transcript quality; heavy accents, music, and overlapping speech all degrade output.

## Evaluation

Evaluating summarization is notoriously hard. A summary can be short, fluent, and grammatically perfect while still missing the main point or inventing facts that are not in the source. Several metrics are in standard use.

### ROUGE

[ROUGE](/wiki/rouge_score) (Recall-Oriented Understudy for Gisting Evaluation), introduced by Chin-Yew Lin in 2004, is the most widely cited automatic summarization metric.[5] ROUGE compares a generated summary against one or more human reference summaries and reports n-gram overlap. The most common variants are ROUGE-1 (unigram overlap), ROUGE-2 (bigram overlap), and ROUGE-L (longest common subsequence).[5] ROUGE is simple to compute and reproducible, which is why it has remained the default reporting metric for two decades. Its weaknesses are well documented: it rewards lexical overlap rather than semantic similarity, penalizes valid paraphrases, and correlates only weakly with human judgments of factuality.[15]
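
The overlap computation behind these variants can be sketched as follows. This is a minimal recall-only illustration under simplifying assumptions: lowercase whitespace tokenization, a single reference, and no stemming or punctuation handling. Published ROUGE implementations also report precision and F-measure, so the numbers here are not directly comparable to reported scores.

```python
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Multiset of n-grams in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate: str, reference: str, n: int = 1) -> float:
    """Fraction of reference n-grams that also appear in the candidate (clipped counts)."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum((cand & ref).values())  # Counter intersection takes the minimum count
    return overlap / total


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence, via row-by-row dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def rouge_l_recall(candidate: str, reference: str) -> float:
    """LCS length divided by the reference length."""
    ref = reference.lower().split()
    if not ref:
        return 0.0
    return lcs_length(candidate.lower().split(), ref) / len(ref)
```

Against the reference "the cat sat on the mat", the candidate "the cat lay on the mat" scores 5/6 on ROUGE-1 recall, while the paraphrase "a feline rested on the rug" scores only 2/6, which illustrates how lexical overlap penalizes valid rewording.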

### BERTScore

BERTScore (Zhang et al., 2019) uses contextual embeddings from [BERT](/wiki/bert) to compute token-level similarity between candidate and reference summaries.[12] Unlike ROUGE, it rewards paraphrases that preserve meaning. BERTScore correlates more strongly with human judgments on many summarization datasets, especially for abstractive systems.[12]
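
The token-level matching step can be sketched with plain vectors. This illustration assumes the embeddings are already available as lists of floats; the toy two-dimensional vectors in the usage note are not real model outputs, and the sketch omits refinements such as inverse-document-frequency weighting. Each token is greedily matched to its most similar counterpart by cosine similarity.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def greedy_match_score(cand: List[Vector], ref: List[Vector]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) from greedy token-to-token matching."""
    if not cand or not ref:
        return 0.0, 0.0, 0.0
    # Precision: how well each candidate token is covered by some reference token.
    precision = sum(max(cosine(c, r) for r in ref) for c in cand) / len(cand)
    # Recall: how well each reference token is covered by some candidate token.
    recall = sum(max(cosine(r, c) for c in cand) for r in ref) / len(ref)
    total = precision + recall
    f1 = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f1
```

For example, `greedy_match_score([[1, 0], [0, 1]], [[1, 0]])` gives a precision of 0.5 (one of the two candidate tokens has no similar reference token) and a recall of 1.0.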

### Factuality metrics

A newer family of metrics focuses on factual consistency between a summary and its source, which standard overlap metrics ignore entirely.

- **SummaC** (Laban et al., 2022) uses natural language inference (NLI) models, applied at the sentence level, to score whether each sentence in a summary is entailed by the source document. The paper introduced two variants, SummaC-ZS (zero-shot) and SummaC-Conv (a convolutional aggregation).[13]
- **QAGS** (Wang, Cho, and Lewis, 2020) and **FEQA** (Durmus, He, and Diab, 2020) generate questions from a candidate summary and check whether a QA model answers them the same way from the source.
- **Q^2** (Honovich et al., 2021) automates factual consistency evaluation for knowledge-grounded dialogue using question generation and answer matching.[14]
- **FactScore** (Min et al., 2023) decomposes generated text into atomic facts and checks each against a reliable knowledge source.[33]
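
The sentence-level NLI idea can be sketched as a simple aggregation: score every (source sentence, summary sentence) pair, keep the best-supported score for each summary sentence, and average. This is a hedged illustration of a zero-shot style aggregation, not the SummaC reference implementation. The `entail` callable is a hypothetical stand-in for an NLI model, and `toy_entail` is a crude word-overlap placeholder used only so the example runs.

```python
from typing import Callable, List


def toy_entail(premise: str, hypothesis: str) -> float:
    """Placeholder scorer (NOT a real NLI model): share of hypothesis words found in the premise."""
    premise_words = set(premise.lower().split())
    hypothesis_words = hypothesis.lower().split()
    if not hypothesis_words:
        return 0.0
    return sum(w in premise_words for w in hypothesis_words) / len(hypothesis_words)


def sentence_support(source: List[str], summary: List[str],
                     entail: Callable[[str, str], float]) -> List[float]:
    """For each summary sentence, the highest entailment score over all source sentences."""
    return [max(entail(src, sent) for src in source) for sent in summary]


def consistency_score(source: List[str], summary: List[str],
                      entail: Callable[[str, str], float]) -> float:
    """Mean of the per-sentence support scores; 0.0 if either side is empty."""
    if not source or not summary:
        return 0.0
    support = sentence_support(source, summary, entail)
    return sum(support) / len(support)
```

With a source of "the suspect was arrested on monday" and "police said he is in custody", the summary sentence "the suspect was arrested" is fully supported under the toy scorer, while "the suspect shot himself" scores only 0.5, which pulls the overall score down to 0.75.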

### Benchmarks and datasets

- **SummEval** (Fabbri et al., 2021) re-evaluated 16 summarization systems on CNN/DailyMail with crowd and expert annotations, finding that most automatic metrics correlate only weakly with human judgments.[15]
- **CNN/DailyMail** (Hermann et al., 2015) is the long-standing single-document benchmark with around 300,000 article-summary pairs.[34]
- **XSum** (Narayan et al., 2018) pairs BBC news articles with single-sentence summaries and is intentionally more abstractive than CNN/DailyMail.[16]
- **MultiNews** (Fabbri et al., 2019) provides multi-document summarization data from newser.com.[31]
- **arXiv** and **PubMed** datasets (Cohan et al., 2018) provide long-document summarization data with structured abstracts as targets.[30]
- **SciTLDR** (Cachola et al., 2020) is the extreme-summarization dataset behind Semantic Scholar's TLDR feature.[11]

Production systems are increasingly evaluated through LLM-as-judge protocols, where a separate model rates summaries on faithfulness, completeness, and conciseness. These protocols are faster than crowdsourcing but inherit biases from the judge model.

## Known failure modes

### Hallucinated facts

Abstractive summaries frequently include details not present in the source. Maynez et al. (2020) found that more than 70% of summaries generated by then-state-of-the-art models on XSum contained at least one hallucinated fact.[17] Modern [LLMs](/wiki/large_language_model) hallucinate less in absolute terms, but the failure mode persists, especially under aggressive length constraints.

### Compression artifacts in news headlines

When the input is already condensed, as with a push notification or breaking-news alert, further summarization risks erasing crucial qualifiers ("alleged," "police say," "according to a source"). The Apple Intelligence BBC incident in December 2024 is a textbook example. The on-device model received notifications that included a headline along the lines of "Mangione was the suspect police were searching for," and the resulting summary collapsed them into the inaccurate, defamatory statement that he "shot himself."[22]

### Missed nuance and false confidence

Summaries flatten hedging language. Phrases like "some researchers argue" or "in preliminary data" routinely disappear in compressed forms, leaving categorical statements behind. This is particularly damaging for medical, legal, and scientific summaries.
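
One lightweight guard against this failure is to check whether hedging phrases present in the source survive into the summary. The sketch below is a heuristic illustration only: the phrase list is an assumption drawn from the qualifiers mentioned in this article, and a phrase match says nothing about whether the hedge is attached to the right claim.

```python
import re
from typing import List, Sequence

# Illustrative hedge list (an assumption; a real list would be domain-specific).
HEDGES = ("alleged", "allegedly", "reportedly", "police say", "according to",
          "preliminary", "some researchers", "may", "might", "suggests")


def contains_phrase(text: str, phrase: str) -> bool:
    """Case-insensitive whole-word match, so 'may' does not match 'mayor'."""
    return re.search(r"\b" + re.escape(phrase) + r"\b", text, re.IGNORECASE) is not None


def dropped_hedges(source: str, summary: str,
                   hedges: Sequence[str] = HEDGES) -> List[str]:
    """Hedging phrases that appear in the source but not in the summary."""
    return [h for h in hedges
            if contains_phrase(source, h) and not contains_phrase(summary, h)]
```

A non-empty result could be used to flag a summary for review or to trigger a regeneration with an instruction to preserve attribution.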

### Length-limit issues

Product UX often imposes a target length (one paragraph, three bullets, 280 characters) that is shorter than the source can plausibly be reduced to without losing content. Models comply with the constraint by dropping information rather than by signaling that the requested compression is too aggressive.

### Citation drift

When a summary is the user-facing product but the source is required for trust, citation hygiene matters. Retrieval-augmented systems such as [Perplexity AI](/wiki/perplexity_ai), [NotebookLM](/wiki/google_notebooklm), and PaperQA tie every claim to a source span, but even these occasionally cite the right source for the wrong claim or invent quotes not present in the cited document.

### Bias inherited from training data

Summary generators inherit biases from the LLMs that drive them. Studies of news summarization have documented systematic underrepresentation of viewpoints and the introduction of frames that did not appear in the source.

### Privacy and consent in meeting capture

In all-party-consent jurisdictions, every meeting participant must be informed and must consent to recording. Several bot-based products have been the subject of complaints when they joined calls without clear notice, and Read.ai's always-on personal AI mode drew widely shared concerns in 2024.

## See also

- [Text summarization](/wiki/text_summarization)
- [Summarization models](/wiki/summarization_models)
- [ROUGE](/wiki/rouge_score)
- [Natural language processing](/wiki/natural_language_processing)
- [Large language model](/wiki/large_language_model)
- [Hallucination](/wiki/hallucination)
- [Apple Intelligence](/wiki/apple_intelligence)
- [NotebookLM](/wiki/google_notebooklm)
- [Otter.ai](/wiki/otter_ai)
- [Fireflies.ai](/wiki/fireflies_ai)
- [Granola](/wiki/granola_ai)
- [QuillBot](/wiki/quillbot)
- [Perplexity AI](/wiki/perplexity_ai)
- [ChatGPT](/wiki/chatgpt)
- [Claude](/wiki/claude)
- [Gemini](/wiki/gemini)
- [Microsoft Copilot](/wiki/microsoft_copilot)
- [Krisp](/wiki/krisp_ai)
- [Notion AI](/wiki/notion_ai)
- [BERT](/wiki/bert)
- [Context window](/wiki/context_window)
- [Speech recognition](/wiki/speech_recognition)

## References

1. Luhn, H. P. (1958). "The Automatic Creation of Literature Abstracts." IBM Journal of Research and Development. https://web.stanford.edu/class/linguist289/luhn57.pdf
2. Edmundson, H. P. (1969). "New Methods in Automatic Extracting." Journal of the ACM. https://courses.ischool.berkeley.edu/i256/f06/papers/edmonson69.pdf
3. Mihalcea, R., and Tarau, P. (2004). "TextRank: Bringing Order into Text." EMNLP 2004.
4. Erkan, G., and Radev, D. (2004). "LexRank: Graph-based Lexical Centrality as Salience in Text Summarization." Journal of Artificial Intelligence Research.
5. Lin, C.-Y. (2004). "ROUGE: A Package for Automatic Evaluation of Summaries." Text Summarization Branches Out, ACL Workshop. https://aclanthology.org/W04-1013/
6. Rush, A. M., Chopra, S., and Weston, J. (2015). "A Neural Attention Model for Abstractive Sentence Summarization." EMNLP 2015.
7. See, A., Liu, P. J., and Manning, C. D. (2017). "Get To The Point: Summarization with Pointer-Generator Networks." ACL 2017.
8. Lewis, M., et al. (2019). "BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension." https://arxiv.org/abs/1910.13461
9. Raffel, C., et al. (2020). "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer." Journal of Machine Learning Research.
10. Zhang, J., et al. (2020). "PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization." ICML 2020.
11. Cachola, I., Lo, K., Cohan, A., and Weld, D. S. (2020). "TLDR: Extreme Summarization of Scientific Documents." Findings of EMNLP 2020. https://aclanthology.org/2020.findings-emnlp.428/
12. Zhang, T., Kishore, V., Wu, F., Weinberger, K. Q., and Artzi, Y. (2020). "BERTScore: Evaluating Text Generation with BERT." ICLR 2020. https://arxiv.org/abs/1904.09675
13. Laban, P., Schnabel, T., Bennett, P. N., and Hearst, M. A. (2022). "SummaC: Re-Visiting NLI-based Models for Inconsistency Detection in Summarization." TACL.
14. Honovich, O., et al. (2021). "Q^2: Evaluating Factual Consistency in Knowledge-Grounded Dialogues via Question Generation and Question Answering." EMNLP 2021.
15. Fabbri, A. R., et al. (2021). "SummEval: Re-evaluating Summarization Evaluation." TACL.
16. Narayan, S., Cohen, S. B., and Lapata, M. (2018). "Don't Give Me the Details, Just the Summary! Topic-Aware Convolutional Neural Networks for Extreme Summarization." EMNLP 2018.
17. Maynez, J., Narayan, S., Bohnet, B., and McDonald, R. (2020). "On Faithfulness and Factuality in Abstractive Summarization." ACL 2020. https://aclanthology.org/2020.acl-main.173/
18. Anthropic. "Introducing 100K Context Windows." May 11, 2023. https://www.anthropic.com/news/100k-context-windows
19. Anthropic. "Claude 2.1." November 21, 2023. https://www.anthropic.com/news/claude-2-1
20. Google. "Our next-generation model: Gemini 1.5." February 15, 2024. https://blog.google/technology/ai/google-gemini-next-generation-model-february-2024/
21. Apple. "Apple Intelligence is available today on iPhone, iPad, and Mac." October 28, 2024. https://www.apple.com/newsroom/2024/10/apple-intelligence-is-available-today-on-iphone-ipad-and-mac/
22. BBC News. "Apple urged to withdraw 'out of control' AI news alerts." December 18, 2024. https://www.bbc.co.uk/news/articles/cge93de21n0o
23. CNBC. "Apple disables AI notifications for news in its beta iPhone software." January 16, 2025. https://www.cnbc.com/2025/01/16/apple-disables-ai-notifications-for-news-in-its-beta-iphone-software.html
24. Google. "Introducing NotebookLM, a new AI-powered research assistant." July 12, 2023. https://blog.google/technology/ai/notebooklm-google-ai/
25. Google. "Audio Overviews in NotebookLM." September 11, 2024. https://blog.google/technology/ai/notebooklm-audio-overviews/
26. TechCrunch. "Otter.ai surpasses $100M ARR as enterprise demand drives meeting AI adoption." March 11, 2025.
27. TechCrunch. "Fireflies.ai hits $1B valuation through tender offer." June 2025.
28. The Verge. "Granola, the AI note-taker, raises $43 million." April 30, 2025.
29. Reporters Without Borders. "Apple must remove its generative AI feature after a fresh false headline." January 2025.
30. Cohan, A., et al. (2018). "A Discourse-Aware Attention Model for Abstractive Summarization of Long Documents." NAACL-HLT 2018. https://aclanthology.org/N18-2097/
31. Fabbri, A. R., Li, I., She, T., Li, S., and Radev, D. R. (2019). "Multi-News: A Large-Scale Multi-Document Summarization Dataset and Abstractive Hierarchical Model." ACL 2019.
32. Skarlinski, M. D., et al. (2024). "Language Agents Achieve Superhuman Synthesis of Scientific Knowledge" (PaperQA2). FutureHouse. https://arxiv.org/abs/2409.13740
33. Min, S., et al. (2023). "FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation." EMNLP 2023.
34. Hermann, K. M., et al. (2015). "Teaching Machines to Read and Comprehend." NeurIPS 2015.
35. Reuters. "Zoom updates terms after AI-training backlash." August 11, 2023.
36. Google Cloud Blog. "Take notes for me in Google Meet." Google Cloud Next, April 2024.