# Grounding (artificial intelligence)

> Source: https://aiwiki.ai/wiki/grounding
> Updated: 2026-06-23
> Categories: AI Safety, Artificial Intelligence, Large Language Models, Natural Language Processing
> From AI Wiki (https://aiwiki.ai), a free encyclopedia of artificial intelligence. Quote with attribution.

Grounding in [artificial intelligence](/wiki/artificial_intelligence) is the process of anchoring an AI system's outputs to verifiable, real-world information so that each claim can be traced back to an external source rather than generated from memory alone. Instead of letting a model answer purely from patterns learned during training, grounding connects its outputs to retrieved documents, search results, databases, or sensory inputs that can be independently checked. The technique is the primary engineering defense against [hallucination](/wiki/hallucination), the tendency of [large language models](/wiki/large_language_model) (LLMs) to produce plausible-sounding but factually incorrect statements. In well-configured retrieval systems, grounding can reduce hallucination rates substantially, and Google DeepMind's FACTS Benchmark Suite shows that even the strongest grounded models top out below 70% factuality, with Gemini 3 Pro leading at a FACTS Score of 68.8% as of early 2026.[6]

Grounding operates at the intersection of information retrieval, knowledge representation, and language generation. When an AI system is grounded, its responses are tethered to specific documents, databases, search results, or other knowledge sources, and users can trace claims back to their origins. This traceability is what separates a grounded response from a speculative one.

## What is the symbol grounding problem?

The idea of grounding in AI predates the current wave of large language models by several decades. In 1990, cognitive scientist Stevan Harnad published "The Symbol Grounding Problem" in *Physica D: Nonlinear Phenomena*, posing a question that remains relevant today: how can the symbols manipulated by a computational system acquire meaning intrinsic to the system, rather than meaning parasitic on human interpretation?[1] Harnad framed the core puzzle as "How can the meanings of the meaningless symbol tokens, manipulated solely on the basis of their (arbitrary) shapes, be grounded in anything but other meaningless symbols?"[1] He compared the problem to trying to learn Chinese using only a Chinese-to-Chinese dictionary. Without some connection to real-world experience, the symbols remain unanchored.

Harnad proposed that symbolic representations must be grounded "bottom-up" in two kinds of nonsymbolic representations: iconic representations (analogs of sensory projections of distal objects and events) and categorical representations (learned feature detectors that pick out invariant features of object categories).[1] He suggested that [connectionism](/wiki/connectionism) could serve as the mechanism for learning these invariant features, acting as a bridge between raw sensory data and symbolic labels.[1]

The symbol grounding problem was originally framed in the context of classical AI and cognitive science. With the rise of LLMs in the 2020s, the problem has taken on new practical urgency. Modern language models process tokens statistically without any built-in connection to the world those tokens describe. Grounding techniques have emerged as engineering solutions to this gap.

## Why does grounding matter for LLMs?

Large language models generate text by predicting the most likely next token in a sequence. This process produces fluent, coherent text, but it carries no guarantee of factual accuracy. The model has no internal concept of truth; it has a model of plausibility based on training data. When the training data is incomplete, outdated, or ambiguous, the model may confidently produce incorrect statements.

Grounding addresses this problem by providing the model with access to authoritative information at inference time. Instead of relying entirely on parametric knowledge (what the model learned during training), a grounded system retrieves relevant external data and uses it to inform the response. The benefits include:

- **Reduced hallucination rates**: By constraining the model's output to information supported by retrieved sources, grounding can decrease hallucinations by 60 to 80 percent in well-configured systems.
- **Access to current information**: LLM training data has a cutoff date. Grounding through web search or live databases allows the model to answer questions about recent events.
- **Verifiability**: Grounded responses can include citations pointing to specific sources, enabling users to check claims independently.
- **Domain specificity**: Enterprise deployments can ground models in proprietary data (internal documents, product catalogs, customer records) that was never part of public training data.

## What are the main grounding techniques?

Several approaches to grounding have emerged, each suited to different use cases. They can be combined in a single system.

### Retrieval-augmented generation (RAG)

[Retrieval-augmented generation](/wiki/retrieval_augmented_generation) (RAG) is the most widely adopted grounding technique. Introduced by Lewis et al. at Facebook AI Research in 2020, RAG adds an information retrieval step before text generation.[2] The process works in three stages:

1. **Retrieve**: Given a user query, the system searches a knowledge base (vector database, document store, or search index) for relevant passages.
2. **Augment**: The retrieved passages are inserted into the model's prompt as additional context.
3. **Generate**: The model produces a response informed by both its parametric knowledge and the retrieved context.

RAG systems typically use [embedding](/wiki/embeddings) models to convert both queries and documents into vector representations, then find the most semantically similar documents using approximate nearest neighbor search.[2] Popular vector databases for RAG include [Pinecone](/wiki/pinecone), [Weaviate](/wiki/weaviate), [Chroma](/wiki/chroma), and [FAISS](/wiki/faiss).

The strength of RAG lies in its flexibility. The knowledge base can be updated independently of the model, allowing the system to stay current without retraining. However, RAG quality depends heavily on the retrieval step: if the wrong documents are retrieved, the model may still produce inaccurate responses or ignore the retrieved context entirely.

### Web search grounding

Web search grounding connects a language model to a search engine, allowing it to query the live web for information before generating a response. This approach is especially useful for questions about current events, rapidly changing data, or topics not well represented in the model's training data.

Several major AI platforms have implemented web search grounding:

**Google Grounding with Google Search**: Google offers grounding as a built-in tool for the [Gemini](/wiki/gemini) API. When enabled, the model can analyze a prompt, determine whether a web search would improve the answer, automatically generate search queries, execute them, and synthesize findings into a cited response.[8] The API returns structured grounding metadata including the search queries used, grounding chunks (source URLs and titles), and grounding supports (mappings from specific text segments in the response to their source chunks).[8] This metadata allows developers to build inline citation experiences. Google has extended this capability to include Grounding with Google Maps for spatial and local business queries.[7]

**[Perplexity AI](/wiki/perplexity_ai)**: [Perplexity](/wiki/perplexity) is an answer engine built entirely around search grounding. Every response begins with a live web search. Retrieved documents are processed through embedding and extraction pipelines to identify relevant snippets, which are then fed to an LLM alongside the original query. A core design principle at Perplexity is that the model should not say anything it did not retrieve, which goes beyond standard RAG by explicitly prohibiting unsupported claims. Each response includes numbered inline citations linking to the original sources.

**OpenAI ChatGPT Search**: [ChatGPT](/wiki/chatgpt) integrated web search capabilities, making search available to all users in February 2025.[12] The system retrieves information from the web before responding, and responses include links to relevant source pages.[12] The web search tool is also available through the [OpenAI API](/wiki/openai_api) via the Responses API, where models such as gpt-4o-search-preview are designed for search-augmented generation.

**Anthropic Claude Web Search**: [Anthropic](/wiki/anthropic) introduced a web search tool for the Claude API, allowing [Claude](/wiki/claude) models to perform web searches during generation.[11] When the model determines that a query requires information beyond its training data, it generates targeted search queries, analyzes the results, and provides a response with citations to source materials.[11] The tool is available for Claude 3.7 Sonnet, the upgraded Claude 3.5 Sonnet, and Claude 3.5 Haiku.[11]

### Citation generation

Citation generation is the mechanism by which a grounded AI system attributes specific claims to specific sources. Rather than simply using retrieved information to shape a response, citation generation creates an explicit link between output text and the documents that support it.

Anthropic's Citations API, introduced in January 2025 for Claude models, allows developers to supply source documents in the context window.[10] During generation, Claude automatically cites its output with references to the exact sentences and passages it used, producing verifiable and traceable responses.[10] Anthropic reported that the feature increased recall accuracy by up to 15 percent compared to custom prompt-based citation approaches.[10]

Google's Gemini API returns grounding supports that map segments of the generated text to grounding chunks (source URLs), enabling developers to render inline citations programmatically.[8]

Recent research has explored citation as an architectural principle. A 2024 study on citation-grounded code comprehension achieved 92 percent citation accuracy with zero hallucinations by combining BM25 sparse matching, BGE dense embeddings, and Neo4j graph expansion through import relationships, outperforming single-mode baselines by 14 to 18 percentage points.[15]

### Tool use and function calling

[Tool use](/wiki/tool_use) (also called function calling) is a grounding technique in which the model can invoke external tools, APIs, or functions to obtain factual information. Instead of generating an answer from memory, the model recognizes that it needs specific data, calls the appropriate tool, and incorporates the returned data into its response.

Examples include:

- Calling a weather API to answer "What is the temperature in Tokyo right now?"
- Querying a database to retrieve a customer's order history.
- Running a code interpreter to verify a mathematical calculation.
- Accessing a calendar API to check schedule availability.

Tool use grounds the model in live, structured data and is especially useful for tasks that require precision (dates, numbers, prices) or access to private systems. Major LLM providers including [OpenAI](/wiki/openai), Anthropic, and [Google](/wiki/google) support tool use in their APIs.

### Knowledge graph grounding

[Knowledge graphs](/wiki/knowledge_graph) provide structured, relational representations of facts. Grounding an LLM with a knowledge graph involves querying the graph for entities and relationships relevant to the user's question, then providing this structured information as context. Unlike unstructured document retrieval, knowledge graph grounding can supply precise factual triples (e.g., "Paris - capital of - France") and navigate multi-hop relationships.

This approach is common in enterprise settings where organizations maintain domain-specific knowledge graphs covering products, processes, or regulatory requirements.

## What is visual grounding?

Visual grounding is a distinct subfield that connects natural language descriptions to specific regions within images or video. While textual grounding anchors language model outputs to factual sources, visual grounding anchors language to visual perception.

### Definition and tasks

Visual grounding, also called Referring Expression Comprehension (REC), involves localizing a specific region within an image based on a textual description (a "referring expression").[16] The field encompasses several related tasks:

| Task | Description | Output |
|------|-------------|--------|
| Referring Expression Comprehension (REC) | Locate the object described by a sentence | Bounding box |
| Referring Expression Segmentation (RES) | Segment the object described by a sentence | Pixel [mask](/wiki/mask_benchmark) |
| Phrase Grounding | Locate multiple objects described by noun phrases | Multiple bounding boxes |
| Generalized Visual Grounding | Ground one, multiple, or zero objects from textual input | Variable bounding boxes |

Visual grounding is foundational to applications such as visual question answering, multimodal dialogue systems, robotic instruction following, and interactive image editing.[16]

### Grounding DINO

Grounding DINO is an open-set [object detection](/wiki/object_detection) model developed by IDEA Research.[17] The paper, authored by Shilong Liu, Zhaoyang Zeng, Tianhe Ren, Feng Li, Hao Zhang, and colleagues, was first released on arXiv in March 2023 and published at [ECCV](/wiki/eccv) 2024.[3]

The model combines the DINO ([DETR](/wiki/detr) with Improved deNoising anchOr boxes) [transformer](/wiki/transformer)-based detector with grounded pre-training, enabling it to detect arbitrary objects specified through natural language inputs such as category names or descriptive phrases.[3] The architecture consists of three key components:

1. **Feature enhancer**: Fuses language and vision features through cross-modality attention.
2. **Language-guided query selection**: Uses language features to guide the selection of detection queries, scoring image features against text tokens.
3. **Cross-modality decoder**: Performs alternating self-attention with image-to-text and text-to-image cross-attention for refined detection.

Grounding DINO achieved 52.5 AP on the COCO detection zero-shot transfer benchmark without any COCO training data and set a record of 26.1 mean AP on the ODinW zero-shot benchmark.[3]

**Grounding DINO 1.5**, released in May 2024, includes two variants:[4]

| Model | LVIS-minival AP (zero-shot) | COCO AP | Speed (TensorRT) | Focus |
|-------|---------------------------|---------|-------------------|-------|
| Grounding DINO 1.5 Pro | 55.7 | 54.3 | N/A | Maximum accuracy |
| Grounding DINO 1.5 Edge | 36.2 | N/A | 75.2 FPS | Edge deployment |

The Pro model scales up the architecture with an enhanced vision backbone and training on over 20 million images with grounding annotations. When fine-tuned on LVIS, the Pro model reaches 68.1 AP on LVIS-minival.[4] The Edge model is optimized for real-time inference on resource-constrained devices.[4]

## How does grounding differ from RAG?

Grounding and [retrieval-augmented generation](/wiki/retrieval_augmented_generation) are closely related but not identical concepts. Understanding the distinction helps clarify how modern AI systems are designed.

| Aspect | Grounding | RAG |
|--------|-----------|-----|
| Definition | The outcome of tethering AI responses to verifiable facts | A specific technical method for connecting LLMs to external data |
| Scope | Broad concept covering any technique that anchors AI output to reality | A particular architecture (retrieve, augment, generate) |
| Relationship | The goal | One means of achieving the goal |
| Other methods | Web search, tool use, knowledge graphs, citation APIs | Primarily document retrieval and prompt augmentation |
| Analogy | The destination | One route to the destination |

RAG is one of several techniques for achieving grounding. Other grounding methods include web search integration, tool use, knowledge graph queries, and direct API access to structured data. A fully grounded system might combine multiple techniques: using RAG for internal documents, web search for current events, and tool use for structured data queries.

AWS Prescriptive Guidance describes grounding as the broader process of anchoring model outputs to accurate, retrieved information, with RAG serving as the technical pipeline that performs the retrieval and augmentation.[13]

## What grounding APIs and platforms exist?

Several cloud platforms offer grounding as a managed service, reducing the engineering effort required to build grounded AI applications.

### Google Vertex AI grounding

Google provides multiple grounding options through Vertex AI:[7]

- **Grounding with Google Search**: Connects Gemini models to real-time web content. The model decides when to search, generates queries, executes them, and returns responses with structured citation metadata. Supported across the Gemini 2.0, 2.5, and 3 model families.
- **Grounding with Vertex AI Search**: Grounds responses in enterprise data indexed through Vertex AI Search, useful for internal knowledge bases.
- **Grounding with Google Maps**: Launched for Gemini 3 models, this provides access to spatial data, local business information, and place details for location-aware applications.
- **Custom search API grounding**: Allows grounding against a developer's own search infrastructure.

Billing for Gemini 3 models is per search query executed. Older models are billed per prompt.[7]

### Microsoft Azure groundedness detection

Microsoft Azure AI Content Safety includes a groundedness detection feature that evaluates whether LLM-generated text is supported by provided source material.[9] It operates in two modes:

| Mode | Speed | Output | Best for |
|------|-------|--------|----------|
| Non-Reasoning | Fast | Binary grounded/ungrounded | Real-time production applications |
| Reasoning | Slower | Detailed explanations of ungrounded segments | Development, debugging, root cause analysis |

The service supports domain-specific detection (Medical and Generic domains) and task-specific optimization (Summarization and QnA).[9] A correction feature can automatically rewrite ungrounded text to align with the provided source material. For example, if a summary states "The patient name is Kevin" but the source document says "Jane," the correction feature will fix the discrepancy.[9] Currently, accuracy is optimized for English language content.[9]

### Anthropic Citations API

Anthropic offers a Citations API that lets developers add source documents to Claude's context window.[10] During response generation, Claude automatically cites the exact passages it uses, producing outputs where each claim can be traced to a specific source location.[10] This feature is available for Claude 3.5 and later models.

## How is grounding measured and benchmarked?

Measuring how well a system grounds its outputs is an active area of research. Several metrics and benchmarks have been developed.

### Groundedness metrics

Groundedness (also called faithfulness) measures the degree to which a generated response is supported by the retrieved or provided source documents.[14] It is the inverse of hallucination. Common evaluation approaches include:

- **Claim-level verification**: The response is decomposed into individual claims, and each claim is checked against the source documents.
- **NLI-based scoring**: A [natural language inference](/wiki/natural_language_inference) model classifies each claim as supported, contradicted, or neutral with respect to the source.
- **LLM-as-judge**: A separate language model evaluates whether the response is faithful to the provided context.

Popular open-source evaluation frameworks include RAGAS, TruLens, and DeepEval, each offering automated groundedness scoring for RAG systems.[14]

### FACTS Grounding Leaderboard

The FACTS Grounding benchmark, introduced by [Google DeepMind](/wiki/google_deepmind) and Google Research in December 2024, evaluates LLMs on their ability to generate long-form responses grounded in provided document context.[5] The DeepMind team noted that "large language models (LLMs) are increasingly becoming a primary source for information delivery across diverse use cases, so it's important that their responses are factually accurate."[6] The dataset contains 1,719 examples (860 public, 859 private) with documents up to 32,000 tokens (roughly 20,000 words) covering finance, technology, retail, medicine, and law.[5]

In 2025, Google expanded FACTS into the FACTS Benchmark Suite in collaboration with Kaggle, growing it to 3,513 public examples across four benchmarks:[6]

| Benchmark | What it measures |
|-----------|------------------|
| FACTS Grounding | Ability to ground long-form responses in provided documents |
| FACTS Parametric | Ability to access internal (parametric) knowledge accurately on factoid questions |
| FACTS Search | Ability to use web search as a tool to retrieve and synthesize information |
| FACTS Multimodal | Ability to generate factually accurate text in response to image-based questions |

As of early 2026, Gemini 3 Pro led the leaderboard with a FACTS Score of 68.8%, representing a 55% error rate reduction on the Search benchmark compared to Gemini 2.5 Pro.[6] Notably, no model exceeded a 70% overall FACTS Score, underscoring how far grounded factuality still has to go.[6] The leaderboard is publicly available on Kaggle.

### RAG Triad

The RAG Triad framework evaluates three qualities of retrieval-augmented generation systems: context relevance (are the retrieved documents relevant to the query?), groundedness (is the response faithful to the retrieved context?), and answer relevance (does the response actually answer the question?). These three metrics together provide a view of system performance that covers both the retrieval and generation stages.

## What are the challenges and limitations of grounding?

Despite its effectiveness, grounding in AI faces several practical challenges.

### Retrieval quality

Grounding is only as good as the information retrieved. If the retrieval system returns irrelevant, outdated, or misleading documents, the model may ground its response in poor-quality sources. This problem is compounded in adversarial settings, where manipulated documents could be injected into the retrieval pipeline.

### Latency trade-offs

Grounding adds a retrieval step (or multiple retrieval steps) before generation, increasing response time. For real-time applications like customer support chatbots, this latency can be significant. System architects must balance grounding depth against response speed.

### Context window limitations

Even with large context windows (100,000+ tokens in models like Claude and Gemini), there are limits to how much retrieved context can be provided. When many documents are relevant, the system must decide which to include, and important information may be left out. Models can also exhibit "lost in the middle" effects, paying less attention to information placed in the center of long contexts.

### Knowledge conflicts

When retrieved information contradicts the model's parametric knowledge, the model must decide which source to trust. Research shows that models do not always prefer the retrieved context, sometimes generating responses based on their training data even when the provided documents say otherwise.

### Measurement difficulty

Objectively measuring groundedness remains difficult. Automated metrics (NLI-based, LLM-as-judge) are imperfect proxies for human judgment. Human evaluation is expensive and hard to scale. The interaction between retrieval quality and generation quality makes it challenging to isolate the contribution of grounding specifically.

### Data security and compliance

In enterprise deployments, grounding often involves proprietary or regulated data. Protecting this data from unauthorized access, prompt injection attacks, or inadvertent disclosure through generated responses is a real concern. Organizations must implement access controls, data encryption, and audit trails around their grounding data.

### Static knowledge fallback

Even with grounding, the underlying model's parametric knowledge is frozen at the time of training. If the grounding system fails to retrieve relevant information (due to query formulation issues, index gaps, or connectivity problems), the model falls back on potentially outdated internal knowledge without indicating that its information may be stale.

## Comparison of grounding approaches

The following table compares the major grounding techniques across several dimensions.

| Approach | Data source | Latency impact | Best suited for | Limitations |
|----------|------------|----------------|-----------------|-------------|
| [RAG](/wiki/retrieval_augmented_generation) | Document stores, vector databases | Moderate (retrieval + generation) | Internal knowledge bases, domain-specific QA | Depends on retrieval quality; context window limits |
| Web search grounding | Live web via search engines | Higher (search + parsing + generation) | Current events, general knowledge, fact-checking | Result quality varies; no control over source reliability |
| Tool use / function calling | APIs, databases, code interpreters | Variable (depends on tool) | Structured data queries, calculations, live system access | Requires tool design and API availability |
| [Knowledge graph](/wiki/knowledge_graph) grounding | Structured knowledge graphs | Low to moderate | Entity relationships, multi-hop reasoning | Requires graph construction and maintenance |
| Citation generation | Source documents in context | Minimal (post-processing) | Verifiability, trust, compliance | Does not itself retrieve information; works atop other methods |
| Groundedness detection | Generated text + source comparison | Post-generation check | Quality assurance, safety filtering | Reactive rather than preventive; adds processing step |

## What is grounding used for?

Grounding is applied across a range of industries and use cases:

- **Healthcare**: Medical QA systems grounded in clinical guidelines, drug databases, and peer-reviewed literature to reduce the risk of incorrect medical advice.
- **Legal**: AI assistants grounded in case law databases and regulatory documents to ensure legal accuracy. Early ungrounded LLM deployments in legal contexts produced fabricated case citations, illustrating the consequences of inadequate grounding.
- **Customer support**: Chatbots grounded in product documentation, FAQs, and order management systems to provide accurate, specific answers.
- **Finance**: Trading and advisory tools grounded in market data feeds, SEC filings, and financial news for time-sensitive accuracy.
- **Education**: Tutoring systems grounded in textbooks and curricula to provide factually correct explanations.
- **Search engines**: AI-powered search products like Perplexity, Google AI Overviews, and ChatGPT Search that ground responses in web search results.

## Future directions

Research in grounding continues to advance along several fronts. Multi-agent verification systems assign different roles (content generation, fact checking, citation verification, logical consistency review) to separate agents, creating layered defenses against hallucination. Improvements in retrieval models, including learned sparse retrievers and multi-vector dense retrievers, aim to improve the quality of documents provided to the generator. Work on long-context models seeks to expand the amount of grounding information that can be provided in a single prompt.

The development of standardized benchmarks like FACTS and evaluation frameworks like RAGAS is driving more rigorous comparison of grounding techniques. As LLMs are deployed in higher-stakes settings (medicine, law, finance), the demand for robust, measurable grounding will continue to grow.

Agentic grounding is another active area, where [AI agents](/wiki/ai_agents) autonomously decide when and how to retrieve information, performing multi-step research workflows that involve searching, reading, evaluating source quality, and synthesizing information from diverse sources. Google has also introduced "high-fidelity grounding" modes where the model itself is adapted (not just the retrieval pipeline) to produce more factual responses when grounding is enabled.[7]

## See also

- [Retrieval-augmented generation](/wiki/retrieval_augmented_generation)
- [Hallucination](/wiki/hallucination)
- [Large language models](/wiki/large_language_model)
- [Tool use](/wiki/tool_use)
- [Object detection](/wiki/object_detection)
- [Prompt engineering](/wiki/prompt_engineering)
- [Embeddings](/wiki/embeddings)

## References

1. Harnad, S. (1990). "The Symbol Grounding Problem." *Physica D: Nonlinear Phenomena*, 42(1-3), 335-346.
2. Lewis, P., Perez, E., Piktus, A., et al. (2020). "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks." *Advances in Neural Information Processing Systems*, 33.
3. Liu, S., Zeng, Z., Ren, T., Li, F., Zhang, H., et al. (2023). "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection." *arXiv preprint arXiv:2303.05499*. Published at ECCV 2024.
4. Ren, T., et al. (2024). "Grounding DINO 1.5: Advance the 'Edge' of Open-Set Object Detection." *arXiv preprint arXiv:2405.10300*.
5. Google DeepMind. (2025). "FACTS Grounding: A New Benchmark for Evaluating the Factuality of Large Language Models." *arXiv preprint arXiv:2501.03200*.
6. Google DeepMind. (2026). "FACTS Benchmark Suite: A New Way to Systematically Evaluate LLMs Factuality." *deepmind.google/blog*. https://deepmind.google/blog/facts-benchmark-suite-systematically-evaluating-the-factuality-of-large-language-models/
7. Google Cloud. "Grounding with Google Search." *[Generative AI](/wiki/generative_ai) on Vertex AI Documentation*. https://cloud.google.com/vertex-ai/generative-ai/docs/grounding/grounding-with-google-search
8. Google AI for Developers. "Grounding with Google Search." *Gemini API Documentation*. https://ai.google.dev/gemini-api/docs/google-search
9. Microsoft. "Groundedness Detection in Azure AI Content Safety." *Azure AI Services Documentation*. https://learn.microsoft.com/en-us/azure/ai-services/content-safety/concepts/groundedness
10. Anthropic. (2025). "Introducing Citations on the [Anthropic API](/wiki/anthropic_api)." *Claude Blog*. https://claude.com/blog/introducing-citations-api
11. Anthropic. (2025). "Introducing Web Search on the Anthropic API." *Claude Blog*. https://claude.com/blog/web-search-api
12. OpenAI. (2024). "Introducing ChatGPT Search." *OpenAI Blog*. https://openai.com/index/introducing-chatgpt-search/
13. AWS Prescriptive Guidance. "Grounding and Retrieval Augmented Generation." https://docs.aws.amazon.com/prescriptive-guidance/latest/agentic-ai-serverless/grounding-and-rag.html
14. Deepset. "Measuring LLM Groundedness in RAG Systems with Evaluation Metrics." *Deepset Blog*. https://www.deepset.ai/blog/rag-llm-evaluation-groundedness
15. Chen, Y., et al. (2024). "Citation-Grounded Code Comprehension: Preventing LLM Hallucination Through Hybrid Retrieval and Graph-Augmented Context." *arXiv preprint arXiv:2512.12117*.
16. Zhou, W., et al. (2024). "Towards Visual Grounding: A Survey." *arXiv preprint arXiv:2412.20206*.
17. IDEA Research. "Grounding DINO." GitHub repository. https://github.com/IDEA-Research/GroundingDINO

