# Search Engine

> Source: https://aiwiki.ai/wiki/search_engine
> Updated: 2026-06-23
> Categories: AI Tools & Products, Information Retrieval
> From AI Wiki (https://aiwiki.ai), a free encyclopedia of artificial intelligence. Quote with attribution.

A **search engine** is a software system that retrieves information from a corpus (the web, a private dataset, or a document store) and ranks results by relevance to a user query. Since 2022 the dominant shift in search has been from keyword-matched lists of links toward AI **answer engines** that synthesize a single cited response, such as Google AI Overviews, the AI-native [Perplexity](/wiki/perplexity), [OpenAI](/wiki/openai)'s ChatGPT Search, and Google AI Mode. Google reported that AI Overviews reached over 2 billion monthly users across more than 200 countries and 40 languages by July 2025, making AI-generated answers a default part of mainstream web search. [17]

Modern search engines combine classical information retrieval techniques with [machine learning](/wiki/large_language_model) and [neural networks](/wiki/transformer) to interpret queries, score documents, and increasingly generate direct answers. The technical engine behind the answer-engine shift is [retrieval-augmented generation](/wiki/retrieval_augmented_generation), which fuses a retriever with a [large language model](/wiki/large_language_model) to produce synthesized, cited responses rather than ten blue links.

This article covers the role of artificial intelligence in search: ranking algorithms, vector and semantic retrieval, AI-native answer engines, the keyword-search-to-answer-engine paradigm shift, its impact on web traffic and SEO, and supporting infrastructure.

## History of search and the role of AI

Early web search engines including WebCrawler (1994), Lycos (1994), AltaVista (1995), and Yahoo Directory relied on keyword matching, manual curation, and lexical scoring functions such as TF-IDF and BM25. In 1998, Stanford graduate students Larry Page and Sergey Brin published *The Anatomy of a Large-Scale Hypertextual Web Search Engine* and the companion technical report introducing PageRank, an eigenvector-based algorithm that ranks pages by the structure of inbound links. PageRank powered the launch of [Google](/wiki/google) and remained a central signal in web ranking for two decades.
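
The link-based ranking idea can be illustrated with a short power-iteration sketch. This is a minimal illustration, not Google's production algorithm: the toy link graph, the function name `pagerank`, and the iteration count are assumptions, the damping factor 0.85 is the conventional value, and the code uses the normalized variant in which all ranks sum to 1.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {page: [pages it links to]}."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without outbound links is spread evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            if targets:
                # Each page splits its damped rank equally among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "c" has the most inbound links, "d" has none.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
best_page = max(ranks, key=ranks.get)
```

In this toy graph the page with the most inbound links ends up with the highest rank, and the page nobody links to keeps only the baseline share.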

The mid-2000s brought the learning-to-rank era. At Microsoft Research, Christopher Burges and colleagues developed RankNet (ICML 2005), then LambdaRank, then LambdaMART, a gradient-boosted tree model that won Track 1 of the 2010 Yahoo Learning to Rank Challenge and remains a standard baseline for web ranking.

Neural information retrieval emerged in the early 2010s with the Deep Structured Semantic Model (DSSM) by Huang and colleagues at Microsoft (CIKM 2013), which mapped queries and documents into a shared embedding space. On October 25, 2019, [Google](/wiki/google) announced that it was applying [BERT](/wiki/bert) (Bidirectional Encoder Representations from Transformers) to ranking and featured snippets, estimating it would affect about one in ten English queries.

Dense retrieval became practical in 2020. Karpukhin and colleagues at Facebook AI published Dense Passage Retrieval (DPR, arXiv 2004.04906, EMNLP 2020), showing that a dual BERT encoder outperformed a Lucene BM25 baseline by 9 to 19 absolute percentage points in top-20 passage retrieval accuracy on open-domain question answering benchmarks. The same year, Omar Khattab and Matei Zaharia introduced ColBERT (SIGIR 2020, arXiv 2004.12832), whose late-interaction architecture preserved fine-grained query-document matching while keeping computation tractable. Patrick Lewis and colleagues then proposed retrieval-augmented generation (NeurIPS 2020, arXiv 2005.11401), combining a dense retriever with a sequence-to-sequence generator and establishing the architectural template for today's AI search products.
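
The late-interaction scoring that ColBERT introduced, usually called MaxSim, can be sketched in a few lines. The two-dimensional token vectors below are made up for illustration; a real system would obtain per-token embeddings from a BERT encoder, and the helper names are assumptions.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim_score(query_vecs, doc_vecs):
    """For each query token, take its best-matching document token, then sum the maxima."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


# Hypothetical token embeddings (one vector per token).
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]  # has a close match for each query token
doc_b = [[0.7, 0.7], [0.6, 0.6]]              # only blurry matches

score_a = maxsim_score(query, doc_a)
score_b = maxsim_score(query, doc_b)
```

Because every query token is matched separately, `doc_a` scores higher than `doc_b` even though both documents point in roughly the same average direction, which is the fine-grained matching a single pooled vector would lose.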

## AI techniques in search

### Query understanding

Before retrieval, search systems analyze the query for intent. Common steps include spell correction, query rewriting and expansion, intent classification (navigational, informational, transactional), entity linking against a [knowledge graph](/wiki/knowledge_graph), and language identification. [Google](/wiki/google) layered RankBrain (2015) and BERT (2019) on top of older lexical pipelines to handle conversational and long-tail queries.

### Retrieval

Retrieval is the first pass that selects a candidate set from millions or billions of documents:

- **Lexical**: Okapi BM25, TF-IDF, and language-model scoring over an inverted index. BM25 remains a strong out-of-domain baseline.
- **Dense neural**: bi-encoders such as DPR, Sentence-BERT, GTR, E5, BGE, and Cohere or [OpenAI](/wiki/openai) [embeddings](/wiki/embeddings). Documents are encoded once; queries are encoded at runtime and matched via approximate nearest neighbor search.
- **Late interaction**: ColBERT and ColBERTv2 (NAACL 2022) keep per-token representations and compute MaxSim at query time.
- **Learned sparse**: SPLADE (Formal, Piwowarski, and Clinchant, SIGIR 2021), SPLADE v2, and uniCOIL produce sparse weighted representations that fit in inverted indexes.
- **Hybrid**: combining BM25 with a dense or learned-sparse signal typically outperforms any single method, especially out of domain.
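
As a concrete instance of the lexical family, the sketch below scores a handful of documents with BM25. It is a minimal illustration that assumes whitespace tokenization, the commonly used defaults `k1=1.2` and `b=0.75`, and the non-negative IDF variant popularized by Lucene; the function name and sample documents are hypothetical.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Return one BM25 score per document for a whitespace-tokenized query."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    avgdl = sum(len(tokens) for tokens in tokenized) / n
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1.0 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1.0 - b + b * len(tokens) / avgdl)
            score += idf * tf[term] * (k1 + 1.0) / norm
        scores.append(score)
    return scores


docs = [
    "the cat sat on the mat",
    "dogs chase the cat",
    "search engines rank documents",
]
scores = bm25_scores("cat mat", docs)
```

The first document matches both query terms and scores highest, the second matches only the more common term, and the third scores zero because it shares no term with the query.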

### Ranking

A second-stage reranker scores the top-k candidates more carefully. Gradient-boosted trees such as LambdaMART remain widely used in production ranking. Neural rerankers include cross-encoders such as monoT5 (Nogueira and colleagues, 2020) and RankT5 (Zhuang and colleagues, 2022), as well as RankGPT (Sun and colleagues, EMNLP 2023), which prompts an [LLM](/wiki/large_language_model) to reorder a list of candidate passages.
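
The retrieve-then-rerank split can be sketched as two functions: a cheap pass over the whole collection and a costlier scorer applied only to the survivors. Everything here is illustrative; in particular `phrase_scorer` is a simple bigram-overlap stand-in for a learned reranker such as a cross-encoder, not an implementation of one.

```python
def first_stage(query, docs, k):
    """Cheap, recall-oriented pass: rank every document by shared-term count."""
    q_terms = set(query.lower().split())
    scored = [(len(q_terms & set(doc.lower().split())), i) for i, doc in enumerate(docs)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for overlap, i in scored[:k] if overlap > 0]


def bigrams(tokens):
    """Set of adjacent token pairs."""
    return set(zip(tokens, tokens[1:]))


def phrase_scorer(query, doc):
    """Stand-in for an expensive reranker: counts query bigrams that occur in the document."""
    return len(bigrams(query.lower().split()) & bigrams(doc.lower().split()))


def rerank(query, docs, candidates, scorer):
    """Precision-oriented pass: apply the costlier scorer to the candidates only."""
    return sorted(candidates, key=lambda i: -scorer(query, docs[i]))


docs = [
    "neural ranking of search results",
    "search ranking neural models survey",
    "neural search ranking tutorial",
    "cooking pasta at home",
]
query = "neural search ranking"
candidates = first_stage(query, docs, k=3)
reranked = rerank(query, docs, candidates, phrase_scorer)
```

The first stage cannot tell the three relevant documents apart because they share the same terms, while the second stage promotes the document that preserves the query's word order.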

### Answer generation

Results can be returned as links, extractive snippets (a passage taken verbatim), or abstractive answers produced by an [LLM](/wiki/large_language_model) conditioned on retrieved passages. The latter is the core of [retrieval-augmented generation](/wiki/retrieval_augmented_generation) and underpins answer engines such as Perplexity and ChatGPT Search.

### Index data structures

Lexical search uses inverted indexes. Vector search depends on approximate nearest neighbor (ANN) indexes. Yu. A. Malkov and D. A. Yashunin formalized Hierarchical Navigable Small World graphs ([HNSW](/wiki/hnsw)) in 2016, with the journal version in IEEE TPAMI in 2018. [FAISS](/wiki/faiss), released by Facebook AI Research in March 2017 with Johnson, Douze, and Jegou's billion-scale paper (arXiv 1702.08734), implements IVF, PQ, and HNSW indexes, with GPU support for several index types. Other ANN libraries include Annoy (Spotify), ScaNN (Google, ICML 2020), and DiskANN (Microsoft, NeurIPS 2019).
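
The idea behind an IVF (inverted file) index can be shown with a toy example: vectors are bucketed by their nearest centroid, and a query scans only the closest buckets. This sketch assumes hand-picked centroids (real systems typically learn them with k-means) and hypothetical function names; the parameter name `nprobe` follows the term FAISS uses for the number of buckets scanned.

```python
def sq_dist(u, v):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def build_ivf(vectors, centroids):
    """Assign each vector id to the posting list of its nearest centroid."""
    lists = {c: [] for c in range(len(centroids))}
    for vid, vec in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: sq_dist(vec, centroids[c]))
        lists[nearest].append(vid)
    return lists


def ivf_search(query, vectors, centroids, lists, k=1, nprobe=1):
    """Scan only the nprobe closest posting lists instead of every vector."""
    probe = sorted(range(len(centroids)), key=lambda c: sq_dist(query, centroids[c]))[:nprobe]
    candidates = [vid for c in probe for vid in lists[c]]
    return sorted(candidates, key=lambda vid: sq_dist(query, vectors[vid]))[:k]


vectors = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9], [2.4, 2.4]]
centroids = [[0.0, 0.0], [5.0, 5.0]]  # assumption: fixed by hand for the example
lists = build_ivf(vectors, centroids)

query = [2.6, 2.6]
approx = ivf_search(query, vectors, centroids, lists, nprobe=1)  # scans one list
exact = ivf_search(query, vectors, centroids, lists, nprobe=2)   # scans every list
```

With `nprobe=1` the search misses the true nearest neighbor, which sits just across a bucket boundary, and with `nprobe=2` it finds it. That recall-versus-work trade-off is what makes the index approximate.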

## What is the shift from search engines to answer engines?

The answer-engine shift is the move from search engines that return a ranked list of links to AI systems that read across multiple sources and return one synthesized, cited answer in natural language. A 2024 ACM SIGKDD paper that introduced the term *generative engine* defined these systems as engines that "satisfy queries by synthesizing information from multiple sources and summarizing them using an LLM." [21] Instead of clicking through to a website, the user reads the model's answer directly on the results page, and the underlying sources become citations rather than destinations.

The shift accelerated quickly across the major operators:

| Answer engine | Operator | Reported reach or volume | As of |
| --- | --- | --- | --- |
| Google AI Overviews | Google | Over 2 billion monthly users, 200+ countries, 40 languages [17] | July 2025 |
| Google AI Mode | Google | About 100 million monthly users (US and India) [17] | July 2025 |
| ChatGPT (incl. Search) | OpenAI | About 2.5 billion queries per day; 800 million weekly users [18] | mid-2025 |
| Perplexity | Perplexity | 780 million queries in May 2025, growing 20%+ month over month [20] | May 2025 |

Google introduced AI Mode as a US Search Labs experiment in March 2025 and began a broader US rollout in May 2025, then expanded to additional markets through the rest of the year. [19] AI Mode is a fully conversational, [Gemini](/wiki/google_gemini)-powered interface that issues multiple sub-queries (a technique Google calls query fan-out) and returns an extended generative answer, marking Google's clearest step away from the classic ten-blue-links page.

## AI-powered search engines

A wave of AI-native search products launched between 2022 and 2025.

| Product | Operator | Launched | Notes |
| --- | --- | --- | --- |
| Perplexity AI | Perplexity | Dec 2022 | Founded Aug 2022 by Aravind Srinivas, Denis Yarats, Johnny Ho, Andy Konwinski |
| You.com | You.com Inc. | Nov 2021 beta | Founded 2020 by Richard Socher and Bryan McCann; LLM chat with live web results from Dec 2022 |
| Bing Chat / Copilot | Microsoft | Feb 7, 2023 | First mainstream LLM-backed web search; OpenAI GPT-4; later rebranded Microsoft Copilot |
| ChatGPT Search | OpenAI | Oct 31, 2024 | Evolved from the SearchGPT prototype announced July 2024 |
| Google AI Overviews | Google | May 14, 2024 (US) | Previewed as SGE at Google I/O May 10, 2023; expanded to 100+ countries Oct 2024 |
| Google AI Mode | Google | May 2025 (US, after Mar 2025 Labs) | Conversational, Gemini-powered; query fan-out |
| Google Gemini in Search | Google | 2024 onward | [Gemini](/wiki/google_gemini) models power AI Overviews |
| Brave Search | Brave Software | 2021 beta | Independent index; AI summarizer added 2023 |
| Kagi | Kagi Inc. | 2022 beta | Subscription, ad-free; FastGPT and Kagi Assistant add LLM features |
| Phind | Phind | 2022 | Developer-focused technical Q&A |
| Exa (formerly Metaphor) | Exa Labs | 2022 | Semantic search API |
| Tavily | Tavily | 2023 | Developer search API for [LLM](/wiki/large_language_model) agents |
| Andi | Andi Search | 2022 | Conversational, privacy-oriented |
| Komo | Komo Search | 2022 | AI-native consumer search |

The industry has split into three patterns: traditional engines return ranked links, AI-native engines synthesize an answer with inline citations, and hybrids (such as Google Search after May 2024) show an AI Overview above traditional results.

## Retrieval-augmented generation

RAG is the dominant architecture for AI-powered search. Given a query, the system retrieves a small number of relevant passages (via dense, sparse, or hybrid retrieval), formats them into a prompt, and asks an [LLM](/wiki/large_language_model) to produce an answer grounded in those passages with citations. Retrieval reduces hallucination by anchoring generation to up-to-date evidence. Tradeoffs include retrieval errors propagating into answers, citation drift, and added latency. RAG is the operating principle behind Perplexity, ChatGPT Search, Bing Copilot, AI Overviews, and enterprise systems such as [Glean](/wiki/glean).
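
The retrieve, format, and generate steps can be sketched as follows. This is a minimal illustration under stated assumptions: the retriever is a toy term-overlap ranker, the passages and `example.com` URLs are invented, and `generate` is a placeholder for whatever LLM client an application uses, not a real API.

```python
def retrieve(query, passages, k=2):
    """Toy retriever: rank passages by the number of terms shared with the query."""
    q_terms = set(query.lower().split())
    ranked = sorted(passages, key=lambda p: -len(q_terms & set(p["text"].lower().split())))
    return ranked[:k]


def build_prompt(query, retrieved):
    """Number each passage so the model can cite it inline as [1], [2], ..."""
    sources = "\n".join(
        f"[{i}] {p['text']} (source: {p['url']})" for i, p in enumerate(retrieved, start=1)
    )
    return (
        "Answer the question using only the sources below. "
        "Cite sources inline as [n].\n\n"
        f"Sources:\n{sources}\n\nQuestion: {query}\nAnswer:"
    )


def answer(query, passages, generate):
    """generate is any callable mapping a prompt string to text (hypothetical LLM client)."""
    return generate(build_prompt(query, retrieve(query, passages)))


passages = [
    {"url": "https://example.com/hnsw", "text": "HNSW is a graph index for approximate nearest neighbor search"},
    {"url": "https://example.com/bm25", "text": "BM25 is a lexical ranking function"},
    {"url": "https://example.com/pasta", "text": "Boil pasta in salted water"},
]
query = "what is bm25 ranking"
prompt = build_prompt(query, retrieve(query, passages))
```

Only the retrieved passages reach the prompt, so a retrieval miss at this step is exactly the kind of error that propagates into the generated answer.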

## Vector databases

Production RAG and [semantic search](/wiki/semantic_search) typically rely on a [vector database](/wiki/vector_database) or vector-capable engine.

| System | License | Deployment | Notes |
| --- | --- | --- | --- |
| [Pinecone](/wiki/pinecone) | Proprietary | Managed cloud | Serverless tier since 2024 |
| [Weaviate](/wiki/weaviate) | Open source | Self-host or managed | Hybrid search; embedding modules |
| [Milvus](/wiki/milvus) / Zilliz | Open source / managed | Self-host or managed | LF AI & Data; billion-scale deployments |
| [Qdrant](/wiki/qdrant) | Open source | Self-host or managed | Rust; payload filtering and quantization |
| Chroma | Open source | Embedded or server | Popular with [LangChain](/wiki/langchain) and [LlamaIndex](/wiki/llamaindex) |
| pgvector | Open source | Postgres extension | HNSW and IVFFlat inside Postgres |
| Vespa | Open source | Self-host or cloud | Yahoo origin; native ANN |
| Elasticsearch / OpenSearch | Elastic License, SSPL, or AGPL / Apache 2.0 | Self-host or managed | Dense and sparse vectors; ELSER (Elasticsearch) |
| MongoDB Atlas Vector Search | Proprietary | Managed | Built on Atlas |
| Redis Vector | Source-available | Self-host or managed | Modules for vector and hybrid search |
| LanceDB | Open source | Embedded or cloud | Columnar Lance format |
| [FAISS](/wiki/faiss) | MIT | Library | Library, not a database |

The choice depends on scale, hybrid query needs, and existing infrastructure. Hosted services reduce operations; open-source engines offer more control at scale.

## Search APIs for developers

Providers that expose web search as an API for RAG pipelines and AI agents include the Google Custom Search JSON API, Brave Search API, Exa, [Tavily](/wiki/tavily), Serper, SerpAPI, and the You.com API. Microsoft announced the retirement of the Bing Web Search API effective August 2025, steering customers toward Grounding with Bing Search in Azure AI. The shift from human-facing to agent-facing search APIs is a notable 2024 to 2025 trend.

## Benchmarks

Benchmarks underpin progress in retrieval. The most widely used are listed below.

| Benchmark | Year | Scope | Reference |
| --- | --- | --- | --- |
| MS MARCO | 2016 | Large-scale passage ranking and QA from Bing logs | Nguyen et al., arXiv 1611.09268 |
| TREC Deep Learning Track | 2019 onward | Annual NIST evaluation of deep retrieval | NIST TREC |
| BEIR | 2021 | Zero-shot retrieval across 18 datasets | Thakur et al., NeurIPS Datasets and Benchmarks, arXiv 2104.08663 |
| MTEB (retrieval split) | 2022 | Embedding model evaluation, includes retrieval tasks | Muennighoff et al., arXiv 2210.07316 |
| MIRACL | 2022 | Multilingual ad hoc retrieval (18 languages) | Zhang et al., arXiv 2210.09984 |
| LoTTE | 2022 | Long-tail topic retrieval, paired with ColBERTv2 | Santhanam et al., NAACL 2022 |

A recurring finding from BEIR is that BM25 underperforms dense models on in-domain MS MARCO but generalizes better out of domain. Hybrid retrieval typically wins on both fronts.
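
One simple way to build a hybrid ranking is reciprocal rank fusion (RRF), which merges ranked lists using only rank positions. The article does not say which fusion method the cited results used, so RRF is an assumption chosen for illustration; `k=60` is the constant commonly used with the method, and the document ids are hypothetical.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of doc ids: score(d) = sum over lists of 1 / (k + rank of d)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_ranking = ["d1", "d2", "d3"]   # hypothetical lexical results
dense_ranking = ["d3", "d1", "d4"]  # hypothetical dense results
fused = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
```

Documents that both retrievers rank well rise to the top, and no score calibration between BM25 and embedding similarity is needed because only ranks are used.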

## Open-source search stacks

Apache Lucene underpins both Elasticsearch and OpenSearch (forked from Elasticsearch in 2021 after Elastic changed its license). Apache Solr is another long-running Lucene-based engine. Vespa, originally built at Yahoo and open-sourced in 2017, ships native tensor and ANN support for large-scale [semantic search](/wiki/semantic_search). Meilisearch, Typesense, and Sonic are lightweight engines designed for developer ergonomics. [LangChain](/wiki/langchain) and [LlamaIndex](/wiki/llamaindex) are the most-used frameworks for composing retrievers, vector stores, and [LLMs](/wiki/large_language_model) into RAG applications.

## Applications

Search and AI search now power many product categories:

- **Web search**: Google, Bing, Brave, DuckDuckGo, Yandex, Baidu, plus AI-native entrants.
- **Enterprise search**: [Glean](/wiki/glean), Coveo, Algolia, Elastic, AWS Kendra.
- **E-commerce**: Algolia, Coveo, Bloomreach, Constructor, Klevu.
- **Code search**: Sourcegraph Cody, GitHub Copilot, Phind, Cursor.
- **Document Q&A**: Notion AI, Microsoft 365 [Copilot](/wiki/microsoft_copilot), Dropbox Dash, Box AI.
- **Customer support**: Intercom Fin, Zendesk AI, Ada, Decagon.
- **Legal research**: Westlaw Precision AI, Lexis+ AI, Harvey, Casetext (acquired by Thomson Reuters in 2023).
- **Scientific search**: Semantic Scholar, Elicit, Consensus, OpenEvidence.

## How does AI search affect web traffic and SEO?

AI answer engines are reshaping the economics of the web because the answer appears on the results page, so users often have no need to click through to source sites. A Pew Research Center study of US adults' browsing in March 2025 found that when a Google AI summary appeared, users clicked a traditional search result in only 8 percent of visits, compared with 15 percent of visits when no summary was shown, and clicked a link inside the AI summary itself in just 1 percent of visits. [22] Pew also found that users were more likely to end their session entirely after seeing an AI summary (26 percent versus 16 percent without one). Google disputed the study, calling its query set unrepresentative of real Search traffic. [22]

This decline of referral clicks, often called the zero-click search problem, has driven publisher lawsuits and licensing deals. The New York Times sued OpenAI and Microsoft in December 2023 over training-data use, several publishers blocked AI crawlers via robots.txt, and OpenAI signed content deals with Axel Springer, News Corp, the Financial Times, and others starting in 2023.
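
Blocking an AI crawler through robots.txt can be checked with Python's standard-library parser. The rules below are an illustrative sketch, not any specific publisher's file: `GPTBot` is the user-agent token OpenAI documents for its crawler, while `ExampleBrowserBot` and the `example.com` URL are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: deny OpenAI's crawler, allow everyone else.
robots_txt = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

gptbot_allowed = parser.can_fetch("GPTBot", "https://example.com/article")
other_allowed = parser.can_fetch("ExampleBrowserBot", "https://example.com/article")
```

Note that robots.txt is advisory: it only works when the crawler chooses to honor it.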

### What is generative engine optimization (GEO)?

Generative engine optimization (GEO) is the practice of structuring web content so that AI answer engines are more likely to cite it, the answer-engine successor to classical search engine optimization (SEO). The term was introduced in *GEO: Generative Engine Optimization*, a paper by Pranjal Aggarwal and colleagues at Princeton presented at ACM SIGKDD (KDD) in 2024. [21] The authors built a benchmark of 10,000 queries and reported that their methods "can boost visibility by up to 40% in generative engine responses." [21]

The paper found that the most effective tactics required only minimal content changes: adding relevant statistics, incorporating credible quotations, and citing authoritative sources produced the largest gains in how often a page was surfaced in a generative answer. [21] These findings have driven a fast-growing GEO industry in 2025 and 2026, paralleling the older SEO market and prompting traditional SEO firms to add answer-engine optimization (AEO) services.

## Modern landscape, 2024 to 2026

Between 2024 and 2026 the search market restructured rapidly. Google rolled out AI Overviews in the United States on May 14, 2024 and expanded to more than 100 countries in October 2024, reaching over 2 billion monthly users by July 2025. [17] OpenAI launched ChatGPT Search on October 31, 2024 and removed the login requirement in 2025, by which point ChatGPT was handling roughly 2.5 billion queries per day. [18] Microsoft folded Bing Chat into a broader [Copilot](/wiki/microsoft_copilot) brand. Perplexity reported 780 million queries in May 2025 and was valued at about 20 billion dollars in a September 2025 funding round. [20] Anthropic, OpenAI, and Google introduced deep research modes that combine multi-step browsing with [LLM](/wiki/large_language_model) synthesis. Agentic search, in which a model issues sub-queries, opens pages, and refines plans, became standard in flagship products.

Legal and economic disputes followed. The New York Times sued OpenAI and Microsoft in December 2023 over training data use. Several publishers blocked AI crawlers via robots.txt, and reports in 2024 and 2025, including the Pew Research Center study, found that AI Overviews and answer engines reduced click-through rates to source sites. [22]

## Limitations and concerns

AI search inherits issues from both classical retrieval and [large language models](/wiki/large_language_model):

- **[Hallucination](/wiki/hallucination)**: generated answers can be fluent but unsupported by the cited sources. Studies of Perplexity, Bing Copilot, and AI Overviews in 2023 to 2025 reported nontrivial rates of incorrect claims.
- **Citation accuracy**: a cited URL may not actually support the sentence attached to it.
- **Source quality**: AI engines can amplify low-quality or AI-generated content, especially on long-tail queries.
- **SEO arms race**: spam farms, generative SEO content, and prompt-injection attempts now target AI Overviews and answer engines.
- **Publisher economics**: zero-click answers shrink referral traffic, motivating lawsuits and licensing deals (OpenAI with Axel Springer, News Corp, the Financial Times, and others starting 2023).
- **Cost and latency**: LLM inference is far more expensive per query than classical ranking.
- **Evaluation**: scoring an open-ended generative answer is harder than judging a ranked list.

## See also

- [Retrieval-augmented generation](/wiki/retrieval_augmented_generation)
- [Vector database](/wiki/vector_database)
- [Semantic search](/wiki/semantic_search)
- [Information retrieval](/wiki/information_retrieval)
- [Perplexity](/wiki/perplexity)
- [Microsoft Copilot](/wiki/microsoft_copilot)
- [Search Engine ChatGPT Plugins](/wiki/search_engine_chatgpt_plugins)

## References

1. Page, L., Brin, S., Motwani, R., and Winograd, T. (1998). *The PageRank Citation Ranking: Bringing Order to the Web*. Stanford InfoLab. http://ilpubs.stanford.edu:8090/422/
2. Brin, S. and Page, L. (1998). *The Anatomy of a Large-Scale Hypertextual Web Search Engine*. Computer Networks and ISDN Systems. http://infolab.stanford.edu/~backrub/google.html
3. Burges, C. J. C. (2010). *From RankNet to LambdaRank to LambdaMART: An Overview*. Microsoft Research Technical Report MSR-TR-2010-82. https://www.microsoft.com/en-us/research/publication/from-ranknet-to-lambdarank-to-lambdamart-an-overview/
4. Nayak, P. (Google). *Understanding searches better than ever before*. October 25, 2019. https://blog.google/products/search/search-language-understanding-bert/
5. Karpukhin, V., Oguz, B., Min, S., Lewis, P., Wu, L., Edunov, S., Chen, D., and Yih, W. (2020). *Dense Passage Retrieval for Open-Domain Question Answering*. EMNLP 2020. https://arxiv.org/abs/2004.04906
6. Khattab, O. and Zaharia, M. (2020). *ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT*. SIGIR 2020. https://arxiv.org/abs/2004.12832
7. Lewis, P., Perez, E., Piktus, A., Petroni, F., Karpukhin, V., Goyal, N., Kuettler, H., Lewis, M., Yih, W., Rocktaeschel, T., Riedel, S., and Kiela, D. (2020). *Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks*. NeurIPS 2020. https://arxiv.org/abs/2005.11401
8. Malkov, Y. A. and Yashunin, D. A. (2018). *Approximate Nearest Neighbor Search Using Hierarchical Navigable Small World Graphs*. IEEE TPAMI. https://arxiv.org/abs/1603.09320
9. Johnson, J., Douze, M., and Jegou, H. (2017). *Billion-Scale Similarity Search with GPUs*. IEEE Transactions on Big Data. https://arxiv.org/abs/1702.08734
10. Formal, T., Piwowarski, B., and Clinchant, S. (2021). *SPLADE: Sparse Lexical and Expansion Model for First Stage Ranking*. SIGIR 2021. https://arxiv.org/abs/2107.05720
11. Nguyen, T., Rosenberg, M., Song, X., Gao, J., Tiwary, S., Majumder, R., and Deng, L. (2016). *MS MARCO: A Human Generated MAchine Reading COmprehension Dataset*. https://arxiv.org/abs/1611.09268
12. Thakur, N., Reimers, N., Rücklé, A., Srivastava, A., and Gurevych, I. (2021). *BEIR: A Heterogeneous Benchmark for Zero-shot Evaluation of Information Retrieval Models*. NeurIPS Datasets and Benchmarks. https://arxiv.org/abs/2104.08663
13. Mehdi, Y. (Microsoft). *Reinventing search with a new AI-powered Microsoft Bing and Edge, your copilot for the web*. February 7, 2023. https://blogs.microsoft.com/blog/2023/02/07/reinventing-search-with-a-new-ai-powered-microsoft-bing-and-edge-your-copilot-for-the-web/
14. OpenAI. *Introducing ChatGPT search*. October 31, 2024. https://openai.com/index/introducing-chatgpt-search/
15. Reid, L. (Google). *Generative AI in Search: Let Google do the searching for you*. May 14, 2024. https://blog.google/products/search/generative-ai-google-search-may-2024/
16. Perplexity AI. Company page. https://www.perplexity.ai/
17. Wiggers, K. (TechCrunch). *Google's AI Overviews have 2B monthly users, AI Mode 100M in the US and India*. July 23, 2025. https://techcrunch.com/2025/07/23/googles-ai-overviews-have-2b-monthly-users-ai-mode-100m-in-the-us-and-india/
18. TechCrunch / OpenAI usage reporting (2025): ChatGPT reached roughly 800 million weekly active users and about 2.5 billion queries per day in mid-2025. https://techcrunch.com/2026/02/27/chatgpt-reaches-900m-weekly-active-users/
19. Reid, L. (Google). *AI Mode in Google Search: Updates from Google I/O 2025*. May 2025. https://blog.google/products/search/google-search-ai-mode-update/
20. TechCrunch. *Perplexity reportedly raised $200M at $20B valuation*. September 10, 2025. https://techcrunch.com/2025/09/10/perplexity-reportedly-raised-200m-at-20b-valuation/
21. Aggarwal, P., Murahari, V., Rajpurohit, T., Kalyan, A., Narasimhan, K., and Deshpande, A. (2024). *GEO: Generative Engine Optimization*. Proceedings of the 30th ACM SIGKDD Conference (KDD '24). https://arxiv.org/abs/2311.09735
22. Pew Research Center. *Google users are less likely to click on links when an AI summary appears in the results*. July 22, 2025. https://www.pewresearch.org/short-reads/2025/07/22/google-users-are-less-likely-to-click-on-links-when-an-ai-summary-appears-in-the-results/

