Exa AI

Updated Jul 16, 2026

Exa AI (formerly Metaphor) is an artificial intelligence company that builds a search engine designed specifically for AI applications. Unlike traditional search engines that rely on keyword matching and are optimized for human users clicking on links, Exa uses embeddings-based neural search to retrieve web content by semantic meaning. The company provides a Search API, Contents API, Answer API, Find Similar API, and a product called Websets, all targeted at developers building AI agents, retrieval-augmented generation (RAG) pipelines, and research tools. Headquartered in San Francisco, California, Exa was co-founded in 2021 by Will Bryk and Jeff Wang.^[1] As of September 2025, the company has raised over $111 million in total funding at a valuation of $700 million.^[5]^[7]

History

Founding as Metaphor (2021)

Will Bryk and Jeff Wang met as roommates at Harvard University, where Bryk studied computer science and physics and Wang studied computer science and philosophy. While at Harvard, Wang ran a GPU cluster in his dorm room. The two first attempted to build a search engine using crowdsourcing to compete with Google, but later recognized that large language models and neural networks enabled a fundamentally more powerful approach.^[17]

After graduation, Bryk worked as a software engineer at Cresta, a machine learning startup, while Wang spent three years building data and web infrastructure at Plaid. When GPT-3 launched in 2020, the pair had a key insight: pretraining for large language models and indexing for search engines are remarkably similar processes, since both involve programs that process all the text on the internet and compress it into a better representation.^[16] This realization led them to found the company in 2021 under the name Metaphor Systems.^[17]

Metaphor was accepted into Y Combinator's Winter 2022 batch (YC W22).^[8] The company launched its first search engine in November 2022, coinciding with the release of ChatGPT, which accelerated demand from AI developers seeking programmatic access to web search.

Rebrand to Exa (January 2024)

In January 2024, the company rebranded from Metaphor to Exa.^[2] The new name was chosen to better reflect the company's mission to "organize the world's knowledge." Alongside the rebrand, Exa unveiled several new features, including the highlights capability, which allows users to extract relevant excerpts from webpages using a customizable embedding model.^[2] The company chose the .ai domain (exa.ai) to signal its belief that AI systems would become the dominant consumers of search engines in the future.

Technology

Neural Search and Embeddings

Exa's core technology is built on embeddings-based retrieval rather than traditional keyword matching. While conventional search engines like Google preprocess documents into keyword indexes and use algorithms like PageRank to rank results, Exa trains specialized transformer models to convert each webpage into a numerical vector (an embedding) that captures the semantic meaning of the content. These embeddings encode deep information about a document, including its arguments, writing style, topics, and even keywords, in a way that is more expressive than text-based indexing.^[13]

When a user submits a query, Exa converts the query into an embedding using the same model and then performs a nearest-neighbor search in the embedding space to find the most semantically relevant documents. This approach is sometimes described as "next-link prediction," because the model is trained to predict which web links are most relevant based on semantic meaning rather than direct word matches. The result is that Exa can handle complex, natural-language queries, including entire paragraphs or documents, and return results that match the intent of the query even when the relevant pages do not contain the exact words used.^[13]
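The retrieval step described above can be illustrated with a toy example. The sketch below is not Exa's implementation: the three-dimensional vectors, the document names, and the brute-force scan are all hypothetical stand-ins, chosen only to show how a query embedding is compared against stored document embeddings by cosine similarity.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_neighbors(query_vec, index, k=2):
    """Return the k document ids whose embeddings are closest to the query."""
    scored = [(cosine_similarity(query_vec, vec), doc_id) for doc_id, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]

# Hypothetical 3-dimensional embeddings; real models use far more dimensions.
index = {
    "gpu-cluster-guide": [0.9, 0.1, 0.0],
    "sourdough-recipe": [0.0, 0.2, 0.9],
    "h200-benchmarks": [0.8, 0.3, 0.1],
}
query = [1.0, 0.2, 0.0]  # stands in for the embedded query text
top = nearest_neighbors(query, index, k=2)
print(top)  # ['gpu-cluster-guide', 'h200-benchmarks']
```

The two hardware-related documents rank first even though no words are compared at all, which is the property the article attributes to embedding-based search. A production system would replace the linear scan with an approximate index.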

Exa describes itself as "the first web-scale neural search engine" that uses end-to-end transformer technology (the same class of architecture behind GPT and ChatGPT) to filter by meaning rather than keywords.^[9]

Crawling and Indexing

Exa operates its own web crawling infrastructure separate from any third-party search index. The crawling system continuously discovers new URLs, crawls them across a distributed network of machines and IP addresses, runs each document through a custom-built HTML parser, and stores the results in Amazon S3. As of 2025, Exa crawls and parses tens of billions of webpages and refreshes its index every minute.^[13]

The company prioritizes indexing high-quality content rather than attempting to index the entire web. After gathering billions of documents, Exa preprocesses each document for retrieval by running it through its transformer-based embedding model. This preprocessing step converts each page into one or more embeddings that are stored in Exa's custom-built vector database.

Custom Vector Database

Because no existing vector database met Exa's requirements for cost, throughput, and variable compute per query, the company built its own from scratch. The system is optimized for serving many queries per second at low cost while allowing variable amounts of compute to be allocated per query (for example, spending more compute on harder searches). The engineering team optimized the database at every level, from high-level clustering algorithms and lexical compression techniques down to low-level assembly operations. Key components of the vector database were implemented in Rust for performance.^[12]
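One common way to make compute per query adjustable in a clustered vector index is to vary how many clusters are scanned. The sketch below uses that generic inverted-file (IVF) style idea purely as an illustration; it is an assumption, not a description of Exa's database, and the `n_probe` parameter, centroids, and points are all hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def search_clusters(query, centroids, clusters, n_probe, k=1):
    """Scan only the n_probe clusters nearest the query.

    A larger n_probe examines more candidates (more compute) and is less
    likely to miss the true nearest neighbor.
    """
    order = sorted(range(len(centroids)), key=lambda i: squared_distance(query, centroids[i]))
    candidates = []
    for cluster_id in order[:n_probe]:
        candidates.extend(clusters[cluster_id])
    candidates.sort(key=lambda item: squared_distance(query, item[1]))
    return [doc_id for doc_id, _ in candidates[:k]], len(candidates)

centroids = [[0.0, 0.0], [10.0, 10.0]]
clusters = [
    [("a", [1.0, 1.0]), ("b", [3.0, 3.0])],    # points assigned to centroid 0
    [("c", [6.0, 6.0]), ("d", [11.0, 11.0])],  # points assigned to centroid 1
]
query = [4.9, 4.9]
cheap = search_clusters(query, centroids, clusters, n_probe=1)
thorough = search_clusters(query, centroids, clusters, n_probe=2)
print(cheap)     # (['b'], 2)
print(thorough)  # (['c'], 4)
```

With `n_probe=1` the search inspects two vectors and returns `b`, while `n_probe=2` inspects four and finds the closer point `c` in the neighboring cluster. That trade-off between work done and result quality is the kind of knob the paragraph above describes.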

The Exacluster

Exa operates a dedicated GPU cluster called the Exacluster, which powers both model training and inference. The cluster consists of 144 NVIDIA H200 GPUs spread across 18 servers with eight GPUs each.^[10]^[11] Its hardware specifications include:

Component	Specification
GPUs	144 NVIDIA H200
GPU Memory	20 TB HBM3E (141 GB per GPU)
Compute Performance	~570 PetaTOPS combined
CPU Cores	3,456 cores (192 per server, consistent with two 96-core processors each)
System Memory	36 TB DDR5
Storage	270 TB NVMe SSD

The Exacluster was one of the first clusters in the industry built on NVIDIA's H200 Hopper GPUs.^[10]^[11] The company used the cluster to pretrain and fine-tune its embedding model over a one-month training run, incorporating embedding techniques the team developed over the preceding six months.^[10] With the proceeds from its Series B funding round, Exa plans to expand the Exacluster by 5x.^[5]
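The headline figures in the specification table can be cross-checked with simple arithmetic. The per-server values below assume that CPU cores, memory, and storage are spread evenly across the 18 servers, which the cited sources do not state explicitly.

```python
SERVERS = 18
GPUS_PER_SERVER = 8
HBM_PER_GPU_GB = 141

total_gpus = SERVERS * GPUS_PER_SERVER      # 144
total_hbm_gb = total_gpus * HBM_PER_GPU_GB  # 20304 GB, roughly 20 TB

# Per-server figures, assuming an even split (an assumption of this sketch).
cores_per_server = 3456 // SERVERS          # 192
ram_per_server_tb = 36 / SERVERS            # 2.0
nvme_per_server_tb = 270 / SERVERS          # 15.0

print(total_gpus, total_hbm_gb, cores_per_server, ram_per_server_tb, nvme_per_server_tb)
```

The GPU count and the 20 TB memory total both follow from 18 servers with eight 141 GB GPUs each.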

The Bitter Lesson Applied to Search

Exa's technical philosophy draws on Rich Sutton's "The Bitter Lesson," which argues that methods leveraging increased computation tend to outperform hand-engineered approaches over time. The company argues that a keyword-based retrieval system such as Google's cannot fundamentally improve with additional compute, whereas its own embedding-based approach scales with computational resources: training larger models on more data with more GPUs directly improves retrieval quality. This is a core reason the company invests heavily in GPU infrastructure.^[16]

Products and API

Search API

The Search API is Exa's primary product. It accepts a natural-language query and returns a ranked list of relevant web results. The API supports several search types:

Search Type	Description	Latency (P50)
Auto (default)	Intelligently combines neural and other search methods for optimal results	Varies
Neural	Pure embeddings-based semantic search	~1.2 seconds
Fast	Streamlined search models optimized for speed	Sub-350 ms
Deep	Agentic search that iteratively retrieves and processes results for maximum quality	~3.5 seconds
Deep Reasoning	Base deep search with reasoning capabilities	Varies
Instant	Lowest latency search optimized for real-time applications	Sub-200 ms

The company describes Exa Fast as the fastest search API available, with sub-350 ms end-to-end P50 latency, which it claims is 30% faster than the next-fastest competitor.^[12] Exa Deep, at the other end of the spectrum, agentically searches, processes, and re-searches until it finds the highest-quality information, making it suitable for research-intensive applications where latency is less critical.

The Search API also supports advanced filters for domain, date range, semantic category (such as "company," "research paper," or "news"), and content type. Developers can pass very long queries, including entire paragraphs, to find semantically similar content.
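A filtered search request might be assembled along the following lines. This is a sketch only: the helper `build_search_payload` is hypothetical, and the field names (`type`, `includeDomains`, `startPublishedDate`, `category`) and the date format are assumptions that should be checked against Exa's current API reference before use. The code builds a request body and sends nothing.

```python
import json
from datetime import date, timedelta

def build_search_payload(query, search_type="auto", domains=None,
                         days_back=None, category=None, today=None):
    """Assemble a JSON-serializable search body (field names are assumptions)."""
    payload = {"query": query, "type": search_type}
    if domains:
        payload["includeDomains"] = list(domains)
    if days_back is not None:
        today = today or date.today()
        # Date-range filter: only pages published within the last `days_back` days.
        payload["startPublishedDate"] = (today - timedelta(days=days_back)).isoformat()
    if category:
        payload["category"] = category
    return payload

payload = build_search_payload(
    "papers arguing that retrieval quality scales with compute",
    domains=["arxiv.org"],
    days_back=30,
    category="research paper",
    today=date(2026, 3, 31),  # fixed date so the example is reproducible
)
print(json.dumps(payload, indent=2))
```

Because the query is matched by meaning, it can be phrased as a full sentence or even a paragraph rather than a list of keywords.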

Contents API

The Contents API retrieves the full, cleaned HTML or text content of webpages returned by the Search API. It is useful for applications that need to read and process the actual content of search results, such as RAG pipelines that feed retrieved documents to a large language model.

A highlights feature extracts the most relevant excerpts from pages for a given query. Highlights use a paragraph prediction model to chunk and embed full webpages, then return the most semantically relevant sections. This feature can reduce token usage and LLM costs by over 50% compared to passing entire pages to a model.
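The chunk-then-select idea behind highlights can be sketched in a few lines. The example below is a simplification under stated assumptions: it splits on blank lines instead of using a paragraph prediction model, and it scores chunks by word overlap (Jaccard similarity) as a stand-in for embedding similarity. The function names and sample page are hypothetical.

```python
def tokenize(text):
    """Lowercase a text and return its set of words, stripped of punctuation."""
    return {word.strip(".,;:!?").lower() for word in text.split()}

def top_highlights(page_text, query, k=1):
    """Split a page into paragraphs and keep the k that best match the query."""
    paragraphs = [p.strip() for p in page_text.split("\n\n") if p.strip()]
    query_terms = tokenize(query)

    def score(paragraph):
        terms = tokenize(paragraph)
        return len(terms & query_terms) / len(terms | query_terms)  # Jaccard overlap

    return sorted(paragraphs, key=score, reverse=True)[:k]

page = (
    "Our company was founded in a garage.\n\n"
    "The cluster uses 144 H200 GPUs for training embedding models.\n\n"
    "Contact us for careers and press inquiries."
)
highlights = top_highlights(page, "how many GPUs does the cluster use", k=1)
print(highlights)  # ['The cluster uses 144 H200 GPUs for training embedding models.']
```

Only the selected excerpt would be passed to the language model, which is how a highlights step can shrink the prompt relative to sending the whole page.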

As of March 2026, contents for 10 search results per request are included for free with every Search API call.^[15]

Find Similar API

The Find Similar API accepts a URL and returns a list of pages that are semantically similar to the provided page. This is useful for content discovery, competitive analysis, and recommendation systems. Because Exa's search is based on embeddings, the similarity comparison captures meaning and topic rather than surface-level keyword overlap.

Answer API

The Answer API combines Exa's search capabilities with large language model generation. Given a question, it performs an Exa search, retrieves relevant content, and uses an LLM to generate either a direct answer (for factual queries) or a detailed summary with citations (for open-ended queries). The endpoint supports streaming, returning tokens as they are generated.

Exa Research

Launched in June 2025, Exa Research is an agentic search product that automates iterative querying, reading, clustering, and summarization. Unlike the standard Search API, which returns a list of relevant documents, Exa Research outputs structured insight summaries clustered by topic or theme, often with quote extraction and citation traceability. The Research endpoint achieves 94.9% accuracy on the SimpleQA benchmark.^[18]

Websets

Websets is Exa's high-compute search product designed for exhaustive, long-horizon information retrieval. Users describe what they are looking for in plain English, and Websets agentically sources results and enriches them with custom data columns.^[19] Key features include:

Retrieval of hundreds or thousands of results per query
Custom enrichment columns using AI prompts (such as verified emails, company details, recent news)
Structured output with criteria columns and pass/fail results for each row
CSV export and API integration with CRM tools, Clay, and outbound sequencing workflows
Asynchronous batch-style execution, with queries taking seconds to minutes depending on complexity

Websets is particularly suited for use cases such as lead generation, market mapping, curated dataset creation, and enterprise research workflows.
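The structured output described above, with one pass/fail cell per criterion, can be pictured as a small table that is then exported to CSV. The rows, criterion names, and `to_csv` helper below are hypothetical and do not reflect the actual Websets export format.

```python
import csv
import io

# Hypothetical rows: each candidate result carries one boolean per criterion.
rows = [
    {"company": "Acme Robotics", "is_series_a": True, "based_in_eu": True},
    {"company": "Globex", "is_series_a": True, "based_in_eu": False},
]
criteria = ["is_series_a", "based_in_eu"]

def to_csv(rows, criteria):
    """Render rows as CSV with a pass/fail cell per criterion and an overall verdict."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["company", *criteria, "matches_all"])
    for row in rows:
        cells = ["pass" if row[c] else "fail" for c in criteria]
        verdict = "pass" if all(row[c] for c in criteria) else "fail"
        writer.writerow([row["company"], *cells, verdict])
    return buffer.getvalue()

print(to_csv(rows, criteria))
```

Keeping a verdict per criterion, rather than a single score, lets a user see why a given row was included or excluded.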

Specialized Search Indexes

Exa maintains several specialized search indexes beyond its general web index:

Index	Coverage	Update Frequency
People Search	1B+ LinkedIn profiles	50M+ updates per week
Company Search	70M+ companies	Continuous
Code Search	Billions of GitHub repos, docs, Stack Overflow posts	Continuous

The People Search index allows queries like "VP of Product at Microsoft" or "enterprise sales reps from Microsoft in EMEA," returning results that can be programmatically enriched with profile data for sales, recruiting, and market research workflows. The Company Search index provides structured attributes including industry, geography, employee count, and funding data.

Developer Experience

SDKs and Installation

Exa provides official SDKs for Python and JavaScript/TypeScript:

Language	Package	Install Command
Python	exa-py	`pip install exa-py`
JavaScript/TypeScript	exa-js	`npm install exa-js`

Both SDKs provide access to all Exa endpoints (`search`, `find_similar`, `get_contents`, `answer`, and `streaming_answer`) with async support. The Python SDK ships with type hints, and the JavaScript SDK with TypeScript type definitions.

MCP Server

Exa provides a Model Context Protocol (MCP) server that connects AI assistants to Exa's search capabilities. The MCP server supports Claude Desktop, Cursor, VS Code, and over 10 other AI assistants.^[20] Available tools through the MCP server include:

web_search_exa for general web search
get_code_context_exa for code-related searches
crawling_exa for web crawling
company_research_exa for company research
linkedin_search_exa for people search
deep_researcher_start and deep_researcher_check for agentic deep research

The MCP server can be installed globally with `npm install -g exa-mcp-server` or used directly with `npx mcp-remote https://mcp.exa.ai/mcp`.
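Many MCP clients are configured through a JSON file that maps a server name to the command used to launch it. The sketch below generates such an entry for the remote command given above. The `mcpServers` key follows the convention used by Claude Desktop's configuration file, the server label `exa` is arbitrary, and other clients may expect a different layout, so the result should be checked against the client's own documentation.

```python
import json

# "mcpServers" is the top-level key Claude Desktop reads from its config file;
# the label "exa" is an arbitrary name chosen for this sketch.
config = {
    "mcpServers": {
        "exa": {
            "command": "npx",
            "args": ["mcp-remote", "https://mcp.exa.ai/mcp"],
        }
    }
}
print(json.dumps(config, indent=2))
```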

API Authentication

Developers obtain an API key from the Exa Dashboard and pass it as a bearer token with each request. The API supports both REST endpoints and the official SDKs.
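Bearer-token authentication over REST can be sketched with the Python standard library alone. The endpoint URL `https://api.exa.ai/search`, the environment variable name `EXA_API_KEY`, and the request body are assumptions made for illustration. The code builds the request without sending it.

```python
import json
import os
import urllib.request

def build_request(api_key, payload, url="https://api.exa.ai/search"):
    """Build (but do not send) a POST request carrying the key as a bearer token."""
    return urllib.request.Request(
        url,  # assumed endpoint; confirm against the API reference
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# EXA_API_KEY is a hypothetical variable name; the fallback is a dummy value.
request = build_request(os.environ.get("EXA_API_KEY", "test-key"), {"query": "neural search"})
print(request.get_method(), request.full_url)
# Sending the request would be: urllib.request.urlopen(request)
```

Reading the key from the environment rather than hard-coding it keeps the secret out of source control.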

Pricing

Exa offers a free tier and several paid plans.^[14] The pricing structure was simplified in March 2026.^[15]

API Pricing (as of March 2026)

Item	Price
Search (includes 10 results with contents)	$7 per 1,000 requests
Additional results beyond 10	$1 per 1,000 results
Summaries	$1 per 1,000 summaries
Exa Deep	20% discount from prior pricing
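The listed rates can be turned into a rough cost estimate. The sketch below assumes strictly linear pricing with no volume discounts, treats every result beyond the tenth as billable at the additional-results rate, and does not model Exa Deep. The function name and the workload figures are hypothetical.

```python
from decimal import Decimal

SEARCH_PER_1K = Decimal("7")        # dollars per 1,000 search requests
EXTRA_RESULT_PER_1K = Decimal("1")  # dollars per 1,000 results beyond the included 10
SUMMARY_PER_1K = Decimal("1")       # dollars per 1,000 summaries
INCLUDED_RESULTS = 10

def estimate_cost(requests, results_per_request=10, summaries=0):
    """Estimate a bill in US dollars from the listed per-1,000 rates."""
    extra_results = max(results_per_request - INCLUDED_RESULTS, 0) * requests
    return (
        SEARCH_PER_1K * requests
        + EXTRA_RESULT_PER_1K * extra_results
        + SUMMARY_PER_1K * summaries
    ) / 1000

cost = estimate_cost(requests=50_000, results_per_request=25, summaries=20_000)
print(cost)  # 1120
```

In this hypothetical workload the 15 extra results per request cost more ($750) than the searches themselves ($350), so the number of results requested per call has a large effect on the total.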

Plans

Plan	Monthly Price	Credits	Key Features
Free	$0	$10 in credits (no expiration)	No credit card required, approximately 2,000 searches
Starter	$49	8,000 credits	1 seat, up to 100 results per Webset, 10 enrichment columns, 2 concurrent searches
Pro	$449	100,000 credits	10 seats, up to 1,000 results per Webset, 50 enrichment columns, 10 concurrent searches
Enterprise	Custom	Custom	Unlimited results, seats, and enrichment columns; volume discounts; enterprise-grade security and support

Exa does not serve ads. All revenue comes from API sales and Websets subscriptions, which the company says ensures the search engine is optimized for quality and relevance rather than advertising metrics.

Use Cases

Retrieval-Augmented Generation (RAG)

One of Exa's primary use cases is serving as the retrieval layer in RAG pipelines. In a typical RAG architecture, a large language model receives relevant documents retrieved from an external source before generating a response. Exa positions its semantic search as providing higher-quality, more relevant source documents than keyword-based retrieval, which reduces hallucination risk by grounding the model's output in retrieved sources rather than parametric memory alone.
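The step in which retrieved documents are handed to the model can be sketched as prompt assembly. The example below is generic and does not call any Exa or LLM API: the document fields (`title`, `url`, `text`), the prompt wording, and the sample data are hypothetical.

```python
def build_rag_prompt(question, documents):
    """Number each retrieved document so the model can cite it as [n]."""
    sources = "\n".join(
        f"[{i}] {doc['title']} ({doc['url']})\n{doc['text']}"
        for i, doc in enumerate(documents, start=1)
    )
    return (
        "Answer using only the sources below and cite them as [n].\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {question}"
    )

# Hypothetical search results; in practice these come from the retrieval step.
documents = [
    {"title": "Cluster specs", "url": "https://example.com/specs",
     "text": "The cluster has 144 GPUs."},
    {"title": "Training notes", "url": "https://example.com/notes",
     "text": "Training took one month."},
]
prompt = build_rag_prompt("How many GPUs does the cluster have?", documents)
print(prompt)
```

Numbering the sources gives the model a way to attribute each claim, which makes the generated answer easier to check against the retrieved text.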

AI Agents

Exa is widely used as a search tool within AI agent frameworks. When an AI agent needs to look up information from the web as part of a multi-step reasoning process, it can call the Exa Search API to get semantically relevant results. Exa's low-latency Fast endpoint is particularly suited for agentic workflows where the agent may need to make many search calls in rapid succession.

Company and Market Research

Using Exa's Company Search index and Websets product, users can obtain structured research on organizations, including business models, competitors, recent funding, and key personnel. Private equity firms, consulting companies, and venture capital firms use Exa for deal sourcing and market mapping.

Sales and Lead Generation

The People Search index and Websets enrichment capabilities support sales prospecting workflows. Users can search for professionals by role, company, geography, and skills, then enrich results with verified emails and company details for outbound outreach.

News Monitoring and Summarization

Developers use Exa's Search API with date filters to retrieve recent news articles on specific topics. Combined with the Answer API or an external LLM, this enables automated news monitoring and summarization applications.

Academic and Scientific Research

Exa's semantic search is useful for finding research papers and academic sources. The ability to search by meaning rather than exact keywords helps researchers discover relevant papers that use different terminology for similar concepts.

Notable Customers

Exa serves thousands of companies and developers. Notable customers include:

Customer	Use Case
Cursor	Code search across millions of GitHub repos, docs, and Stack Overflow for AI-powered code editing
Databricks	Research and information retrieval
Amazon Web Services	Enterprise search integration
Vercel	Developer tooling
Notion	AI agents using Exa's web index
Point72	Financial data research using Exa's index of 70M+ companies
HubSpot	Monitoring updates across 1B+ people and companies

Funding and Valuation

Exa has raised a total of over $111 million across three funding rounds.^[22]

Date	Round	Amount	Valuation	Lead Investor	Other Investors
2022	Seed	~$5M	Undisclosed	Undisclosed	Y Combinator
July 2024	Series A	$17M ($22M cumulative)^[4]	Undisclosed	Lightspeed Venture Partners	NVentures (NVIDIA), Y Combinator
September 2025	Series B	$85M ($111M cumulative)^[5]	$700M^[7]	Benchmark	Lightspeed, Y Combinator, NVentures (NVIDIA)

Peter Fenton of Benchmark joined Exa's board following the Series B round.^[5] The company's Series A was notable for including NVentures, NVIDIA's venture capital arm, as an investor, reflecting the strategic importance of GPU infrastructure to Exa's business.^[4]

Exa is advised by researchers from OpenAI, Google, and Bing.

Founders and Leadership

Name	Role	Background
Will Bryk	CEO and Co-founder	Studied CS and physics at Harvard. Former software engineer at Cresta (ML startup). Grew up in New York City.
Jeff Wang	Co-founder	Studied CS and philosophy at Harvard. Spent three years building data and web infrastructure at Plaid. Ran a GPU cluster in his Harvard dorm room.

As of 2025, Exa has 30 or more employees across engineering, product, and operations, all based in San Francisco.^[22]

Competitive Landscape

Exa operates in the growing market for AI-native search APIs. Its competitors include both other AI search startups and traditional search API providers.^[21]

Competitor	Approach	Strengths	Pricing
Tavily	AI-ranked search aggregation	Quick answer synthesis, transparent RAG pricing	~$0.008 per request
SerpAPI	Wrapper around Google and 20+ search engines	Broadest engine coverage, structured SERP data	~$0.015 per search
Perplexity Sonar API	LLM-generated answers with citations	End-to-end answer generation with sources	$1-$15 per 1M tokens
Brave Search API	Independent index with privacy focus	30B+ page index, no tracking, low cost	$3 per 1,000 queries
Google Custom Search API	Google's search index	Largest web index	$5 per 1,000 queries (first 100/day free)
You.com	Multi-modal AI search	Image search, privacy features	Varies
Firecrawl	Web scraping and content extraction	Deep extraction, LangChain/LlamaIndex adapters	Varies

Exa differentiates itself from these competitors in several ways. Unlike SerpAPI and Google Custom Search, which are wrappers around existing keyword-based search engines, Exa runs its own neural search index. Unlike Perplexity's Sonar API, which returns generated answers, Exa returns raw search results and webpage content, giving developers more control over how results are processed. Unlike Tavily, which aggregates content from multiple sources, Exa performs semantic search over its own proprietary index.^[21]

Analysts expect the AI search API market to consolidate around three to five major players, with Exa, Tavily, and Perplexity among the likely survivors.^[21]

Exa API 2.0

In mid-2025, Exa announced version 2.0 of its search endpoints, representing a major upgrade to the platform's capabilities. The three headline features were Exa Fast, Exa Auto, and Exa Deep.^[12]

To build Exa 2.0, the company significantly expanded its web index to tens of billions of pages with minute-level refresh cycles. The embedding model was retrained from scratch on the 144-GPU H200 Exacluster over a one-month period. The vector database, written in Rust, was upgraded with new clustering algorithms, lexical compression, and assembly-level optimizations.^[12]

Exa 2.0 demonstrated strong performance on multiple benchmarks, including SimpleQA and Frames. The company evaluates its search APIs within a RAG framework using GPT-4 as the RAG model and GPT-4o-mini for grading, ensuring consistent evaluation methodology across tests.^[12]

References

1. Exa. "About Exa." https://exa.ai/about
2. Exa Blog. "Announcing Exa: The AI Search Engine with Semantic Search Technology." January 2024. https://exa.ai/blog/announcing-exa
3. Exa Blog. "Exa Announces Series A Funding for AI Search Technology Development." July 2024. https://exa.ai/blog/series-a
4. TechCrunch. "Exa raises $17M from Lightspeed, Nvidia, Y Combinator to build a Google for AIs." July 16, 2024. https://techcrunch.com/2024/07/16/exa-raises-17m-lightspeed-nvidia-ycombinator-google-ai-models/
5. Exa Blog. "Exa Raises $85M to Build the Search Engine for AIs." September 2025. https://exa.ai/blog/announcing-series-b
6. Built In San Francisco. "Exa Raises $85M Series B to Enable High-Quality Search for AIs." September 2025. https://www.builtinsf.com/articles/exa-raises-85m-series-b-20250908
7. Tech Funding News. "The next Perplexity? Exa raises $85M at $700M valuation to build the 'search engine for AI.'" 2025. https://techfundingnews.com/san-franciscos-exa-raises-85m-at-700m-valuation-to-build-the-search-engine-for-ai/
8. Y Combinator. "Exa: Web search rebuilt for LLMs." https://www.ycombinator.com/companies/exa
9. Exa Blog. "How We're Building the Next Generation of Search with Semantic Search Technology." https://exa.ai/blog/how-to-build-nextgen-search
10. Exa Blog. "The Exacluster: Powering Our Neural Network Search Engine." https://exa.ai/blog/meet-the-exacluster
11. Tom's Hardware. "Exacluster reveals one of the industry's first clusters based on Nvidia's H200 Hopper GPUs for AI and HPC." https://www.tomshardware.com/tech-industry/artificial-intelligence/nvidia-hopper-based-100kw-cluster-deploys-with-144-h200-gpus-exacluster-features-192-96-core-cpus-36tb-ddr5-ram-and-270tb-of-nvme-storage
12. Exa Blog. "Exa API 2.0." 2025. https://exa.ai/blog/exa-api-2-0
13. Exa Documentation. "How Exa Search Works." https://docs.exa.ai/reference/how-exa-search-works
14. Exa. "Pricing." https://exa.ai/pricing
15. Exa Documentation. "Pricing Update." March 2026. https://exa.ai/docs/changelog/pricing-update
16. Latent Space Podcast. "Beating Google at Search with Neural PageRank and $5M of H200s, with Will Bryk of Exa.ai." https://www.latent.space/p/exa
17. Cerebral Valley. "Exa (prev. Metaphor) aims to reshape web search." https://cerebralvalley.ai/blog/exa-aims-to-reshape-web-search-4AOjbOWK9sxxXCKdjQG1EK
18. Exa Blog. "Introducing Exa Research: Agentic Web Research Agents." June 2025. https://exa.ai/blog/introducing-exa-research
19. Exa. "Websets." https://exa.ai/websets
20. Exa. "Exa MCP Server." https://exa.ai/mcp
21. DEV Community. "Best SERP API Comparison 2025: SerpAPI vs Exa vs Tavily vs ScrapingDog vs ScrapingBee." https://dev.to/ritza/best-serp-api-comparison-2025-serpapi-vs-exa-vs-tavily-vs-scrapingdog-vs-scrapingbee-2jci
22. Sacra. "Exa revenue, valuation & funding." https://sacra.com/c/exa/
