Exa AI (formerly Metaphor) is an artificial intelligence company that builds a search engine designed specifically for AI applications. Unlike traditional search engines that rely on keyword matching and are optimized for human users clicking on links, Exa uses embeddings-based neural search to retrieve web content by semantic meaning. The company provides a Search API, Contents API, Answer API, Find Similar API, and a product called Websets, all targeted at developers building AI agents, retrieval-augmented generation (RAG) pipelines, and research tools. Headquartered in San Francisco, California, Exa was co-founded in 2021 by Will Bryk and Jeff Wang. As of September 2025, the company has raised over $111 million in total funding at a valuation of $700 million.
Will Bryk and Jeff Wang met as roommates at Harvard University, where Bryk studied computer science and physics and Wang studied computer science and philosophy. While at Harvard, Wang ran a GPU cluster in his dorm room. The two first attempted to build a search engine using crowdsourcing to compete with Google, but later recognized that large language models and neural networks enabled a fundamentally more powerful approach.
After graduation, Bryk worked as a software engineer at Cresta, a machine learning startup, while Wang spent three years building data and web infrastructure at Plaid. When GPT-3 launched in 2020, the pair had a key insight: pretraining for large language models and indexing for search engines are remarkably similar processes, as both involve code looking at all the text on the internet and compressing it into a better representation. This realization led them to found the company in 2021 under the name Metaphor Systems.
Metaphor was accepted into Y Combinator's Winter 2022 batch (YC W22). The company launched its first search engine in November 2022, coinciding with the release of ChatGPT, which accelerated demand from AI developers seeking programmatic access to web search.
In January 2024, the company rebranded from Metaphor to Exa. The new name was chosen to better reflect the company's mission to "organize the world's knowledge." Alongside the rebrand, Exa unveiled several new features, including the highlights capability, which allows users to extract relevant excerpts from webpages using a customizable embedding model. The company chose the .ai domain (exa.ai) to signal its belief that AI systems would become the dominant consumers of search engines in the future.
Exa's core technology is built on embeddings-based retrieval rather than traditional keyword matching. While conventional search engines like Google preprocess documents into keyword indexes and use algorithms like PageRank to rank results, Exa trains specialized transformer models to convert each webpage into a numerical vector (an embedding) that captures the semantic meaning of the content. These embeddings encode deep information about a document, including its arguments, writing style, topics, and even keywords, in a way that is more expressive than text-based indexing.
When a user submits a query, Exa converts the query into an embedding using the same model and then performs a nearest-neighbor search in the embedding space to find the most semantically relevant documents. This approach is sometimes described as "next-link prediction," because the model is trained to predict which web links are most relevant based on semantic meaning rather than direct word matches. The result is that Exa can handle complex, natural-language queries, including entire paragraphs or documents, and return results that match the intent of the query even when the relevant pages do not contain the exact words used.
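The query-time step described above can be illustrated with a minimal sketch. It assumes toy three-dimensional vectors and cosine similarity as the distance measure; the document ids, the vectors, and the function names are hypothetical, and Exa's actual model and similarity metric are not specified in this article.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_neighbors(query_vec, index, k=2):
    """Return the k document ids whose embeddings are closest to the query."""
    scored = [(cosine_similarity(query_vec, vec), doc_id)
              for doc_id, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]

# Hypothetical embeddings; real models use hundreds or thousands of dimensions.
index = {
    "startup-funding-guide": [0.9, 0.1, 0.0],
    "sourdough-recipe": [0.0, 0.2, 0.9],
    "venture-capital-primer": [0.8, 0.3, 0.1],
}

# Stand-in for the embedding of "how do early-stage companies raise money".
query = [1.0, 0.2, 0.0]
top = nearest_neighbors(query, index, k=2)
print(top)
```

Neither of the two returned ids shares a word with the query text; they rank highest only because their vectors point in a similar direction, which is the property the paragraph above describes.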
Exa describes itself as "the first web-scale neural search engine" that uses end-to-end transformer technology (the same class of architecture behind GPT and ChatGPT) to filter by meaning rather than keywords.
Exa operates its own web crawling infrastructure separate from any third-party search index. The crawling system continuously discovers new URLs, crawls them across a distributed network of machines and IP addresses, runs each document through a custom-built HTML parser, and stores the results in Amazon S3. As of 2025, Exa crawls and parses tens of billions of webpages and refreshes its index every minute.
The company prioritizes indexing high-quality content rather than attempting to index the entire web. After gathering billions of documents, Exa preprocesses each document for retrieval by running it through its transformer-based embedding model. This preprocessing step converts each page into one or more embeddings that are stored in Exa's custom-built vector database.
Because no existing vector database met Exa's requirements for cost, throughput, and variable compute per query, the company built its own from scratch. The system is optimized for serving many queries per second at low cost while allowing variable amounts of compute to be allocated per query (for example, spending more compute on harder searches). The engineering team optimized the database at every level, from high-level clustering algorithms and lexical compression techniques down to low-level assembly operations. Key components of the vector database were implemented in Rust for performance.
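Exa has not published the internals of its database, so the following is only a generic sketch of how clustering can make compute per query adjustable. It uses a toy inverted-file layout (vectors grouped under their nearest centroid) with inner-product similarity; the class name, the `n_probe` parameter, and all vectors are assumptions chosen for illustration.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

class ClusteredIndex:
    """Toy inverted-file index: each vector is stored under its nearest centroid."""

    def __init__(self, centroids):
        self.centroids = centroids
        self.buckets = [[] for _ in centroids]

    def add(self, doc_id, vec):
        best = max(range(len(self.centroids)),
                   key=lambda i: dot(vec, self.centroids[i]))
        self.buckets[best].append((doc_id, vec))

    def search(self, query, k=1, n_probe=1):
        """Scan only the n_probe clusters closest to the query.

        Returns (top-k doc ids, number of vectors actually compared).
        """
        order = sorted(range(len(self.centroids)),
                       key=lambda i: dot(query, self.centroids[i]),
                       reverse=True)
        candidates = []
        for i in order[:n_probe]:
            candidates.extend(self.buckets[i])
        candidates.sort(key=lambda item: dot(query, item[1]), reverse=True)
        return [doc_id for doc_id, _ in candidates[:k]], len(candidates)

idx = ClusteredIndex(centroids=[[1.0, 0.0], [0.0, 1.0]])
for doc_id, vec in [("a", [0.9, 0.1]), ("b", [0.8, 0.3]),
                    ("c", [0.1, 0.9]), ("d", [0.55, 0.6])]:
    idx.add(doc_id, vec)

query = [0.7, 0.6]
fast = idx.search(query, k=1, n_probe=1)      # cheap: one cluster scanned
thorough = idx.search(query, k=1, n_probe=2)  # costlier: both clusters scanned
print(fast, thorough)
```

In this toy data the cheap search compares two vectors and returns `b`, while the thorough search compares all four and finds the slightly better match `d`, which sits just across the cluster boundary. That trade-off between scanned vectors and result quality is one common way a harder query can be given more compute.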
Exa operates a dedicated GPU cluster called the Exacluster, which powers both model training and inference. The cluster consists of 144 NVIDIA H200 GPUs spread across 18 servers with eight GPUs each. Its hardware specifications include:
| Component | Specification |
|---|---|
| GPUs | 144 NVIDIA H200 |
| GPU Memory | 20 TB HBM3E (141 GB per GPU) |
| Compute Performance | ~570 PetaTOPS combined |
| CPU Cores | 3,456 cores (36 x 96-core processors, 192 cores per server) |
| System Memory | 36 TB DDR5 |
| Storage | 270 TB NVMe SSD |
The Exacluster was one of the first clusters in the industry built on NVIDIA's H200 Hopper GPUs. The company used the cluster to pretrain and fine-tune its embedding model over a one-month training run, incorporating embedding techniques the team developed over the preceding six months. With the proceeds from its Series B funding round, Exa plans to expand the Exacluster by 5x.
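The headline figures for the cluster can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in this section and assumes decimal units (1 TB = 1,000 GB); the variable names are illustrative.

```python
servers = 18
gpus_per_server = 8
hbm_per_gpu_gb = 141   # HBM3E per H200, as listed in the table
total_cpu_cores = 3456

total_gpus = servers * gpus_per_server              # GPUs in the cluster
total_hbm_tb = total_gpus * hbm_per_gpu_gb / 1000   # aggregate GPU memory in TB
cores_per_server = total_cpu_cores // servers       # CPU cores in each server

print(total_gpus, round(total_hbm_tb, 1), cores_per_server)
```

The results (144 GPUs and roughly 20 TB of GPU memory) agree with the table.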
Exa's technical philosophy draws on Rich Sutton's "The Bitter Lesson," which argues that methods leveraging increased computation tend to outperform hand-engineered approaches over time. The company argues that keyword-based retrieval gains little from additional compute, whereas its embedding-based approach scales with computational resources: training larger models on more data with more GPUs directly improves retrieval quality. This is a core reason the company invests heavily in GPU infrastructure.
The Search API is Exa's primary product. It accepts a natural-language query and returns a ranked list of relevant web results. The API supports several search types:
| Search Type | Description | Latency (P50) |
|---|---|---|
| Auto (default) | Intelligently combines neural and other search methods for optimal results | Varies |
| Neural | Pure embeddings-based semantic search | ~1.2 seconds |
| Fast | Streamlined search models optimized for speed | Sub-350 ms |
| Deep | Agentic search that iteratively retrieves and processes results for maximum quality | ~3.5 seconds |
| Deep Reasoning | Deep search extended with reasoning capabilities | Varies |
| Instant | Lowest latency search optimized for real-time applications | Sub-200 ms |
Exa Fast is described as the fastest search API available, achieving sub-350 ms end-to-end P50 latency, which the company claims is 30% faster than the next fastest competitor. Exa Deep, on the other end of the spectrum, agentically searches, processes, and re-searches until it finds the highest quality information, making it suitable for research-intensive applications where latency is less critical.
The Search API also supports advanced filters for domain, date range, semantic category (such as "company," "research paper," or "news"), and content type. Developers can pass very long queries, including entire paragraphs, to find semantically similar content.
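As an illustration of how these filters might be combined, the sketch below assembles a JSON request body without sending it. The field names (`numResults`, `includeDomains`, `startPublishedDate`, `category`) and the helper function are assumptions made for this example and should be checked against Exa's API reference before use.

```python
import json
from datetime import date

def build_search_payload(query, search_type="auto", num_results=10,
                         include_domains=None, start_date=None, category=None):
    """Assemble a JSON-serializable search body, omitting filters that are unset."""
    payload = {"query": query, "type": search_type, "numResults": num_results}
    if include_domains:
        payload["includeDomains"] = list(include_domains)
    if start_date:
        payload["startPublishedDate"] = start_date.isoformat()
    if category:
        payload["category"] = category
    return payload

payload = build_search_payload(
    "recent work on sparse attention for long-context transformers",
    search_type="neural",
    include_domains=["arxiv.org"],
    start_date=date(2025, 1, 1),
    category="research paper",
)
body = json.dumps(payload)  # string that would be sent as the request body
print(body)
```

Leaving unset filters out of the body, rather than sending nulls, keeps the request minimal and lets the server apply its own defaults.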
The Contents API retrieves the full, cleaned HTML or text content of webpages returned by the Search API. It is useful for applications that need to read and process the actual content of search results, such as RAG pipelines that feed retrieved documents to a large language model.
A highlights feature extracts the most relevant excerpts from pages for a given query. Highlights use a paragraph prediction model to chunk and embed full webpages, then return the most semantically relevant sections. This feature can reduce token budgets and LLM costs by over 50% compared to passing entire pages to a model.
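The chunk-score-select idea behind highlights can be sketched as follows. This is not Exa's model: it splits on blank lines instead of using a paragraph prediction model, and it scores passages by word overlap as a crude stand-in for embedding similarity. All names and the sample page are hypothetical.

```python
import re

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def overlap_score(query, passage):
    """Fraction of query terms that appear in the passage."""
    q = set(tokenize(query))
    p = set(tokenize(passage))
    return len(q & p) / len(q) if q else 0.0

def highlights(page_text, query, k=1):
    """Split a page into paragraphs and return the k best-scoring ones."""
    paragraphs = [p.strip() for p in page_text.split("\n\n") if p.strip()]
    ranked = sorted(paragraphs, key=lambda p: overlap_score(query, p), reverse=True)
    return ranked[:k]

page = (
    "Our company was founded in a garage and has grown steadily since.\n\n"
    "The vector database stores embeddings and answers nearest-neighbor queries in milliseconds.\n\n"
    "Contact our sales team for enterprise pricing and support plans."
)

best = highlights(page, "how does the vector database answer queries", k=1)
# Share of the page's word tokens that no longer need to be sent to the LLM.
saved = 1 - len(tokenize(best[0])) / len(tokenize(page))
print(best[0], round(saved, 2))
```

On this toy page, passing only the selected paragraph drops well over half of the tokens, which shows the mechanism by which excerpting reduces LLM input cost; actual savings depend on the page and the query.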
As of March 2026, contents for 10 search results per request are included for free with every Search API call.
The Find Similar API accepts a URL and returns a list of pages that are semantically similar to the provided page. This is useful for content discovery, competitive analysis, and recommendation systems. Because Exa's search is based on embeddings, the similarity comparison captures meaning and topic rather than surface-level keyword overlap.
The Answer API combines Exa's search capabilities with large language model generation. Given a question, it performs an Exa search, retrieves relevant content, and uses an LLM to generate either a direct answer (for factual queries) or a detailed summary with citations (for open-ended queries). The endpoint supports streaming, returning tokens as they are generated.
Launched in June 2025, Exa Research is an agentic search product that automates iterative querying, reading, clustering, and summarization. Unlike the standard Search API, which returns a list of relevant documents, Exa Research outputs structured insight summaries clustered by topic or theme, often with quote extraction and citation traceability. The Research endpoint achieves 94.9% accuracy on the SimpleQA benchmark.
Websets is Exa's high-compute search product designed for exhaustive, long-horizon information retrieval. Users describe what they are looking for in plain English, and Websets agentically sources results and enriches them with custom data columns.
Websets is particularly suited for use cases such as lead generation, market mapping, curated dataset creation, and enterprise research workflows.
Exa maintains several specialized search indexes beyond its general web index:
| Index | Coverage | Update Frequency |
|---|---|---|
| People Search | 1B+ LinkedIn profiles | 50M+ updates per week |
| Company Search | 70M+ companies | Continuous |
| Code Search | Billions of GitHub repos, docs, Stack Overflow posts | Continuous |
The People Search index allows queries like "VP of Product at Microsoft" or "enterprise sales reps from Microsoft in EMEA," returning results that can be programmatically enriched with profile data for sales, recruiting, and market research workflows. The Company Search index provides structured attributes including industry, geography, employee count, and funding data.
Exa provides official SDKs for Python and JavaScript/TypeScript:
| Language | Package | Install Command |
|---|---|---|
| Python | exa-py | pip install exa-py |
| JavaScript/TypeScript | exa-js | npm install exa-js |
Both SDKs provide access to all Exa endpoints (search, find_similar, get_contents, answer, and streaming_answer) and support asynchronous use. The Python SDK ships with type hints, and the JavaScript SDK includes TypeScript type definitions.
Exa provides a Model Context Protocol (MCP) server that connects AI assistants to Exa's search capabilities. The MCP server supports Claude Desktop, Cursor, VS Code, and over 10 other AI assistants. Available tools through the MCP server include:
- web_search_exa for general web search
- get_code_context_exa for code-related searches
- crawling_exa for web crawling
- company_research_exa for company research
- linkedin_search_exa for people search
- deep_researcher_start and deep_researcher_check for agentic deep research

The MCP server can be installed globally with npm install -g exa-mcp-server or used directly with npx mcp-remote https://mcp.exa.ai/mcp.
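Many MCP clients, Claude Desktop among them, read server definitions from a JSON file with an `mcpServers` block. The sketch below generates such an entry for the remote command quoted above. The server label `exa` is an arbitrary choice, and the exact file location and schema depend on the client, so this is an illustration only.

```python
import json

config = {
    "mcpServers": {
        "exa": {  # hypothetical label; any name can be used
            "command": "npx",
            "args": ["mcp-remote", "https://mcp.exa.ai/mcp"],
        }
    }
}

config_json = json.dumps(config, indent=2)
print(config_json)  # paste the output into the client's MCP configuration file
```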
Developers obtain an API key from the Exa Dashboard and pass it as a bearer token with each request. The API supports both REST endpoints and the official SDKs.
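A minimal sketch of bearer-token authentication against a REST endpoint, using only the Python standard library. The endpoint URL and the single-field request body are assumptions for illustration; the request is constructed but not sent, so the example runs offline.

```python
import json
import urllib.request

API_URL = "https://api.exa.ai/search"  # assumed endpoint URL

def build_request(api_key, query):
    """Build (but do not send) an authenticated POST request."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

req = build_request("YOUR_API_KEY", "open-source vector databases written in Rust")
print(req.get_method(), req.full_url)
# urllib.request.urlopen(req) would send the request.
```

In practice the key would be read from an environment variable rather than written into source code.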
Exa offers a free tier and several paid plans. The pricing structure was simplified in March 2026.
| Item | Price |
|---|---|
| Search (includes 10 results with contents) | $7 per 1,000 requests |
| Additional results beyond 10 | $1 per 1,000 results |
| Summaries | $1 per 1,000 summaries |
| Exa Deep | 20% discount from prior pricing |
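The per-unit prices above can be turned into a rough cost estimate. The sketch assumes charges are prorated linearly per request, result, and summary, with no volume discounts or plan credits applied; the function name and the workload figures are hypothetical.

```python
from decimal import Decimal

SEARCH_PER_1K = Decimal("7")        # dollars per 1,000 search requests
EXTRA_RESULT_PER_1K = Decimal("1")  # dollars per 1,000 results beyond the first 10
SUMMARY_PER_1K = Decimal("1")       # dollars per 1,000 summaries
INCLUDED_RESULTS = 10

def estimate_cost(requests, results_per_request=10, summaries=0):
    """Estimated charge in dollars under linear proration."""
    extra_results = max(0, results_per_request - INCLUDED_RESULTS) * requests
    return (SEARCH_PER_1K * requests
            + EXTRA_RESULT_PER_1K * extra_results
            + SUMMARY_PER_1K * summaries) / 1000

# 50,000 searches, 25 results each, one summary per search.
cost = estimate_cost(requests=50_000, results_per_request=25, summaries=50_000)
print(cost)
```

For this workload the estimate is $350 for the searches, $750 for the 15 extra results per request, and $50 for summaries, or $1,150 in total, so additional results dominate the bill once requests go well beyond the included ten.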
| Plan | Monthly Price | Credits | Key Features |
|---|---|---|---|
| Free | $0 | $10 in credits (no expiration) | No credit card required, approximately 2,000 searches |
| Starter | $49 | 8,000 credits | 1 seat, up to 100 results per Webset, 10 enrichment columns, 2 concurrent searches |
| Pro | $449 | 100,000 credits | 10 seats, up to 1,000 results per Webset, 50 enrichment columns, 10 concurrent searches |
| Enterprise | Custom | Custom | Unlimited results, seats, and enrichment columns; volume discounts; enterprise-grade security and support |
Exa does not serve ads. All revenue comes from API sales and Websets subscriptions, which the company says ensures the search engine is optimized for quality and relevance rather than advertising metrics.
One of Exa's primary use cases is serving as the retrieval layer in RAG pipelines. In a typical RAG architecture, a large language model receives relevant documents retrieved from an external source before generating a response. Exa positions its semantic search as returning more relevant source documents than keyword-based retrieval, which can reduce hallucination risk by grounding the model in retrieved sources rather than parametric memory alone.
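The step where retrieved documents are handed to the model can be sketched as a prompt-assembly function. The document fields (`title`, `url`, `text`), the instruction wording, and the truncation limit are assumptions for this example, and the retrieval and generation calls themselves are left out.

```python
def build_rag_prompt(question, documents, max_chars_per_doc=500):
    """Format retrieved documents as numbered sources ahead of the question."""
    lines = ["Answer using only the sources below. Cite sources as [n].", ""]
    for n, doc in enumerate(documents, start=1):
        lines.append(f"[{n}] {doc['title']} ({doc['url']})")
        lines.append(doc["text"][:max_chars_per_doc])  # truncate to bound prompt size
        lines.append("")
    lines.append(f"Question: {question}")
    return "\n".join(lines)

# Placeholder documents standing in for search results with contents.
retrieved = [
    {"title": "Example page A", "url": "https://example.com/a",
     "text": "Page A explains the first half of the topic."},
    {"title": "Example page B", "url": "https://example.com/b",
     "text": "Page B covers the second half in more detail."},
]

prompt = build_rag_prompt("What does page B say?", retrieved)
print(prompt)
```

Numbering the sources gives the model a compact way to cite them, and truncating each document keeps the prompt within a predictable size.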
Exa is widely used as a search tool within AI agent frameworks. When an AI agent needs to look up information from the web as part of a multi-step reasoning process, it can call the Exa Search API to get semantically relevant results. Exa's low-latency Fast endpoint is particularly suited for agentic workflows where the agent may need to make many search calls in rapid succession.
Using Exa's Company Search index and Websets product, users can obtain structured research on organizations, including business models, competitors, recent funding, and key personnel. Private equity firms, consulting companies, and venture capital firms use Exa for deal sourcing and market mapping.
The People Search index and Websets enrichment capabilities support sales prospecting workflows. Users can search for professionals by role, company, geography, and skills, then enrich results with verified emails and company details for outbound outreach.
Developers use Exa's Search API with date filters to retrieve recent news articles on specific topics. Combined with the Answer API or an external LLM, this enables automated news monitoring and summarization applications.
Exa's semantic search is useful for finding research papers and academic sources. The ability to search by meaning rather than exact keywords helps researchers discover relevant papers that use different terminology for similar concepts.
Exa serves thousands of companies and developers. Notable customers include:
| Customer | Use Case |
|---|---|
| Cursor | Code search across millions of GitHub repos, docs, and Stack Overflow for AI-powered code editing |
| Databricks | Research and information retrieval |
| Amazon Web Services | Enterprise search integration |
| Vercel | Developer tooling |
| Notion | AI agents using Exa's web index |
| Point72 | Financial data research using Exa's index of 70M+ companies |
| HubSpot | Monitoring updates across 1B+ people and companies |
Exa has raised a total of over $111 million across three funding rounds.
| Date | Round | Amount | Valuation | Lead Investor | Other Investors |
|---|---|---|---|---|---|
| 2022 | Seed | ~$5M | Undisclosed | Undisclosed | Y Combinator |
| July 2024 | Series A | $17M ($22M cumulative) | Undisclosed | Lightspeed Venture Partners | NVentures (NVIDIA), Y Combinator |
| September 2025 | Series B | $85M ($111M cumulative) | $700M | Benchmark | Lightspeed, Y Combinator, NVentures (NVIDIA) |
Peter Fenton of Benchmark joined Exa's board following the Series B round. The company's Series A was notable for including NVentures, NVIDIA's venture capital arm, as an investor, reflecting the strategic importance of GPU infrastructure to Exa's business.
Exa is advised by researchers from OpenAI, Google, and Bing.
| Name | Role | Background |
|---|---|---|
| Will Bryk | CEO and Co-founder | Studied CS and physics at Harvard. Former software engineer at Cresta (ML startup). Grew up in New York City. |
| Jeff Wang | Co-founder | Studied CS and philosophy at Harvard. Spent three years building data and web infrastructure at Plaid. Ran a GPU cluster in his Harvard dorm room. |
As of 2025, Exa has at least 30 employees across engineering, product, and operations, all based in San Francisco.
Exa operates in the growing market for AI-native search APIs. Its competitors include both other AI search startups and traditional search API providers.
| Competitor | Approach | Strengths | Pricing |
|---|---|---|---|
| Tavily | AI-ranked search aggregation | Quick answer synthesis, transparent RAG pricing | ~$0.008 per request |
| SerpAPI | Wrapper around Google and 20+ search engines | Broadest engine coverage, structured SERP data | ~$0.015 per search |
| Perplexity Sonar API | LLM-generated answers with citations | End-to-end answer generation with sources | $1-$15 per 1M tokens |
| Brave Search API | Independent index with privacy focus | 30B+ page index, no tracking, low cost | $3 per 1,000 queries |
| Google Custom Search API | Google's search index | Largest web index | $5 per 1,000 queries (first 100/day free) |
| You.com | Multi-modal AI search | Image search, privacy features | Varies |
| Firecrawl | Web scraping and content extraction | Deep extraction, LangChain/LlamaIndex adapters | Varies |
Exa differentiates itself from these competitors in several ways. Unlike SerpAPI and Google Custom Search, which are wrappers around existing keyword-based search engines, Exa runs its own neural search index. Unlike Perplexity's Sonar API, which returns generated answers, Exa returns raw search results and webpage content, giving developers more control over how results are processed. Unlike Tavily, which aggregates content from multiple sources, Exa performs semantic search over its own proprietary index.
Some analysts expect the AI search API market to consolidate around three to five major players, with Exa, Tavily, and Perplexity among the likely survivors.
In mid-2025, Exa announced version 2.0 of its search endpoints, representing a major upgrade to the platform's capabilities. The three headline features were Exa Fast, Exa Auto, and Exa Deep.
To build Exa 2.0, the company significantly expanded its web index to tens of billions of pages with minute-level refresh cycles. The embedding model was retrained from scratch on the 144x H200 Exacluster over a one-month period. The vector database received upgrades in Rust, including new clustering algorithms, lexical compression, and assembly-level optimizations.
Exa 2.0 demonstrated strong performance on multiple benchmarks, including SimpleQA and Frames. The company evaluates its search APIs within a RAG framework using GPT-4 as the RAG model and GPT-4o-mini for grading, ensuring consistent evaluation methodology across tests.