Voyage-3
Last reviewed: May 16, 2026
Sources: 17 citations
Review status: Source-backed
Revision: v1 · 3,473 words
Voyage-3 is a family of general-purpose text embedding models developed by Voyage AI, launched in September 2024 with voyage-3 and voyage-3-lite, expanded in January 2025 with voyage-3-large, and refreshed in May 2025 with voyage-3.5 and voyage-3.5-lite. All members share a dense-vector retrieval interface, a 32,000-token context window, and a 1024-dimensional default output, while differing in cost, peak quality, and the dimensions to which their vectors can be truncated. The line is widely cited for strong scores on the Massive Text Embedding Benchmark (MTEB) and on Voyage AI's own retrieval evaluations across legal, financial, code, technical-documentation, multilingual, and long-document domains.
Voyage-3 was the first generation in which Voyage AI shipped Matryoshka representation learning and quantization-aware training as default features. With voyage-3-large and voyage-3.5, a single API call can return embeddings of 256, 512, 1024, or 2048 dimensions in float32, int8, uint8, or binary precision without re-encoding. The family is closely associated with Anthropic, which since 2024 has documented Voyage AI as the recommended embedding provider for Claude-based retrieval-augmented generation pipelines. Voyage AI was not acquired by Anthropic; the company was acquired by MongoDB on 24 February 2025 and now operates as Voyage AI by MongoDB.
Voyage AI was founded in 2023 by Stanford computer-science professor Tengyu Ma with co-founders Hong Liu and Kaidi Cao. The company's early products were the voyage-01 and voyage-lite-01 embedding models released in late 2023, followed by domain-specialized checkpoints such as voyage-law-2 (April 2024) and voyage-finance-2 (May 2024). The voyage-2 line was a moderate iteration on the original architecture, but it was the voyage-3 release in September 2024 that established the company as a peer of OpenAI and Cohere in independent benchmarks.
The "3" in the name refers to the third major generation of Voyage's general-purpose family rather than to any specific architecture, parameter count, or training recipe. The naming convention places quality tiers after the version number: the base name (voyage-3, voyage-3.5) denotes the standard quality model, the suffix -lite denotes a smaller checkpoint trained partly via distillation, and the suffix -large denotes the highest-quality flagship. The minor version step from 3 to 3.5 reflects a refreshed training mixture and distillation from voyage-3-large rather than a new core architecture.
Voyage AI announced voyage-3 and voyage-3-lite on 18 September 2024, positioning them as drop-in replacements for OpenAI's text-embedding-3-large with much smaller default embedding dimensions and a four-times-longer context window. Voyage AI reported that voyage-3 outperformed text-embedding-3-large by an average of 7.55 percent on a suite of 40 retrieval datasets covering eight domains, while costing 2.2 times less per token and producing vectors three times smaller than the OpenAI default of 3072 dimensions. The lite variant was reported as 3.82 percent more accurate than text-embedding-3-large on the same datasets while costing 6.5 times less, with vectors six times smaller.
Both models were trained with an improved transformer encoder architecture, distillation from larger teacher checkpoints, more than two trillion tokens of pre-training data, and retrieval-result alignment via human feedback. Both accept up to 32,000 tokens of input. The 1024-dimensional output of voyage-3 and the 512-dimensional output of voyage-3-lite were fixed at launch; Matryoshka truncation arrived with the later voyage-3-large model.
Voyage-3-large launched on 7 January 2025 as the new state-of-the-art flagship in the family. It was the first Voyage model to ship Matryoshka and quantization-aware training as default features, allowing a single response at 256, 512, 1024, or 2048 dimensions in float32, int8, uint8, binary, or unsigned-binary precision. Voyage AI reported that voyage-3-large outperformed OpenAI text-embedding-3-large by 9.74 percent on average and Cohere Embed v3 English by 20.71 percent across 100 datasets in eight domains, and led the public MTEB leaderboard with an overall score of 65.1, edging out NV-Embed v2 by 0.3 points and OpenAI text-embedding-3-large by 0.5 points.
The model remained competitive after aggressive compression. At 1024 dimensions it was 10.58 percent more accurate than OpenAI text-embedding-3-large at 3072 dimensions, and with binary embeddings at one two-hundredth of the storage cost it remained 1.16 percent above OpenAI's float32 baseline. Voyage-3-large was also reported to surpass voyage-law-2 on legal retrieval and voyage-finance-2 on financial retrieval, although the specialized voyage-code-3 remained the preferred model for source-code retrieval.
Voyage AI released voyage-3.5 and voyage-3.5-lite on 20 May 2025. The 3.5 step did not introduce a new flagship; instead it lifted the quality of the mid-tier and entry-tier models while leaving their per-token prices identical to voyage-3 and voyage-3-lite. Voyage AI reported that voyage-3.5 was 8.26 percent more accurate than OpenAI text-embedding-3-large, 2.66 percent more accurate than voyage-3, and 1.63 percent more accurate than Cohere Embed v4 on average across the same 100 datasets used to evaluate voyage-3-large. The lite variant was reported as 6.34 percent better than OpenAI text-embedding-3-large and 4.28 percent better than voyage-3-lite.
Both voyage-3.5 models adopted the same Matryoshka and quantization toolkit introduced with voyage-3-large, supporting 256, 512, 1024, and 2048 dimensions at float32, int8, uint8, and binary precision. Voyage AI attributed the quality gains to better training data, distillation from voyage-3-large, and tighter integration with its rerankers. The company calculated that a 512-dimensional int8 voyage-3.5 embedding cost roughly 83 percent less to store and search than a 3072-dimensional float32 OpenAI text-embedding-3-large embedding at equivalent retrieval accuracy.
All models in the voyage-3 family accept up to 32,000 tokens of input per request and share a single REST endpoint at api.voyageai.com. Every model defaults to 1024-dimensional output except voyage-3-lite, which predates the Matryoshka toolkit and is a fixed 512-dimensional float32 model; only voyage-3-large, voyage-3.5, and voyage-3.5-lite expose additional Matryoshka dimensions and quantized precisions. Pricing below is for the public API as listed on Voyage AI's documentation; the first 200 million tokens per account are free and shared across the family.
| Model | Released | Default dim | Other dims (Matryoshka) | Context | Price per 1M tokens |
|---|---|---|---|---|---|
| voyage-3-lite | Sep 2024 | 512 | fixed | 32,000 | $0.02 |
| voyage-3 | Sep 2024 | 1024 | fixed | 32,000 | $0.06 |
| voyage-3-large | Jan 2025 | 1024 | 256, 512, 2048 | 32,000 | $0.18 |
| voyage-3.5-lite | May 2025 | 1024 | 256, 512, 2048 | 32,000 | $0.02 |
| voyage-3.5 | May 2025 | 1024 | 256, 512, 2048 | 32,000 | $0.06 |
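As a concrete illustration of the shared endpoint, the sketch below assembles (without sending) a request body for `POST api.voyageai.com/v1/embeddings`. The field names (`input`, `model`, `output_dimension`, `output_dtype`) follow Voyage AI's published API reference, but treat them as assumptions and check the current documentation before use:

```python
import json

def build_embed_request(texts, model="voyage-3.5", dim=1024, dtype="float"):
    """Assemble the JSON body for Voyage AI's embeddings endpoint.

    Field names are taken from Voyage AI's API reference; verify against
    the current docs, as this sketch does not perform the HTTP call.
    """
    return json.dumps({
        "input": texts,             # list of strings, each up to 32,000 tokens
        "model": model,             # e.g. voyage-3.5, voyage-3-large
        "output_dimension": dim,    # 256, 512, 1024, or 2048 on Matryoshka models
        "output_dtype": dtype,      # "float", "int8", "uint8", "binary", "ubinary"
    })

# A 512-dimensional int8 request against the mid-tier model:
body = build_embed_request(["What is Matryoshka embedding?"], dim=512, dtype="int8")
```

The same call shape covers every model in the family; only `model`, `output_dimension`, and `output_dtype` change between tiers.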
With Matryoshka truncation enabled, the same call can yield, for example, a 256-dimensional int8 vector that occupies 256 bytes instead of the 4096 bytes of a 1024-dimensional float32 vector, a 16-fold reduction in vector-database storage. Binary embeddings, in which each dimension is reduced to a single bit, compress further, to roughly one two-hundredth of OpenAI's 3072-dimensional float32 baseline. Voyage AI's reported retrieval scores for these compressed embeddings remain competitive with full-precision OpenAI baselines, although accuracy still drops modestly relative to full-precision Voyage embeddings of the same dimension.
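The storage arithmetic behind these figures is simple enough to check directly; the helper below computes raw per-vector bytes, ignoring index overhead:

```python
def vector_bytes(dim, dtype):
    """Raw storage for one embedding vector, ignoring index overhead."""
    bits = {"float32": 32, "int8": 8, "uint8": 8, "binary": 1}[dtype]
    return dim * bits // 8

baseline = vector_bytes(1024, "float32")  # 4096 bytes per vector
compact = vector_bytes(256, "int8")       # 256 bytes: the 16-fold reduction above
packed = vector_bytes(256, "binary")      # 32 bytes: one bit per dimension
```

Real savings in a vector database are somewhat smaller than the raw ratio because graph or metadata overhead per vector does not shrink with the embedding.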
Matryoshka representation learning, introduced by Aditya Kusupati and colleagues in 2022, trains a single embedding model with a sum of contrastive losses computed over nested prefixes of the output vector. The result is an embedding in which the first k coordinates are themselves a self-contained representation for many values of k, so the vector can be truncated to a smaller dimension without retraining and with only a modest loss in retrieval quality. OpenAI's text-embedding-3 series, released in January 2024, was the first widely deployed commercial API to advertise Matryoshka-style truncation; Voyage AI added the technique to its general-purpose family with voyage-3-large.
Voyage AI's implementation differs from OpenAI's in two practical ways. First, voyage-3-large and the voyage-3.5 models expose a discrete set of supported dimensions (256, 512, 1024, 2048) rather than allowing arbitrary truncation, which lets the API normalize the truncated vector to unit length server-side. Second, Voyage pairs Matryoshka training with quantization-aware training so that the truncated vectors remain accurate not just at float32 but also at int8 and binary precision. The combination enables the headline storage reductions Voyage AI quotes against OpenAI at the same retrieval quality. For application developers, Matryoshka support means a single index can serve multiple query latency profiles: a retrieval pipeline might store 2048-dimensional float32 vectors for offline analytics and re-ranking while serving an online vector search at 256 or 512 int8 dimensions, all from the same API call.
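The truncate-then-renormalize step described above can be sketched in a few lines; this mirrors what the article says the API does server-side, applied here to a toy vector rather than a real embedding:

```python
import math

def truncate_embedding(vec, dim):
    """Keep the first `dim` Matryoshka coordinates and re-normalize to
    unit length, so cosine similarity remains meaningful after truncation."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

full = [0.5, -0.25, 0.8, 0.1, 0.3, -0.6, 0.2, 0.4]  # toy 8-dim "embedding"
small = truncate_embedding(full, 4)                  # first 4 coordinates, unit length
```

Because Matryoshka training makes each prefix a usable representation, the truncated vector stays close in ranking behavior to the full one, which is what allows one stored index to serve multiple latency tiers.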
Voyage-3-large was the first model in the voyage-3 family to take the top position on the Hugging Face MTEB English leaderboard, which aggregates 56 retrieval, classification, clustering, reranking, summarization, and semantic-similarity tasks. Voyage AI reported the following snapshot of the leaderboard at the launch of voyage-3-large in January 2025.
| Model | MTEB average | Provider | Default dim |
|---|---|---|---|
| voyage-3-large | 65.1 | Voyage AI | 1024 |
| NV-Embed v2 | 64.8 | NVIDIA | 4096 |
| OpenAI text-embedding-3-large | 64.6 | OpenAI | 3072 |
| Cohere Embed v3 English | 64.5 | Cohere | 1024 |
| stella_en_1.5B_v5 | 64.4 | Independent | 1024 |
MTEB is a general-purpose benchmark, and the gap between the top entries is small. Voyage AI's marketing has therefore emphasized retrieval-specific evaluations on internal datasets covering eight domains, where the company reports larger leads.
Voyage AI publishes results for its models on a private suite of 100 retrieval datasets grouped into eight domains: technical documentation, code, law, finance, web reviews, multilingual content, long documents, and conversations. The headline numbers, averaged across all eight domains, are summarized below as the percentage uplift over OpenAI text-embedding-3-large (the reference baseline at each launch date).
| Model | Avg uplift over OpenAI v3-large | Notes |
|---|---|---|
| voyage-3 | +7.55% | Reported Sep 2024 on 40-dataset subset |
| voyage-3-lite | +3.82% | Reported Sep 2024 on 40-dataset subset |
| voyage-3-large | +9.74% | Reported Jan 2025 on 100-dataset suite; also +20.71% vs Cohere Embed v3 English |
| voyage-3.5 | +8.26% | Reported May 2025; also +2.66% vs voyage-3 and +1.63% vs Cohere Embed v4 |
| voyage-3.5-lite | +6.34% | Reported May 2025; also +4.28% vs voyage-3-lite |
The evaluation is not externally audited and the full suite is not public, but Voyage AI has published the dataset list, which has allowed several independent researchers to report broadly consistent results on overlapping public datasets. Voyage-3-large is reported as best-in-class on every domain except code, where the specialized voyage-code-3 leads.
The voyage-3 family coexists with several domain-specialized checkpoints that share Voyage AI's encoder lineage but are fine-tuned for narrower use cases. Voyage-code-3 adopts the same Matryoshka and quantization toolkit as the voyage-3-large generation, while the older voyage-law-2 and voyage-finance-2 do not.
| Model | Released | Domain | Context | Dimensions | Notes |
|---|---|---|---|---|---|
| voyage-law-2 | Apr 2024 | Legal text | 16,000 | 1024 | Superseded by voyage-3-large for many uses |
| voyage-finance-2 | May 2024 | Financial filings | 32,000 | 1024 | Pre-Matryoshka; tuned on SEC and earnings data |
| voyage-code-3 | Dec 2024 | Source code | 32,000 | 256, 512, 1024, 2048 | Highest scores on code retrieval; Matryoshka-enabled |
| voyage-multimodal-3 | Nov 2024 | Text and images | 32,000 | 1024 | Single embedding space for text plus image inputs |
With voyage-3-large surpassing voyage-law-2 and voyage-finance-2 on most legal and financial retrieval tasks, Voyage AI's practical recommendation as of 2025 is to default to a general-purpose voyage-3 model and fall back to a domain model only when domain-specific recall is critical. Voyage-code-3 remains the recommended model for source-code retrieval and IDE indexing.
The table below compares the voyage-3 family with the major commercial embedding APIs available in 2025, focusing on default dimension, context window, and the headline price for the standard-quality tier. The price column lists U.S. dollars per million input tokens at the time of each model's most recent pricing update.
| Provider | Model | Default dim | Matryoshka dims | Context | Price per 1M tokens |
|---|---|---|---|---|---|
| Voyage AI | voyage-3-large | 1024 | 256, 512, 1024, 2048 | 32,000 | $0.18 |
| Voyage AI | voyage-3.5 | 1024 | 256, 512, 1024, 2048 | 32,000 | $0.06 |
| OpenAI | text-embedding-3-large | 3072 | arbitrary truncation | 8,192 | $0.13 |
| OpenAI | text-embedding-3-small | 1536 | arbitrary truncation | 8,192 | $0.02 |
| Cohere | Cohere Embed v3 English | 1024 | no | 512 | $0.10 |
| Cohere | Embed v4 | 1536 | 256, 512, 1024, 1536 | 128,000 | $0.12 |
| Jina AI | Jina Embeddings v3 | 1024 | 32 to 1024 | 8,192 | $0.02 |
| Nomic | Nomic Embed text v1.5 | 768 | 64, 128, 256, 512, 768 | 8,192 | open weights |
| Google | gemini-embedding-001 | 3072 | 768, 1536, 3072 | 2,048 | $0.0001 per request |
The voyage-3 family stands out for combining a 32,000-token context window with a small default vector size and Matryoshka truncation down to 256 dimensions. OpenAI text-embedding-3-large supports a larger output dimension and arbitrary truncation but uses a shorter 8,192-token context window and produces a much larger default vector. Cohere Embed v4 introduced Matryoshka and a much longer context but at a higher price than voyage-3.5. Open-weight models such as Nomic Embed v1.5 and Jina Embeddings v3 compete primarily on price and licensing flexibility rather than peak retrieval accuracy.
When Anthropic added an Embeddings section to its Claude developer documentation in 2024, it stated that Anthropic does not train its own embedding model and that Voyage AI is its recommended embedding provider for Claude-based information retrieval and retrieval-augmented generation. The Anthropic Cookbook on GitHub includes a third-party Voyage AI notebook demonstrating how to chunk documents, embed them with voyage-3 or voyage-3-large, store the vectors in a vector database, and retrieve the top results before passing them to Claude for generation.
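The chunk-embed-store-retrieve flow that the notebook demonstrates reduces, at its core, to a cosine-similarity ranking over stored vectors. The sketch below uses stand-in vectors in place of real Voyage API responses; in a live pipeline each vector would come from the embeddings endpoint and the top chunks would be passed to Claude as context:

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, doc_vecs, k=2):
    """Return the indices of the k document vectors most similar to the query."""
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine(query_vec, doc_vecs[i]),
                    reverse=True)
    return ranked[:k]

# Toy 2-dim "embeddings" standing in for API output:
docs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
top = retrieve([0.9, 0.1], docs, k=2)
```

Production systems replace the brute-force scan with an approximate nearest-neighbor index, but the ranking criterion is the same.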
The relationship between Anthropic and Voyage AI is a partnership and a recommendation, not a corporate combination. Anthropic has never acquired Voyage AI. The acquisition that did occur was by MongoDB, announced on 24 February 2025, and Voyage AI now operates as Voyage AI by MongoDB. The companies remained closely associated commercially even after the acquisition, with Anthropic's recommendation language for Voyage embeddings preserved in the Claude documentation and the Voyage AI free tier of 200 million tokens explicitly aimed at Claude developers prototyping retrieval pipelines.
Anthropic has consistently described its Claude family of models as best paired with strong external retrieval. Inside the Claude documentation, Voyage AI is named in the same paragraph as Anthropic's own contextual retrieval research. Anthropic's September 2024 post "Introducing Contextual Retrieval" used voyage-3 embeddings together with a BM25 sparse retriever and a contextual chunk-enrichment step, reducing the retrieval failure rate on Anthropic's open-source evaluation set by roughly 49 percent.
Voyage AI does not publish parameter counts or full architectural details for its production embedding models. The company has stated that the voyage-3 family uses an encoder-style transformer with rotary position embeddings, a sequence length of 32,000 tokens, and a final mean-pooling operation followed by a learned linear projection that produces the default 1024-dimensional output. The Matryoshka projection head used in voyage-3-large and the voyage-3.5 models adds a structured loss over nested prefixes of the output vector at training time, with quantization-aware training mixed in via straight-through estimators for int8 and binary precision.
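The pooling and projection steps named above can be illustrated schematically. This is not Voyage's code, only a toy rendering of "mean pooling followed by a learned linear projection"; the weights here are illustrative placeholders:

```python
def mean_pool(token_states):
    """Average per-token hidden states into a single sequence vector."""
    n, dim = len(token_states), len(token_states[0])
    return [sum(t[d] for t in token_states) / n for d in range(dim)]

def project(vec, weights):
    """Learned linear projection to the output dimension (toy weights here)."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]

# Two toy token states of width 2, projected through an identity matrix:
pooled = mean_pool([[1.0, 3.0], [3.0, 5.0]])
embedding = project(pooled, [[1.0, 0.0], [0.0, 1.0]])
```

In the production model, the hidden width and the 1024-dimensional projection are learned jointly with the Matryoshka prefix loss, so the projection output orders its coordinates by importance.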
The training mixture comprises more than two trillion tokens of multilingual web text, code, mathematics, scientific publications, legal documents, financial filings, and conversational data, augmented with retrieval pairs mined from open datasets such as MS MARCO and from proprietary sources. Voyage AI applies retrieval-result alignment via human feedback, in which annotators rank candidate retrievals against queries and the model is updated to prefer the higher-ranked candidates. The voyage-3.5 step adds distillation from voyage-3-large to lift the smaller checkpoints without increasing inference cost. All models in the family are trained with contrastive learning over query-document pairs, combining an in-batch contrastive loss, a triplet loss with hard negatives mined from a retriever-in-the-loop process, and the Matryoshka prefix loss.
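The in-batch contrastive loss mentioned above has a standard InfoNCE form: each query's positive document shares its batch index, and all other documents in the batch act as negatives. The sketch below is a generic rendering of that loss over a precomputed similarity matrix, not Voyage's exact recipe:

```python
import math

def in_batch_contrastive_loss(sim, temperature=0.05):
    """InfoNCE-style in-batch loss: sim[i][j] is the similarity between
    query i and document j; the positive for query i is document i."""
    loss = 0.0
    for i, row in enumerate(sim):
        logits = [s / temperature for s in row]
        m = max(logits)  # subtract max for numerical stability
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += log_z - logits[i]  # negative log-probability of the positive
    return loss / len(sim)
```

A perfectly separated batch (diagonal similarities 1, off-diagonal 0) drives the loss toward zero, while an uninformative batch where every pair looks alike gives a loss of log(batch size).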
Voyage AI provides the voyage-3 family through its REST API at api.voyageai.com, with Python and JavaScript SDKs. The same models are distributed through several cloud marketplaces: AWS Marketplace as inference endpoints, Microsoft Azure AI Foundry, and Vercel AI Gateway. Selected checkpoints, including the voyage-3-m-exp experimental research release, are published on Hugging Face under a non-commercial license, but the production-grade models are API-only. After the MongoDB acquisition, the models were integrated into MongoDB Atlas Vector Search through a managed embedding pipeline that automatically converts new documents into voyage-3 or voyage-3.5 embeddings during ingest, billed at the same per-token rate as the public API with the 200-million-token free tier applying.
Reviewers consistently highlighted three properties of the voyage-3 family: the small default 1024-dimensional vector size, the long 32,000-token context window, and the Matryoshka and quantization toolkit. Independent comparisons by vector database vendors, including Pinecone and Weaviate, reproduced Voyage AI's claimed lead over text-embedding-3-large on retrieval benchmarks while noting that the absolute gap on general English MTEB tasks was small. At the systems-engineering level, the family's 256-dimensional binary embeddings drew particular attention because they enabled fast in-memory nearest-neighbor search on billion-scale corpora with hardware footprints previously associated with classical TF-IDF or BM25 indexes.
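The systems appeal of binary embeddings comes from nearest-neighbor search reducing to XOR plus popcount over packed bits. A minimal sketch with toy 8-bit codes (real voyage-3 binary embeddings would be 256 bits or more):

```python
def hamming(a, b):
    """Bit-level distance between two binary embeddings packed into ints."""
    return bin(a ^ b).count("1")

def nearest(query, codes):
    """Brute-force nearest neighbor over packed binary codes; XOR + popcount
    is what makes in-memory search feasible at billion scale."""
    return min(range(len(codes)), key=lambda i: hamming(query, codes[i]))

codes = [0b10110010, 0b01001101, 0b10110011]  # toy 8-bit "embeddings"
best = nearest(0b10110110, codes)             # one bit away from codes[0]
```

A common refinement is to shortlist candidates with Hamming distance and then re-rank the shortlist with full-precision vectors, recovering most of the accuracy lost to binarization.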
In the developer community, the family's adoption by Anthropic for Claude-based retrieval-augmented generation, alongside its inclusion in the LangChain and LlamaIndex ecosystems, made voyage-3 a common default for new RAG projects in 2025. The free 200-million-token tier reduced the friction of switching from OpenAI for prototypes, and the MongoDB Atlas integration after the February 2025 acquisition simplified production deployment for users already on the MongoDB stack. By the end of 2025, voyage-3-large and voyage-3.5 were among the most frequently cited commercial embedding models in published RAG research, alongside OpenAI text-embedding-3-large and the open-weight Nomic and Jina families.
The voyage-3 family has attracted several specific criticisms. Voyage AI does not publish parameter counts, training data, or full evaluation scripts, which limits the ability of external researchers to reproduce or audit its claims. The company's domain-evaluation suite of 100 datasets, on which the headline lead over OpenAI and Cohere is reported, is not fully open, and several researchers have argued that the dataset selection favors the kinds of documents Voyage AI fine-tunes on. The MTEB leaderboard, where the absolute lead is much smaller, provides a more neutral comparison.
The API is also proprietary. Unlike Nomic Embed and Jina Embeddings v3, the voyage-3 family is not released as open weights for the production checkpoints, so users cannot self-host or fine-tune the models on their own hardware. The voyage-3-m-exp release on Hugging Face is an experimental research checkpoint with a non-commercial license and is not the same model served by the API. Customers who require on-premises inference have therefore typically continued to use open-weight alternatives. Finally, with the release of the voyage-4 family in early 2026, the voyage-3 family has moved into a maintenance role; Voyage AI continues to serve and document it but recommends voyage-4 for new projects with the highest quality requirements.