Weaviate is an open-source vector database that stores both data objects and their vector embeddings, enabling a combination of vector similarity search with structured filtering. Written in Go, the system is designed for speed and reliability in artificial intelligence applications. Weaviate was founded in 2019 by Bob van Luijt and Etienne Dilocker in the Netherlands, and the project has grown into one of the most widely adopted open-source vector databases, with over 14,000 GitHub stars and more than 2,000 companies running it in production [1].
Bob van Luijt, a Dutch technology entrepreneur, co-founded Weaviate with Etienne Dilocker (CTO and systems architect) in 2019. Van Luijt started his first software company at age 15 in the Netherlands. He later studied music at ArtEZ University of the Arts and Berklee College of Music, and completed the Harvard Business School Program of Management Excellence [2].
The project began as an open-source initiative and gained traction as the AI community started adopting large language models that produce vector embeddings. By 2023, the open-source project had surpassed 2 million downloads, and the team had built Weaviate Cloud as a managed service alongside the self-hosted option [3].
Weaviate has raised over $67 million across multiple funding rounds [4].
| Round | Date | Amount | Lead Investor |
|---|---|---|---|
| Series A | Early 2022 | ~$17M | NEA |
| Series B | April 2023 | $50M | Index Ventures |
The Series B round included participation from Battery Ventures, NEA, Cortical Ventures, Zetta Venture Partners, and ING Ventures. The funding was intended to meet growing demand for AI-native vector database technology, expand the team, and develop the Weaviate Cloud platform [3].
Weaviate's architecture is designed to handle multiple search paradigms (vector, keyword, and hybrid) within a single system.
Weaviate uses HNSW (Hierarchical Navigable Small World) as its primary vector index algorithm. HNSW is a graph-based approach that searches vectors by navigating through multiple layers, moving from coarse approximations to fine ones. Its complexity grows logarithmically rather than linearly with the dataset size, which makes it effective even at billions of vectors [5].
Each named vector in Weaviate maintains its own vector index, which can be either an HNSW graph or a flat index. The HNSW graph is held in memory for fast access, with commit logs and snapshots used to restore the structure after a restart [5].
In single-instance mode, Weaviate's in-memory HNSW index handles millions of vectors with sub-second response times. For larger datasets, cluster mode distributes data across nodes with sharding and replication, enabling horizontal scaling to billions of data points [5].
Weaviate exposes several HNSW tuning parameters that allow developers to optimize the tradeoff between recall, speed, and memory:
| Parameter | Default | Description |
|---|---|---|
| efConstruction | 128 | Size of dynamic candidate list during index construction; higher values improve graph quality at the cost of slower builds |
| maxConnections | 64 | Maximum number of connections per node in the HNSW graph; higher values improve recall but increase memory usage |
| ef | -1 (dynamic) | Size of the candidate list during search; controls the recall-latency tradeoff at query time |
| vectorCacheMaxObjects | 1 trillion | Maximum number of vectors to cache in memory; reducing this value helps when the dataset exceeds available RAM |
| flatSearchCutoff | 40,000 | Collections smaller than this threshold use brute-force (flat) search instead of HNSW |
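In a collection definition, these parameters map onto the `vectorIndexConfig` block of Weaviate's REST schema. A sketch, with illustrative values (key names follow the documented camelCase schema convention; check the current docs before relying on them):

```json
{
  "class": "Article",
  "vectorIndexType": "hnsw",
  "vectorIndexConfig": {
    "efConstruction": 128,
    "maxConnections": 64,
    "ef": -1,
    "vectorCacheMaxObjects": 1000000000000,
    "flatSearchCutoff": 40000
  }
}
```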
The flat index option is useful for small collections or multi-tenant environments where each tenant has a relatively small dataset. Instead of building and maintaining an HNSW graph, flat indexing stores vectors in a simple array and performs brute-force search, which is faster for small collections due to lower overhead [5].
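The brute-force approach behind the flat index can be sketched in a few lines of Python: compute the distance from the query to every stored vector and keep the k smallest. This is an illustrative sketch of the technique, not Weaviate's implementation (which is written in Go):

```python
import heapq
import math

def cosine_distance(a, b):
    """1 - cosine similarity, Weaviate's default distance metric."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def flat_search(vectors, query, k=2):
    """Brute-force k-nearest-neighbor search over a plain list of vectors."""
    scored = [(cosine_distance(vec, query), obj_id) for obj_id, vec in vectors]
    return heapq.nsmallest(k, scored)  # k smallest distances = k nearest

vectors = [("doc-a", [1.0, 0.0]), ("doc-b", [0.0, 1.0]), ("doc-c", [0.9, 0.1])]
print(flat_search(vectors, query=[1.0, 0.0], k=2))
```

For a few thousand vectors this linear scan beats HNSW because it avoids graph construction and maintenance entirely; the cost only becomes prohibitive as the collection grows.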
Through a partnership with NVIDIA, Weaviate supports GPU-accelerated index building using the cuVS library. The CAGRA algorithm builds indexes on the GPU, then converts them to HNSW format for cost-effective CPU-based query serving. This approach speeds up index construction while keeping query costs low [6].
In addition to the HNSW vector index, Weaviate maintains an inverted index based on BM25F scoring for keyword search. The BM25F algorithm extends the standard BM25 scoring function with field-level weighting, so different properties of an object (such as title versus body text) can contribute differently to keyword relevance scores. This inverted index powers Weaviate's keyword search and is also used as one component of hybrid search [7].
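The field-weighting idea behind BM25F can be illustrated with a simplified scorer: term frequencies from each field are combined under per-field weights before the usual BM25 saturation is applied. This sketch omits document-length normalization, and the weights and IDF values are made up for illustration:

```python
def bm25f_score(query_terms, doc, field_weights, idf, k1=1.2):
    """Simplified BM25F: weight per-field term frequencies, then saturate.

    Omits per-field length normalization for brevity.
    """
    score = 0.0
    for term in query_terms:
        # Weighted term frequency across fields (the core BM25F idea).
        wtf = sum(
            weight * doc.get(field, "").lower().split().count(term)
            for field, weight in field_weights.items()
        )
        # Standard BM25 saturation applied to the combined frequency.
        score += idf.get(term, 0.0) * wtf * (k1 + 1) / (wtf + k1)
    return score

doc = {"title": "vector search engines", "body": "search with vectors"}
weights = {"title": 2.0, "body": 1.0}   # a title match counts double
idf = {"search": 1.0, "vector": 1.5}    # illustrative IDF values
print(bm25f_score(["search"], doc, weights, idf))
```

Because "search" appears once in each field, the title weight of 2.0 pushes the combined term frequency to 3.0 before saturation, which is how field weighting lets titles contribute more than body text.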
Weaviate's hybrid search combines dense vector similarity with BM25 keyword matching in a single native API call. According to Weaviate's 2025 benchmarks, hybrid search improved NDCG@10 by 42% over pure vector search, which is particularly relevant for retrieval-augmented generation workloads where recall directly affects the quality of generated answers [7].
The system fuses dense vector embeddings with sparse BM25 lexical matching, and developers can control the weighting between the two methods. Independent evaluations report 35-50% relevance improvements from hybrid search compared to either method alone [8].
Weaviate offers two fusion algorithms for combining vector and keyword results:
| Fusion algorithm | Description | Best for |
|---|---|---|
| Ranked Fusion | Combines results by summing the inverse of each result's rank in the individual search lists | General-purpose hybrid queries |
| Relative Score Fusion | Normalizes scores from each search method to [0, 1] range and combines them | When the raw score distributions of the two methods differ significantly |
The alpha parameter controls the balance between vector search (alpha = 1.0) and keyword search (alpha = 0.0). Setting alpha to 0.5 weights both methods equally. In practice, values between 0.7 and 0.8, favoring the vector component, tend to perform well for RAG workloads [7][14].
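The two fusion strategies can be sketched as follows, where each input is a ranked list of (id, score) pairs and alpha weights the vector side. This is an illustrative sketch of the scoring ideas described above, not Weaviate's exact formulas (Weaviate's ranked fusion, for instance, includes a smoothing constant in the denominator):

```python
def ranked_fusion(vector_hits, keyword_hits):
    """Sum inverse ranks (1-based) from each result list."""
    fused = {}
    for hits in (vector_hits, keyword_hits):
        for rank, (obj_id, _score) in enumerate(hits, start=1):
            fused[obj_id] = fused.get(obj_id, 0.0) + 1.0 / rank
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

def relative_score_fusion(vector_hits, keyword_hits, alpha=0.75):
    """Min-max normalize each list's scores to [0, 1], then alpha-blend."""
    def normalize(hits):
        scores = [s for _, s in hits]
        lo, hi = min(scores), max(scores)
        span = (hi - lo) or 1.0
        return {obj_id: (s - lo) / span for obj_id, s in hits}

    vec, kw = normalize(vector_hits), normalize(keyword_hits)
    fused = {
        obj_id: alpha * vec.get(obj_id, 0.0) + (1 - alpha) * kw.get(obj_id, 0.0)
        for obj_id in set(vec) | set(kw)
    }
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

vector_hits = [("a", 0.91), ("b", 0.80), ("c", 0.45)]   # cosine similarities
keyword_hits = [("b", 12.1), ("d", 9.3)]                # raw BM25 scores
print(ranked_fusion(vector_hits, keyword_hits))
print(relative_score_fusion(vector_hits, keyword_hits, alpha=0.75))
```

Note how "b", which appears in both lists, wins under either strategy, and how relative score fusion is the one that lets raw BM25 scores (on a completely different scale than cosine similarities) be compared meaningfully.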
Weaviate's module system is one of its defining architectural features. Modules extend the database with additional capabilities at various stages of the data pipeline: vectorization (generating embeddings from raw data), generative AI (producing text from search results), reranking (refining result order), and reference resolution (loading data from external sources).
Modules are loaded at server startup via configuration. In Docker deployments, modules are specified as environment variables. In Kubernetes, they are configured in the Helm chart values. Weaviate Cloud instances come with a default set of modules pre-configured [9].
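For a Docker Compose deployment, module selection looks roughly like the following. The environment variable names follow Weaviate's documented configuration, but the image tag and module list are illustrative; check the current docs for exact values:

```yaml
services:
  weaviate:
    image: semitechnologies/weaviate:1.27.0   # illustrative tag
    ports:
      - "8080:8080"
      - "50051:50051"   # gRPC
    environment:
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: "true"
      PERSISTENCE_DATA_PATH: /var/lib/weaviate
      DEFAULT_VECTORIZER_MODULE: text2vec-openai
      ENABLE_MODULES: text2vec-openai,generative-openai,reranker-cohere
      OPENAI_APIKEY: ${OPENAI_APIKEY}
    volumes:
      - weaviate_data:/var/lib/weaviate
volumes:
  weaviate_data:
```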
Vectorizer modules automatically generate embeddings when objects are inserted into Weaviate. Instead of computing embeddings externally and uploading them, users configure a vectorizer module on a collection and Weaviate handles embedding generation transparently.
| Module | Provider | Model type | Key notes |
|---|---|---|---|
| text2vec-openai | OpenAI | Text | Uses OpenAI's embedding API (text-embedding-ada-002, text-embedding-3-small, etc.) |
| text2vec-cohere | Cohere | Text | Uses Cohere's multilingual embedding models |
| text2vec-huggingface | Hugging Face | Text | Uses Hugging Face Inference API for hosted models |
| text2vec-transformers | Self-hosted | Text | Runs transformer models locally in a sidecar container |
| text2vec-google | Google | Text | Uses Google's embedding models (Vertex AI, PaLM) |
| text2vec-aws | AWS | Text | Uses Amazon Bedrock embedding models |
| text2vec-jinaai | Jina AI | Text | Uses Jina AI's embedding models |
| text2vec-voyageai | Voyage AI | Text | Uses Voyage AI's domain-specific embeddings |
| multi2vec-clip | OpenAI CLIP | Multi-modal | Embeds both text and images into a shared vector space |
| multi2vec-bind | ImageBind | Multi-modal | Meta's ImageBind for text, image, audio, and video embeddings |
Users can also import pre-computed vector embeddings directly, bypassing the module system entirely [9].
Generative modules connect Weaviate to large language models for retrieval-augmented generation (RAG) directly within the database.
| Module | Provider | Description |
|---|---|---|
| generative-openai | OpenAI | Uses GPT-4, GPT-3.5-turbo for generation |
| generative-cohere | Cohere | Uses Cohere's Command models |
| generative-aws | AWS | Uses Amazon Bedrock models (Claude, Titan) |
| generative-google | Google | Uses Google's Gemini models |
| generative-anyscale | Anyscale | Uses open-source models hosted on Anyscale |
A generative search query in Weaviate consists of two parts: a search query (vector, keyword, or hybrid) and a prompt for the language model. Weaviate first retrieves relevant objects, then passes both the search results and the prompt to the configured generative model, returning the generated response alongside the original search results [12].
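The two-part structure of a generative query can be sketched in plain Python with stand-in functions. Both `retrieve` and `generate` below are stubs for illustration only; in Weaviate the search runs against the database and the generation is handled by the configured generative module:

```python
def retrieve(query, top_k=3):
    """Stub for a vector/keyword/hybrid search returning ranked text chunks."""
    corpus = {
        "HNSW is a graph-based vector index.": 0.92,
        "Weaviate fuses vector and keyword results.": 0.85,
        "BM25 ranks documents by keyword relevance.": 0.40,
    }
    ranked = sorted(corpus.items(), key=lambda kv: kv[1], reverse=True)
    return [text for text, _ in ranked[:top_k]]

def generate(prompt):
    """Stub for the configured generative model (e.g. an OpenAI model)."""
    return f"[LLM response to a {len(prompt)}-char prompt]"

def generative_search(query, task):
    """Two-step flow: retrieve objects, then prompt the model with them."""
    results = retrieve(query)
    prompt = task + "\n\nContext:\n" + "\n".join(results)
    return {"results": results, "generated": generate(prompt)}

out = generative_search("how does weaviate index vectors?",
                        "Summarize the context in one sentence.")
print(out["generated"])
```

The key point is that the caller gets both the generated text and the underlying search results back, so the retrieved evidence remains inspectable.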
Reranker modules apply a second-pass ranking model after the initial retrieval step. Cross-encoder reranking models score each retrieved result against the original query, which typically produces more accurate relevance judgments than the initial bi-encoder similarity search. Available reranker modules include reranker-cohere, reranker-voyageai, and reranker-transformers.
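The second-pass reranking pattern can be sketched with a stub scorer. A real cross-encoder feeds the query and passage jointly through a transformer; the token-overlap function below is only a stand-in so the control flow is runnable:

```python
def cross_encoder_score(query, passage):
    """Stub cross-encoder: token-overlap proxy for a real reranking model."""
    q, p = set(query.lower().split()), set(passage.lower().split())
    return len(q & p) / len(q)

def rerank(query, first_pass_hits, top_k=2):
    """Re-score first-pass retrieval results against the query, keep top_k."""
    rescored = [(cross_encoder_score(query, p), p) for p in first_pass_hits]
    rescored.sort(key=lambda t: t[0], reverse=True)
    return [p for _, p in rescored[:top_k]]

hits = [
    "vector databases store embeddings",
    "weaviate supports hybrid search over vectors",
    "cooking pasta requires boiling water",
]
print(rerank("hybrid search in weaviate", hits))
```

Because the reranker sees query and passage together, it can catch relevance signals that the first-pass bi-encoder, which embeds them independently, misses.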
Weaviate organizes data into collections (previously called "classes"). Each collection defines a schema that specifies the properties of its objects, the vectorizer to use, the generative module (if any), and index configuration.
A collection definition includes the collection name, property definitions (each with a name and data type), the vectorizer module, any generative or reranker modules, and the vector and inverted index configuration.
Weaviate also supports auto-schema, which is enabled by default. When auto-schema is active, Weaviate infers collection definitions from the data being inserted, automatically detecting property names and types. This is convenient for prototyping but should be disabled in production for predictable behavior [22].
Cross-references allow linking objects across collections. For example, an Article collection might reference objects in an Author collection, enabling graph-like traversals in queries.
Weaviate implements native multi-tenancy with one shard per tenant, dynamic resource management, and true data isolation. This design means each tenant's data is physically separated, not just logically filtered. Recent updates introduced tenant offloading to S3 cloud storage and renamed the tenant activity statuses from HOT/COLD to ACTIVE/INACTIVE for clarity [10].
Multi-tenancy is useful for SaaS applications where each customer needs their own isolated dataset but the application operator does not want to manage separate database instances for each customer.
Weaviate's multi-tenancy implementation supports three tenant states:
| State | Data location | Cost | Access speed |
|---|---|---|---|
| ACTIVE (formerly HOT) | In memory and on local disk | Highest | Fastest |
| INACTIVE (formerly COLD) | On local disk only | Medium | Moderate (requires activation) |
| OFFLOADED | In S3 cloud storage | Lowest | Slowest (requires reactivation to local disk) |
This tiered approach allows operators to manage costs by offloading inactive tenants to cheaper storage while keeping frequently accessed tenants in memory for low-latency queries [10].
Weaviate exposes three APIs: REST, gRPC, and GraphQL. The GraphQL API is particularly distinctive among vector databases, since most competitors offer only REST or gRPC interfaces. GraphQL allows clients to specify exactly which fields they need, reducing over-fetching and making it easier to build flexible query interfaces [11].
A typical Weaviate GraphQL query can combine vector search, keyword filters, and property selection in a single request:
```graphql
{
  Get {
    Article(
      hybrid: {
        query: "machine learning applications"
        alpha: 0.75
      }
      limit: 5
    ) {
      title
      content
      _additional {
        score
        distance
      }
    }
  }
}
```
In this example, alpha: 0.75 weights the vector component at 75% and the BM25 keyword component at 25%.
Weaviate includes built-in generative search capabilities, allowing users to perform retrieval-augmented generation directly within the database. After retrieving relevant objects through vector or hybrid search, Weaviate can pass those results to a configured language model (such as OpenAI's GPT-4 or Cohere's models) to generate answers, summaries, or transformations. This eliminates the need for a separate orchestration layer for basic RAG use cases [12].
Generative search supports two modes: a single prompt, which runs the provided prompt once for each retrieved object, and a grouped task, which passes the entire result set to the model in a single request.
Integrated reranking modules let users apply a second-pass cross-encoder model that re-scores each result against the original query after the initial retrieval step.
Weaviate supports multiple named vectors per object, allowing a single object to have separate embeddings for different properties or different embedding models. For example, a product listing might have one vector for its title (embedded with a lightweight model) and another for its description (embedded with a more powerful model). Queries can target specific named vectors, enabling flexible multi-modal and multi-representation search strategies.
Weaviate Cloud is the company's managed hosting service, offering serverless and dedicated cluster options.
| Plan | Starting Price | Description |
|---|---|---|
| Sandbox (Free) | $0 | 14-day trial clusters for experimentation |
| Serverless | Pay-as-you-go | Shared infrastructure, billed by usage |
| Dedicated | From ~$25/month | Isolated clusters with guaranteed resources |
| Enterprise | Custom | SLAs, priority support, custom deployments |
Weaviate restructured its pricing in October 2025, moving from a per-dimension model to three billing dimensions with a $45/month minimum covering baseline cluster costs [13]. Users who prefer self-hosting can deploy Weaviate via Docker, Kubernetes, or Helm charts at no licensing cost, since the core database is open-source under a BSD-3-Clause license.
Weaviate publishes benchmark results for its search modes. In 2025, the team benchmarked hybrid search against pure vector search and pure keyword search across 12 information retrieval datasets from BEIR, LoTTE, and BRIGHT, as well as WixQA and EnronQA.
Key findings from these benchmarks include:
| Search mode | Relative performance | Notes |
|---|---|---|
| Pure vector search | Baseline | Uses dense embeddings only |
| Pure BM25 keyword search | Varies by dataset | Better for exact keyword matching; weaker for semantic queries |
| Hybrid search (BM25 + vector) | +42% NDCG@10 vs pure vector | Combining both methods yields the strongest results across most datasets |
| Search Mode (auto-optimized) | Best overall | Weaviate's automatic search mode selection further improves results [23] |
In terms of raw throughput, optimized Weaviate deployments have been benchmarked at 10,000 to 15,000 queries per second for pure vector search workloads. Latency numbers depend heavily on dataset size, dimensionality, and hardware configuration [14].
Weaviate and Pinecone are frequently compared since both target similar use cases but take different approaches.
| Feature | Weaviate | Pinecone |
|---|---|---|
| Source model | Open-source (BSD-3-Clause) | Closed-source, managed SaaS |
| Self-hosting | Yes (Docker, Kubernetes) | No (cloud-only, or BYOC on Dedicated plan) |
| Hybrid search | Native BM25 + vector in one API call | Requires separate sparse vector index |
| API style | REST, gRPC, GraphQL | REST, gRPC |
| Auto-vectorization | Yes, via modules | Yes, via Pinecone Inference |
| Multi-tenancy | Native, one shard per tenant | Via namespaces |
| Generative search | Built-in RAG modules | Via Pinecone Assistant |
| Primary language | Go | Not disclosed (managed service) |
Pinecone tends to win on setup simplicity and consistent low-latency performance for pure vector search workloads. Weaviate tends to win on flexibility, hybrid search quality, and cost control through self-hosting options. For workloads where hybrid search is central to the application, Weaviate's native BM25 + vector fusion in a single API call is more efficient than Pinecone's approach of maintaining separate sparse and dense indexes [14].
Performance benchmarks from 2025 show Weaviate achieving 10,000-15,000 queries per second with optimized configuration, while Pinecone's Dedicated Read Nodes achieved 5,700 QPS at P99 60ms on 1.4 billion vectors. Direct comparisons are difficult since the hardware, dataset sizes, and query patterns differ across benchmarks [14].
The Weaviate open-source project has accumulated over 14,000 GitHub stars and more than 5 million downloads. The company reports over 2,000 companies using Weaviate in production environments [1].
Weaviate provides official client libraries for Python, JavaScript/TypeScript, Go, and Java. The Python client is the most popular, with active development tracked in its changelog [15]. The project also maintains a recipes repository with example implementations for common use cases.
| Client library | Package | Status |
|---|---|---|
| Python | weaviate-client (PyPI) | Most popular; supports sync and async operations |
| JavaScript/TypeScript | weaviate-ts-client (npm) | Full-featured; used in Node.js and browser applications |
| Go | weaviate-go-client | Native Go client for Go applications |
| Java | io.weaviate:client (Maven) | Java client for JVM-based applications |
Integrations exist with LangChain, LlamaIndex, Semantic Kernel, Haystack, and other AI orchestration frameworks. These integrations allow developers to use Weaviate as the vector store component in larger machine learning pipelines.
The Weaviate community includes an active forum, a Slack workspace with thousands of members, and regular community calls. The company also runs Weaviate Academy, a free educational platform with structured courses on using the database, from zero-to-MVP guides to advanced performance optimization tutorials [9].
By early 2026, Weaviate continues to develop as both an open-source project and a commercial cloud service. The company has focused on several areas: improving multi-tenancy efficiency with features like tenant offloading to S3, GPU-accelerated index building through the NVIDIA partnership, and refinements to hybrid search relevance [6][10].
The vector database market has become increasingly competitive, with Weaviate, Qdrant, Milvus, Chroma, and Pinecone all vying for developer adoption. Weaviate differentiates itself through its combination of open-source availability, native hybrid search, automatic vectorization modules, and the GraphQL API. The open-core business model (free self-hosted, paid cloud) gives it flexibility that purely managed competitors like Pinecone cannot match, while the managed cloud option reduces friction for teams that do not want to operate their own infrastructure.
Weaviate's focus on developer experience, with features like built-in generative search and automatic vectorization, reflects a broader trend in the vector database space toward reducing the amount of glue code required to build AI applications. Rather than forcing developers to stitch together separate embedding services, vector stores, and LLM orchestration layers, Weaviate aims to handle much of this within the database itself.