Weaviate

Weaviate is an open-source vector database that stores both data objects and their vector embeddings, so that vector similarity search can be combined with structured filtering, keyword retrieval, and integrated generative search. Written in Go and licensed under BSD-3-Clause, the system is designed for speed and reliability in artificial intelligence applications. The project began as an open-source initiative by Bob van Luijt in March 2016 and was later commercialized by SeMI Technologies, the Amsterdam-based company that he co-founded with Etienne Dilocker at the end of 2018. SeMI Technologies rebranded to Weaviate B.V. in January 2023 [1][2][3]. By early 2026, the open-source project had crossed 16,000 GitHub stars, and the company reported more than 5 million container downloads, an active community of over 50,000 builders, and adoption by more than 2,000 production users [4][5].

history and founding

The Weaviate project began in March 2016, when Dutch technology entrepreneur Bob van Luijt (born 15 November 1985 in Bergen op Zoom) started experimenting with the idea of a database where semantic relationships were a first-class citizen. He was working through a strategic design consultancy called Kubrickology, and his early thinking was shaped by two influences: the GloVe paper on word embeddings, which he encountered in 2015, and Google's Weave and Brillo "things" framework, which he first saw at the Ubiquity Conference in San Francisco in January 2016 [3][6].

In 2017, van Luijt published a blog post connecting Internet of Things concepts with the semantic web, sketching the framework that would later become Weaviate. By the end of 2018, he had entered a startup accelerator in the Netherlands and assembled the founding team that would become SeMI Technologies. The company name stood for Semantic Machine Insights. The original Weaviate codebase had been more of a traditional knowledge graph, but during the accelerator phase the team pivoted toward semantic search powered by vector embeddings, which they called the "Weaviate Search Graph" [3][6].

Etienne Dilocker joined as co-founder and Chief Technology Officer. Dilocker is a German engineer based in Mannheim with more than 15 years of experience in cloud-native systems, Go, databases, and site reliability. He had previously contributed to ANN-Benchmarks and ran his own consultancy, Dilocker Software Engineering. His research interests include distributed systems, auto-scaling databases, vector index design, and high-cardinality metadata filtering, the last of which he has explored in collaboration with the University of Pisa [7][8].

A third name often listed alongside the founders is Micha Verhagen, who joined the early team. SeMI Technologies originally spun out of work at ING Labs, the innovation arm of Dutch bank ING Group, which became one of the early investors in the project [1].

Version 1.0.0 of Weaviate was released on 14 January 2021. This release introduced the modular API that allows the database to plug into different embedding providers and other extensions. By that point the project had already moved away from being a niche knowledge graph and was positioning itself as a dedicated vector database for the emerging generative-AI era [9].

In January 2023, the company changed its legal name from SeMI Technologies to Weaviate B.V. According to a press release at the time, the open-source product had become significantly better known than the corporate entity, so the team decided to consolidate under the Weaviate brand [2].

funding history

Weaviate has raised approximately $67.7 million across three publicly disclosed funding rounds. The company has remained at the Series B stage as of early 2026 and has not announced any subsequent priced rounds [10][11].

Round	Date	Amount	Lead investor(s)	Other participants
Seed	August 2020	$1.2M	Zetta Venture Partners	ING Ventures
Series A	February 2022	$16M	Cortical Ventures, NEA	Zetta Venture Partners, ING Ventures
Series B	April 2023	$50M	Index Ventures	Battery Ventures, NEA, Cortical Ventures, Zetta Venture Partners, ING Ventures

The seed round in August 2020 came from Zetta Venture Partners and ING Ventures and provided the runway to take Weaviate from prototype to production [10]. The $16 million Series A in February 2022 was co-led by Cortical Ventures and New Enterprise Associates (NEA), with Zetta and ING Ventures continuing as participants. The Series A press release described Weaviate as "a new wave of AI-first database tech" [12][13].

The Series B closed on 21 April 2023 at $50 million, led by Index Ventures and joined by Battery Ventures alongside the prior investors. The round was driven by demand from teams building retrieval-augmented generation systems on top of large language models. Capital from the Series B has been used to expand the engineering team, build out Weaviate Cloud, and grow commercial operations across North America and Europe [11][14].

Reporting in late 2024 indicated that Weaviate had reached around $12.3 million in annual revenue with a team of approximately 104 employees, suggesting strong commercial traction without an immediate need for additional priced rounds [15].

headquarters and team

Weaviate is headquartered at Prinsengracht 769A in Amsterdam, Netherlands, the same canal-side district where the founding team began. The company is structured as remote-first, with engineers, researchers, and commercial staff distributed across Europe, North America, and other regions. Public estimates put headcount at roughly 90 to 105 people through 2024 and 2025 [15][16].

Key leadership disclosed publicly includes:

Role	Person
Chief Executive Officer	Bob van Luijt (co-founder)
Chief Technology Officer	Etienne Dilocker (co-founder)
Vice President of Engineering	Paul de Grijp
Director of Applied Research	John Trengrove
Head of People and Culture	Jessie de Groot

The company describes its values as Be Kind, Work Together as One, Strive for Excellence, Encourage Transparency, and Inspire Trust [16].

architecture

Weaviate's architecture is designed to handle multiple search paradigms (vector, keyword, and hybrid) within a single system. The database is written in Go, which the team chose for its concurrency model, low memory overhead, and strong tooling around binary builds. Roughly 97 percent of the codebase is Go, with a small amount of Python tooling for tests and developer scripts [4].

A Weaviate node combines several subsystems: a storage engine based on a custom log-structured merge-tree (LSM-tree), an inverted index for keyword search, one or more vector indexes per collection, a schema and metadata service, and a multi-protocol API surface (REST, gRPC, and GraphQL). Cluster coordination uses the Raft consensus algorithm, while object data is replicated using a leaderless, eventually consistent model [17][18].

HNSW index

Weaviate uses HNSW (Hierarchical Navigable Small World) as its primary vector index algorithm. HNSW is a graph-based approach that searches vectors by navigating through multiple layers, moving from coarse approximations to fine ones. Search cost grows roughly logarithmically rather than linearly with dataset size, which keeps the index effective even at billions of vectors [19].
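
The layered navigation can be illustrated with a minimal sketch. It is not Weaviate's implementation and omits much of real HNSW (there is no candidate list controlled by ef, and the graph is built by hand): it only shows the greedy descent from a sparse top layer to the dense bottom layer. The function names and the toy one-dimensional data are assumptions made for the example.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.dist(a, b)

def greedy_layered_search(vectors, layers, entry_point, query):
    """Descend from the top (sparsest) layer to layer 0, greedily moving
    to any neighbour that is closer to the query than the current node."""
    current = entry_point
    for graph in layers:  # ordered from the top layer down to layer 0
        improved = True
        while improved:
            improved = False
            for neighbour in graph.get(current, []):
                if distance(vectors[neighbour], query) < distance(vectors[current], query):
                    current = neighbour
                    improved = True
    return current

# Toy data: ten points on a line; node i sits at position (i,).
vectors = {i: (float(i),) for i in range(10)}
layer0 = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 9] for i in range(10)}
layer1 = {0: [3], 3: [0, 6], 6: [3, 9], 9: [6]}  # every third node
layer2 = {0: [9], 9: [0]}                        # long-range links only

nearest = greedy_layered_search(
    vectors, [layer2, layer1, layer0], entry_point=0, query=(7.2,)
)
print(nearest)
```

The upper layers let the search cross most of the dataset in a few hops, and the bottom layer refines the result locally, which is where the sub-linear behaviour comes from.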

Each named vector in Weaviate maintains its own vector index, which can be either an HNSW graph or a flat index. The HNSW graph is held in memory for fast access, with commit logs and snapshots used to restore the structure after a restart. Version 1.31 added HNSW snapshotting, which produces periodic on-disk snapshots that can be loaded directly at startup. According to the release notes, this can reduce startup time for large indexes by approximately 10 to 15 times compared with replaying the write-ahead log [9][20].

In single-instance mode, Weaviate's in-memory HNSW index handles millions of vectors with sub-second response times. For larger datasets, cluster mode distributes data across nodes with sharding and replication, enabling horizontal scaling to billions of data points [19].

Weaviate exposes several HNSW tuning parameters that allow developers to optimize the tradeoff between recall, speed, and memory:

Parameter	Default	Description
efConstruction	128	Size of dynamic candidate list during index construction; higher values improve graph quality at the cost of slower builds
maxConnections	64	Maximum number of connections per node in the HNSW graph; higher values improve recall but increase memory usage
ef	-1 (dynamic)	Size of the candidate list during search; controls the recall-latency tradeoff at query time
vectorCacheMaxObjects	1 trillion	Maximum number of vectors to cache in memory; reducing this value helps when the dataset exceeds available RAM
flatSearchCutoff	40,000	When a filtered search matches fewer objects than this threshold, Weaviate bypasses the HNSW graph and uses brute-force (flat) search over the matching objects
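
As a rough illustration, the parameters in the table can be written out as a configuration fragment. The parameter names and defaults are taken from the table; wrapping them under `vectorIndexType` and `vectorIndexConfig` keys is an assumption about the JSON form of a collection definition, and the `high_recall` override uses purely illustrative values.

```python
import json

# Defaults copied from the parameter table above.
vector_index_config = {
    "efConstruction": 128,
    "maxConnections": 64,
    "ef": -1,  # -1 lets the server choose ef dynamically
    "vectorCacheMaxObjects": 1_000_000_000_000,
    "flatSearchCutoff": 40_000,
}

# Hypothetical override for a recall-sensitive workload: a larger build-time
# candidate list and a fixed, larger query-time candidate list.
high_recall = {**vector_index_config, "efConstruction": 256, "ef": 200}

fragment = {"vectorIndexType": "hnsw", "vectorIndexConfig": high_recall}
print(json.dumps(fragment, indent=2))
```

Raising efConstruction or ef generally trades build time or query latency for recall, so values like these would normally be validated against a benchmark on the actual dataset.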

The flat index option is useful for small collections or multi-tenant environments where each tenant has a relatively small dataset. Instead of building and maintaining an HNSW graph, flat indexing stores vectors in a simple array and performs brute-force search, which is faster for small collections due to lower overhead [19].
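
A flat index amounts to an exhaustive scan. The sketch below is a pure-Python illustration rather than Weaviate code; `flat_search`, `cosine_distance`, and the sample tenant vectors are names and data invented for the example.

```python
import heapq
import math

def cosine_distance(a, b):
    """1 - cosine similarity; smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def flat_search(vectors, query, k):
    """Score every stored vector against the query and keep the k closest."""
    scored = ((cosine_distance(vec, query), idx) for idx, vec in enumerate(vectors))
    return [idx for _, idx in heapq.nsmallest(k, scored)]

tenant_vectors = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (-1.0, 0.0)]
print(flat_search(tenant_vectors, query=(1.0, 0.05), k=2))
```

The cost is linear in the number of stored vectors, but there is no graph to build or keep in memory, which is why the approach suits small per-tenant datasets.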

vector quantization

Weaviate supports several quantization techniques for compressing vectors and reducing memory consumption. Quantization is configured per HNSW (or flat) index and can be combined with rescoring, where the database re-ranks the top candidates using their full-precision vectors to recover most of the lost recall [21].

Technique	Bit-width	Memory saving	Notes
Product Quantization (PQ)	Variable	About 85%	Splits vectors into segments and quantizes each segment using k-means codebooks
Scalar Quantization (SQ)	8 bits per dimension	About 75%	Maps each 32-bit float dimension into one of 256 buckets, trained on the data distribution
Binary Quantization (BQ)	1 bit per dimension	About 97%	Stores each dimension as a single bit; fastest but least precise
Rotational Quantization (RQ)	8 or 1 bits	75% to ~94%	Applies a rotation to decorrelate dimensions before quantization, improving recall over SQ and BQ
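
The fixed-width rows of the table follow from simple arithmetic on bits per dimension. The sketch below covers only the size of the compressed codes relative to 32-bit floats; it ignores codebooks, rotation metadata, and the graph itself, and it does not attempt to reproduce the PQ or 1-bit RQ figures, which depend on configuration. The 1,536-dimension width is an illustrative assumption.

```python
def bytes_per_vector(dimensions, bits_per_dimension):
    """Storage for one vector at the given precision."""
    return dimensions * bits_per_dimension / 8

def saving_percent(bits_per_dimension, baseline_bits=32):
    """Reduction relative to uncompressed 32-bit floats."""
    return 100 * (1 - bits_per_dimension / baseline_bits)

dims = 1536  # illustrative embedding width, not a Weaviate default
for name, bits in [("float32", 32), ("SQ / 8-bit RQ", 8), ("BQ", 1)]:
    size = bytes_per_vector(dims, bits)
    print(f"{name:14s} {size:7.0f} bytes  {saving_percent(bits):5.1f}% saved")
```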

Rotational quantization (RQ) was introduced in version 1.32 (July 2025) as the recommended default for new collections. A 1-bit RQ variant, providing extreme compression rates while preserving most of the search quality, became a preview feature in version 1.33 (October 2025). Each quantization mode preserves a copy of the uncompressed vectors so that rescoring can refine the final top-k results [22][23].
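
The interaction between compressed search and rescoring can be sketched in a few lines. This is a simplified stand-in that uses sign-based binary codes and Hamming distance, not Weaviate's implementation; the `oversample` factor and all function names are assumptions made for the example.

```python
def binarize(vector):
    """Binary quantization sketch: keep only the sign of each dimension."""
    return tuple(1 if x > 0 else 0 for x in vector)

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def search_with_rescoring(vectors, query, k, oversample=3):
    codes = [binarize(v) for v in vectors]
    q_code = binarize(query)
    # Pass 1: cheap candidate selection on the compressed codes.
    by_code = sorted(range(len(vectors)), key=lambda i: hamming(codes[i], q_code))
    candidates = by_code[: k * oversample]
    # Pass 2: re-rank only the candidates using the full-precision vectors.
    return sorted(candidates, key=lambda i: squared_l2(vectors[i], query))[:k]

vectors = [(0.9, 0.8, -0.1), (0.1, 0.2, -0.9), (0.2, 0.1, -0.2), (-0.5, 0.4, 0.3)]
print(search_with_rescoring(vectors, query=(0.25, 0.15, -0.25), k=1))
```

In this toy data the first three vectors share the same binary code, so the compressed pass cannot tell them apart; only the full-precision second pass identifies the closest one.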

GPU-accelerated indexing

Through a partnership with NVIDIA, Weaviate supports GPU-accelerated index building using the cuVS library. The CAGRA algorithm builds indexes on the GPU, then converts them to HNSW format for cost-effective CPU-based query serving. This approach speeds up index construction while keeping query costs low [24].

inverted index for keyword search

In addition to the HNSW vector index, Weaviate maintains an inverted index based on BM25F scoring for keyword search. The BM25F algorithm extends the standard BM25 scoring function with field-level weighting, so different properties of an object (such as title versus body text) can contribute differently to keyword relevance scores. This inverted index powers Weaviate's keyword search and is also used as one component of hybrid search [25][26].
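
Field-level weighting can be made concrete with a simplified BM25F scorer. This is a textbook-style sketch, not Weaviate's implementation: the IDF variant, the k1 and b values, and the title and body boosts are assumptions chosen for illustration.

```python
import math
from collections import Counter

def bm25f_score(query_terms, doc, corpus, field_weights, k1=1.2, b=0.75):
    """Simplified BM25F: length-normalise each field's term frequency, combine
    the fields using their weights, then apply BM25 saturation and IDF."""
    n_docs = len(corpus)
    avg_len = {f: sum(len(d[f]) for d in corpus) / n_docs for f in field_weights}
    score = 0.0
    for term in query_terms:
        doc_freq = sum(any(term in d[f] for f in field_weights) for d in corpus)
        if doc_freq == 0:
            continue
        idf = math.log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
        weighted_tf = 0.0
        for field, weight in field_weights.items():
            tf = Counter(doc[field])[term]
            norm = 1 - b + b * len(doc[field]) / avg_len[field]
            weighted_tf += weight * tf / norm
        score += idf * weighted_tf / (k1 + weighted_tf)
    return score

corpus = [
    {"title": ["vector", "database"], "body": ["stores", "embeddings", "for", "search"]},
    {"title": ["release", "notes"], "body": ["the", "vector", "index", "changed"]},
    {"title": ["cooking", "tips"], "body": ["how", "to", "bake", "bread"]},
]
weights = {"title": 2.0, "body": 1.0}  # hypothetical per-field boosts
scores = [bm25f_score(["vector"], d, corpus, weights) for d in corpus]
print(scores)
```

With the title weighted twice as heavily as the body, the document that mentions the term in its title outranks the one that mentions it only in its body.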

In version 1.30, Weaviate replaced the previous BM25 implementation with BlockMax WAND, an algorithm that allows the database to skip large blocks of postings during scoring. In Weaviate's internal benchmarks, BlockMax WAND reduced BM25 query latency by an order of magnitude on large corpora compared with the previous traversal strategy [27].

hybrid search (BM25 + vector)

Weaviate's hybrid search combines dense vector similarity with BM25 keyword matching in a single native API call. According to Weaviate's 2025 benchmarks, hybrid search improved NDCG@10 by 42 percent over pure vector search across a panel of information-retrieval datasets, which is particularly relevant for retrieval-augmented generation workloads where recall directly affects the quality of generated answers [25].

The system fuses dense vector embeddings with sparse BM25 lexical matching, and developers can control the weighting between the two methods. Independent evaluations report 35 to 50 percent relevance improvements from hybrid search compared to either method alone [28].

Weaviate offers two fusion algorithms for combining vector and keyword results:

Fusion algorithm	Description	Best for
Ranked Fusion	Combines results by summing the inverse of each result's rank in the individual search lists	General-purpose hybrid queries
Relative Score Fusion	Normalizes scores from each search method to [0, 1] range and combines them	When the raw score distributions of the two methods differ significantly

The alpha parameter controls the balance between vector search (alpha = 1.0) and keyword search (alpha = 0.0). Setting alpha to 0.5 weights both methods equally. In practice, values between 0.7 and 0.8 (favoring vector search slightly) tend to perform well for RAG workloads [25][26].
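
The two fusion strategies and the role of alpha can be sketched as follows. This is an illustrative approximation, not Weaviate's code: the smoothing constant k=60 in the ranked variant is an assumption borrowed from common reciprocal-rank-fusion practice, and the sample scores are made up.

```python
def ranked_fusion(vector_ids, keyword_ids, alpha, k=60):
    """Sum alpha-weighted reciprocal ranks; k is an assumed smoothing constant."""
    fused = {}
    for weight, ranking in ((alpha, vector_ids), (1 - alpha, keyword_ids)):
        for rank, obj_id in enumerate(ranking):
            fused[obj_id] = fused.get(obj_id, 0.0) + weight / (k + rank)
    return sorted(fused, key=fused.get, reverse=True)

def relative_score_fusion(vector_scores, keyword_scores, alpha):
    """Min-max normalise each score set to [0, 1], then take the weighted sum."""
    def normalise(scores):
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0  # avoid dividing by zero when all scores match
        return {obj_id: (s - lo) / span for obj_id, s in scores.items()}

    fused = {}
    weighted = ((alpha, normalise(vector_scores)), (1 - alpha, normalise(keyword_scores)))
    for weight, scores in weighted:
        for obj_id, s in scores.items():
            fused[obj_id] = fused.get(obj_id, 0.0) + weight * s
    return sorted(fused, key=fused.get, reverse=True)

vector_hits = {"a": 0.92, "b": 0.90, "c": 0.40}  # similarity, higher is better
keyword_hits = {"c": 7.1, "b": 6.8, "d": 1.0}     # BM25 score, higher is better

print(ranked_fusion(["a", "b", "c"], ["c", "b", "d"], alpha=0.75))
print(relative_score_fusion(vector_hits, keyword_hits, alpha=0.75))
```

The two methods can disagree on the same inputs: the ranked variant sees only positions, while the relative-score variant also takes into account how far apart the raw scores are.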

multi-vector embeddings and MUVERA

Version 1.31 (June 2025) added support for multi-vector embeddings using MUVERA encoding. Multi-vector models such as ColBERT represent each document as a set of token-level vectors rather than one pooled vector, which can improve retrieval accuracy at the cost of storage. MUVERA provides a fixed-dimension single-vector representation that approximates the late-interaction scoring used by ColBERT, allowing Weaviate to plug multi-vector models into its standard HNSW pipeline without exotic data structures [20].
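
For context, the late-interaction score that MUVERA approximates is commonly computed as a sum of per-token maxima (often called MaxSim). The sketch below shows only that reference score, not the MUVERA encoding itself; the two-dimensional token vectors are invented for the example.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def max_sim(query_tokens, doc_tokens):
    """Late-interaction score: for every query token vector take its best
    match among the document's token vectors, then sum those maxima."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)

query = [(1.0, 0.0), (0.0, 1.0)]
doc_a = [(0.9, 0.1), (0.1, 0.8), (0.5, 0.5)]  # three token vectors
doc_b = [(0.7, 0.7)]                          # a single pooled-style vector

print(max_sim(query, doc_a), max_sim(query, doc_b))
```

Because every query token is compared with every document token, the exact score is expensive to compute at scale, which is the motivation for a fixed-dimension approximation that fits a standard single-vector index.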

replication and consistency

From version 1.25 onward, Weaviate uses the Raft consensus algorithm for cluster metadata, including schema operations, role-based access control changes, and tenant lifecycle events. Raft is implemented using the HashiCorp Raft library; one node is elected leader, and metadata changes are committed once a quorum of nodes acknowledges them. This was a significant change from earlier versions, which had used a simpler gossip-style protocol that could not safely support concurrent schema operations [17].

Object data replication uses a leaderless, eventually consistent model with tunable consistency levels (ONE, QUORUM, ALL) at read and write time. Async replication, asynchronous shard replica movement between nodes, and rebalancing capabilities have all been added across the 1.25 through 1.32 release series, gradually maturing Weaviate into a distributed system suitable for production workloads at scale [17][22].
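
The practical meaning of the tunable levels can be worked out with a little arithmetic. The sketch below assumes the usual majority definition of QUORUM (floor(n/2) + 1) and the standard overlap condition r + w > n; the function names are illustrative.

```python
def required_acks(level, replication_factor):
    """Number of replica acknowledgements needed for a consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unknown consistency level: {level}")

def read_sees_latest_write(write_level, read_level, replication_factor):
    """True when every read replica set overlaps every write replica set."""
    w = required_acks(write_level, replication_factor)
    r = required_acks(read_level, replication_factor)
    return r + w > replication_factor

for write, read in [("QUORUM", "QUORUM"), ("ONE", "ALL"), ("ONE", "ONE")]:
    print(write, read, read_sees_latest_write(write, read, replication_factor=3))
```

With three replicas, QUORUM writes paired with QUORUM reads always overlap in at least one node, whereas ONE paired with ONE can return stale data until replication catches up.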

module system

Weaviate's module system is one of its defining architectural features. Modules extend the database with additional capabilities at various stages of the data pipeline: vectorization (generating embeddings from raw data), generative AI (producing text from search results), reranking (refining result order), and reference resolution (loading data from external sources).

Modules are loaded at server startup via configuration. In Docker deployments, modules are specified as environment variables. In Kubernetes, they are configured in the Helm chart values. Weaviate Cloud instances come with a default set of modules pre-configured [29].

vectorizer modules

Vectorizer modules automatically generate embeddings when objects are inserted into Weaviate. Instead of computing embeddings externally and uploading them, users configure a vectorizer module on a collection and Weaviate handles embedding generation transparently.

Module	Provider	Model type	Key notes
text2vec-openai	OpenAI	Text	Uses OpenAI's embedding API (text-embedding-ada-002, text-embedding-3-small, etc.)
text2vec-cohere	Cohere	Text	Uses Cohere's multilingual embedding models
text2vec-huggingface	Hugging Face	Text	Uses Hugging Face Inference API for hosted models
text2vec-transformers	Self-hosted	Text	Runs transformer models locally in a sidecar container
text2vec-ollama	Ollama	Text	Runs local embedding models through an Ollama instance
text2vec-google	Google	Text	Uses Google's embedding models (Vertex AI, PaLM)
text2vec-aws	AWS	Text	Uses Amazon Bedrock embedding models
text2vec-jinaai	Jina AI	Text	Uses Jina AI's embedding models
text2vec-voyageai	Voyage AI	Text	Uses Voyage AI's domain-specific embeddings
text2vec-snowflake	Snowflake Arctic	Text	Uses Snowflake's open-source Arctic embedding family
multi2vec-clip	OpenAI CLIP	Multi-modal	Embeds both text and images into a shared vector space
multi2vec-bind	ImageBind	Multi-modal	Meta's ImageBind for text, image, audio, and video embeddings
multi2vec-cohere	Cohere	Multi-modal	Cohere Embed v4 multimodal text and image embeddings

Users can also import pre-computed vector embeddings directly, bypassing the module system entirely. This is the recommended path for users who already maintain a centralized embedding pipeline outside the database [29].

generative modules

Generative modules connect Weaviate to large language models for retrieval-augmented generation (RAG) directly within the database.

Module	Provider	Description
generative-openai	OpenAI	Uses GPT-4, GPT-3.5-turbo, and newer OpenAI chat models
generative-anthropic	Anthropic	Uses Claude family models
generative-cohere	Cohere	Uses Cohere's Command models
generative-aws	AWS	Uses Amazon Bedrock models (Claude, Titan, Llama)
generative-google	Google	Uses Google's Gemini and PaLM models
generative-anyscale	Anyscale	Uses open-source models hosted on Anyscale
generative-mistral	Mistral	Uses Mistral chat and instruct models
generative-friendliai	FriendliAI	Hosted open-source LLM endpoints
generative-contextual	Contextual AI	Generation tuned for grounded enterprise RAG

A generative search query in Weaviate consists of two parts: a search query (vector, keyword, or hybrid) and a prompt for the language model. Weaviate first retrieves relevant objects, then passes both the search results and the prompt to the configured generative model, returning the generated response alongside the original search results [30].

reranker modules

Reranker modules apply a second-pass ranking model after the initial retrieval step. Cross-encoder reranking models score each retrieved result against the original query, which typically produces more accurate relevance judgments than the initial bi-encoder similarity search. Available reranker modules include reranker-cohere, reranker-voyageai, reranker-jinaai, reranker-contextual, and reranker-transformers [29].

schema and collection definition

Weaviate organizes data into collections (previously called "classes"). Each collection defines a schema that specifies the properties of its objects, the vectorizer to use, the generative module (if any), and index configuration.

A collection definition includes:

Properties: Named fields with data types (text, int, number, boolean, date, object, geoCoordinates, phoneNumber, blob, uuid, and arrays of these types).
Vectorizer configuration: Which vectorizer module to use and which properties to vectorize.
Generative module configuration: Which LLM module to use for RAG queries.
Vector index configuration: HNSW or flat index, with tuning parameters and quantization options.
Inverted index configuration: Settings for keyword search, including BM25 parameters and stopword configuration.
Replication configuration: Number of replicas for the collection.
Multi-tenancy configuration: Whether to enable tenant isolation.
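
The parts listed above can be pictured as one nested definition. The following is a hedged sketch expressed as a Python dictionary: the key names are modelled on the JSON form of a collection definition and should be treated as assumptions, and the collection name, properties, and values are illustrative only.

```python
import json

article_collection = {
    "class": "Article",
    "properties": [
        {"name": "title", "dataType": ["text"]},
        {"name": "wordCount", "dataType": ["int"]},
        {"name": "publishedAt", "dataType": ["date"]},
    ],
    "vectorizer": "text2vec-openai",            # vectorizer configuration
    "moduleConfig": {"generative-openai": {}},  # generative module configuration
    "vectorIndexType": "hnsw",                  # vector index configuration
    "invertedIndexConfig": {                    # keyword search settings
        "bm25": {"k1": 1.2, "b": 0.75},
        "stopwords": {"preset": "en"},
    },
    "replicationConfig": {"factor": 3},
    "multiTenancyConfig": {"enabled": True},
}

def property_types(collection):
    """Map each property name to its declared data type."""
    return {p["name"]: p["dataType"][0] for p in collection["properties"]}

print(json.dumps(article_collection, indent=2))
print(property_types(article_collection))
```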

Weaviate also supports auto-schema, which is enabled by default. When auto-schema is active, Weaviate infers collection definitions from the data being inserted, automatically detecting property names and types. This is convenient for prototyping but should be disabled in production for predictable behavior [31].

Cross-references allow linking objects across collections. For example, an Article collection might reference objects in an Author collection, enabling graph-like traversals in queries.

Version 1.32 added collection aliases, which let operators point an alias name at different underlying collections. This makes zero-downtime migrations practical: a new collection is built behind a fresh name, the alias is swapped to it once the data is ready, and the old collection can be dropped without changing client code [22].
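
The migration pattern can be modelled with a toy registry. `AliasRegistry` and its methods are hypothetical and exist only to show the order of operations; they are not part of any Weaviate client.

```python
class AliasRegistry:
    """Toy stand-in for alias resolution; not the Weaviate client API."""

    def __init__(self):
        self.collections = {}
        self.aliases = {}

    def create_collection(self, name, objects):
        self.collections[name] = list(objects)

    def point_alias(self, alias, target):
        if target not in self.collections:
            raise KeyError(target)
        self.aliases[alias] = target

    def query(self, name):
        # Clients always use the alias; resolution happens on the server side.
        return self.collections[self.aliases.get(name, name)]

    def drop_collection(self, name):
        if name in self.aliases.values():
            raise RuntimeError(f"{name} is still the target of an alias")
        del self.collections[name]

db = AliasRegistry()
db.create_collection("Articles_v1", ["old embedding"])
db.point_alias("Articles", "Articles_v1")

db.create_collection("Articles_v2", ["new embedding"])  # built in the background
db.point_alias("Articles", "Articles_v2")              # single switch-over step
db.drop_collection("Articles_v1")                      # safe once nothing points at it

print(db.query("Articles"))
```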

key features

multi-tenancy

Weaviate implements native multi-tenancy with one shard per tenant, dynamic resource management, and true data isolation. This design means each tenant's data is physically separated, not just logically filtered. Recent updates introduced tenant offloading to S3 cloud storage and renamed the tenant activity statuses from HOT/COLD to ACTIVE/INACTIVE for clarity [32].

Multi-tenancy is useful for SaaS applications where each customer needs their own isolated dataset but the application operator does not want to manage separate database instances for each customer.

Weaviate's multi-tenancy implementation supports three tenant states:

State	Data location	Cost	Access speed
ACTIVE (formerly HOT)	In memory and on local disk	Highest	Fastest
INACTIVE (formerly COLD)	On local disk only	Medium	Moderate (requires activation)
OFFLOADED	In S3 cloud storage	Lowest	Slowest (requires reactivation to local disk)

This tiered approach allows operators to manage costs by offloading inactive tenants to cheaper storage while keeping frequently accessed tenants in memory for low-latency queries. Reference customers such as Instabase operate clusters with more than 50,000 tenants, illustrating that the design holds up at significant scale [32][33].

GraphQL, REST, and gRPC APIs

Weaviate exposes three APIs: REST, GraphQL, and gRPC. The GraphQL API is particularly distinctive among vector databases, since most competitors offer only REST or gRPC interfaces. GraphQL allows clients to specify exactly which fields they need, reducing over-fetching and making it easier to build flexible query interfaces [34].

A typical Weaviate GraphQL query can combine vector search, keyword filters, and property selection in a single request:

{
  Get {
    Article(
      hybrid: {
        query: "machine learning applications"
        alpha: 0.75
      }
      limit: 5
    ) {
      title
      content
      _additional {
        score
        distance
      }
    }
  }
}

In this query, alpha: 0.75 weights the vector component more heavily than the keyword component; 1.0 would be pure vector search and 0.0 pure keyword search.

The gRPC interface, introduced incrementally between version 1.19 and version 1.23.7, is the preferred protocol for high-throughput workloads. Internal benchmarks comparing the gRPC and REST/GraphQL paths on identical hardware showed query latency reductions of 40 to 70 percent and import-time reductions of close to 50 percent on the DBPedia benchmark dataset [35]. The gRPC API uses HTTP/2 and Protocol Buffers, which produce smaller, faster-to-parse payloads than JSON over HTTP/1.1.

A short example using the v4 Python client illustrates a hybrid query:

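import weaviate
from weaviate.classes.query import HybridFusion, MetadataQuery

# Connect to a Weaviate instance running locally on the default ports.
client = weaviate.connect_to_local()
try:
    articles = client.collections.get("Article")

    results = articles.query.hybrid(
        query="vector databases for RAG",
        alpha=0.75,  # 1.0 = pure vector search, 0.0 = pure keyword search
        fusion_type=HybridFusion.RELATIVE_SCORE,
        limit=5,
        # Scores are only populated when requested explicitly.
        return_metadata=MetadataQuery(score=True),
    )

    for obj in results.objects:
        print(obj.properties["title"], obj.metadata.score)
finally:
    # Release the connection even if the query raises.
    client.close()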

generative search (RAG)

Weaviate includes built-in generative search capabilities, allowing users to perform retrieval-augmented generation directly within the database. After retrieving relevant objects through vector or hybrid search, Weaviate can pass those results to a configured language model (such as OpenAI's GPT-4, Anthropic's Claude, or Cohere's models) to generate answers, summaries, or transformations. This eliminates the need for a separate orchestration layer for basic RAG use cases [30].

Generative search supports two modes:

Single-prompt: A prompt is applied to each individual search result. For example, summarizing each retrieved document.
Grouped-task: All search results are passed together to the model in a single prompt. For example, generating a comprehensive answer that synthesizes information from multiple retrieved documents.
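
The difference between the two modes comes down to how many model calls are made and what each prompt contains. The sketch below uses a hypothetical `fake_llm` stand-in instead of a real generative module, and the prompt layouts are assumptions for illustration.

```python
def fake_llm(prompt):
    """Hypothetical stand-in for a generative module call."""
    return f"<answer to {len(prompt)} chars of prompt>"

def single_prompt(results, template, llm=fake_llm):
    """One model call per retrieved object; the template is filled per object."""
    return [llm(template.format(**obj)) for obj in results]

def grouped_task(results, task, llm=fake_llm):
    """One model call in total; all retrieved objects share a single prompt."""
    context = "\n".join(f"- {obj['title']}: {obj['body']}" for obj in results)
    return llm(f"{task}\n\nContext:\n{context}")

results = [
    {"title": "HNSW", "body": "A graph-based vector index."},
    {"title": "BM25F", "body": "A field-weighted keyword scorer."},
]

per_object = single_prompt(results, "Summarise in one sentence: {body}")
combined = grouped_task(results, "Compare the two indexing approaches.")
print(len(per_object), combined)
```

Single-prompt mode scales its model calls with the number of results, while grouped-task mode makes one call whose prompt grows with the result set, so the context window of the chosen model becomes the limiting factor.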

reranking

Reranking is exposed through the reranker modules described above. After the initial retrieval step, a cross-encoder scores each retrieved result against the original query, which typically produces more accurate relevance judgments than the initial bi-encoder similarity search.

named vectors

Weaviate supports multiple named vectors per object, allowing a single object to have separate embeddings for different properties or different embedding models. For example, a product listing might have one vector for its title (embedded with a lightweight model) and another for its description (embedded with a more powerful model). Queries can target specific named vectors, enabling flexible multi-modal and multi-representation search strategies.
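
Conceptually, each object carries a small map from vector name to embedding, and a query selects one of those spaces. The sketch below is a pure-Python model of that idea, not the Weaviate client API; `near_vector`, `target_vector`, and the sample products are names chosen for the example.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

products = [
    {"name": "trail shoe", "vectors": {"title": (0.9, 0.1), "description": (0.2, 0.8)}},
    {"name": "rain jacket", "vectors": {"title": (0.1, 0.9), "description": (0.7, 0.3)}},
]

def near_vector(objects, query, target_vector, limit=1):
    """Rank objects by similarity in the embedding space selected by name."""
    ranked = sorted(
        objects,
        key=lambda o: cosine(o["vectors"][target_vector], query),
        reverse=True,
    )
    return [o["name"] for o in ranked[:limit]]

query = (1.0, 0.0)
print(near_vector(products, query, target_vector="title"))
print(near_vector(products, query, target_vector="description"))
```

The same query vector yields different winners depending on the targeted space. In a real deployment the query would need to be embedded with the same model as the named vector it targets, and the spaces may have different dimensionality.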

authentication, RBAC, and security

Weaviate supports API-key, OIDC, and anonymous authentication modes, configured via environment variables on each node. From the 1.29/1.30 release series, the database includes a built-in role-based access control (RBAC) system, with predefined roles for root and viewer plus user-defined roles that pin down permissions at the level of individual collections, tenants, and operations [36].

In version 1.34, observability capabilities expanded with more than 30 new monitoring metrics, covering LSM bucket reads and writes, write-ahead log recovery, memtable flushing, and asynchronous replication. These metrics are exposed in Prometheus format and feed into Weaviate Cloud's standard monitoring dashboards [37].

Weaviate Cloud and deployment options

Weaviate offers several ways to run the database, from fully self-managed to fully hosted. The core engine is identical across deployment modes [38][39].

Deployment	Operator	Notes
Self-hosted (Docker)	User	Single-node or small clusters; ideal for development
Self-hosted (Kubernetes)	User	Helm chart; the recommended path for production self-hosting
Embedded Weaviate	User	Library mode that starts a local Weaviate process from inside Python or Node.js scripts
Weaviate Cloud (Serverless)	Weaviate	Multi-tenant shared cloud, billed by usage
Weaviate Cloud (Enterprise)	Weaviate	Dedicated single-tenant clusters with SLA, HIPAA on AWS
BYOC (Bring Your Own Cloud)	Joint	Database runs in customer's VPC; Weaviate manages the control plane
AWS Marketplace	User or Joint	Container deployment inside customer AWS account

Weaviate restructured its pricing in October 2025, replacing earlier per-dimension billing with three usage-based dimensions (data, performance, and AI features) plus a $45/month minimum to cover baseline cluster costs. The Cloud product is offered in tiered plans called Flex, Plus, and Premium [40].

Plan	Starting price	Description
Sandbox (Free)	$0	14-day trial clusters for experimentation
Flex	$45/month minimum	Shared infrastructure, billed by usage
Plus	$280/month	Higher resource ceiling on shared infrastructure
Premium	$400+/month	Dedicated resources and priority support
Enterprise Cloud	Custom	Single-tenant clusters with SLA, HIPAA, BYOC options
BYOC	From ~$1,390/month	Database runs in customer's cloud, Weaviate operates the control plane

The BYOC offering is targeted at enterprises with data-residency, regulatory, or networking constraints. The cluster runs inside the customer's AWS, GCP, or Azure VPC on managed Kubernetes, while Weaviate handles application-level security, configuration, upgrades, patches, and 24/7 monitoring. This design separates the control plane (Weaviate's responsibility) from the data plane (customer's environment) [38].

Weaviate Cloud also runs HIPAA-eligible Enterprise Cloud workloads on AWS, with parallel work in progress for Azure and GCP. The same engine powers integrations with the AWS Marketplace, where users can deploy Weaviate as a containerized cluster inside their AWS tenant.

Weaviate Embeddings

Weaviate Embeddings, launched in December 2024, is a managed embedding service that runs inside Weaviate Cloud. The service offers GPU-backed inference for both open-source and proprietary models, removing the need for users to call external APIs or operate their own embedding infrastructure [41][42].

The service launched with two Snowflake Arctic models, snowflake-arctic-embed-m-v1.5 (English-only, 512 tokens) and snowflake-arctic-embed-l-v2.0 (multilingual, 8192 tokens). Pricing is consumption-based, and the service has no hard cap on embeddings per second, making it suitable for production-scale ingestion. Additional models and modalities have been added on a rolling basis since 2025 [41][42].

Weaviate Agents

In March 2025, Weaviate introduced a new product line called Weaviate Agents: pre-built agentic services that ship with Weaviate Cloud. Three agents have been released [43][44].

Agent	First preview	Status	Purpose
Query Agent	March 2025	GA September 2025	Natural-language access to Weaviate collections, including multi-stage queries and aggregations
Transformation Agent	March 2025	Preview	Uses an LLM to rewrite, enrich, or augment objects already in a collection based on natural-language instructions
Personalization Agent	April 2025	Preview	Re-ranks results for a user based on a stored persona and prior interactions

The Query Agent reads a Weaviate cluster's schema, decides which collections to query, generates the appropriate vector or aggregation query, and returns a natural-language answer with citations to the underlying objects. It became generally available in September 2025 for Serverless customers [43][45].

In February 2026, Weaviate also announced Agent Skills, a developer toolkit that helps coding agents (such as those built on the Model Context Protocol) understand Weaviate APIs and write correct queries against them [46].

Verba

Verba is Weaviate's open-source RAG reference application. The repository, hosted on GitHub at weaviate/Verba, provides a modular chatbot interface that ingests documents from sources such as PDFs (via UnstructuredIO), GitHub repositories, and Markdown files, then answers questions using Weaviate as the retrieval back-end and a configurable LLM as the generation step. As of early 2026, the project had accumulated over 7,000 stars and was widely used as a starting point for self-hosted RAG demos and internal tools [47][48].

Verba supports four deployment modes: a fully embedded mode using Weaviate Embedded, a Docker mode, a connection to an existing Weaviate Cloud cluster, and a custom URL mode pointing at any reachable Weaviate instance. The default model stack is intentionally swappable: SentenceTransformers for embeddings, OpenAI or Ollama for generation, and either Weaviate Cloud or local Weaviate for storage.

release history

Weaviate ships approximately one minor release every one to two months. The table below highlights selected releases that introduced major capabilities. Weaviate's stated support policy is to maintain bug fixes and security patches for the latest three minor versions [9][20][22][23][37].

Version	Release date	Highlights
1.0.0	14 January 2021	Initial GA, modular API, support for importing pre-computed vectors
1.2.0	2021	text2vec-transformers module
1.5.0	2021	LSM-tree storage, auto-schema
1.8.0	2021	Horizontal scalability, sharding, pagination
1.18.0	2023	Hybrid search GA, multi-tenancy preview
1.19.0	2023	gRPC interface (initial), generative-cohere
1.22.0	2023	Multi-tenancy GA, named vectors preview
1.23.0	2024	gRPC GA in v1.23.7, dynamic ef tuning
1.24.0	2024	Named vectors GA, partial flat-index support
1.25.0	2024	Raft consensus for cluster metadata
1.26.0	2024	Async replication, RBAC preview
1.27.0	2024	Multi-vector preview, ColBERT support
1.28.0	2024	Snowflake Arctic embeddings, query speed-ups
1.29.0	February 2025	RBAC GA, Weaviate Embeddings service integration
1.30.0	April 2025	Runtime configuration, dynamic user management, BlockMax WAND BM25
1.31.0	June 2025	MUVERA multi-vector encoding, HNSW snapshots
1.32.0	July 2025	Collection aliases, Rotational Quantization (RQ), shard replica movement GA
1.33.0	October 2025	1-bit RQ preview, ContainsNone and Not filter operators
1.34.0	November 2025	Flat-index plus RQ, expanded observability metrics, Contextual AI integration
1.35.0	December 2025	Stability and storage improvements
1.36.0	February 2026	Latest stable release for new clusters
1.37.0	April 2026	Secure MCP server, extensible tokenizers, incremental backups
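
The support policy stated above (fixes for the latest three minor versions) can be illustrated with a small helper. This is a minimal Python sketch, not part of any Weaviate tooling: the function name `supported_minors` and its `window` parameter are hypothetical, and the version list is copied from the last rows of the table.

```python
from typing import Iterable, List


def supported_minors(versions: Iterable[str], window: int = 3) -> List[str]:
    """Return the newest `window` minor release lines, newest first."""
    # Reduce "1.37.0" to the (major, minor) pair (1, 37) and drop duplicates.
    minors = {tuple(int(part) for part in v.split(".")[:2]) for v in versions}
    # Tuples compare numerically, so (1, 10) correctly sorts above (1, 9).
    newest_first = sorted(minors, reverse=True)
    return [f"{major}.{minor}" for major, minor in newest_first[:window]]


# Versions taken from the release table above.
releases = ["1.33.0", "1.34.0", "1.35.0", "1.36.0", "1.37.0"]
print(supported_minors(releases))  # ['1.37', '1.36', '1.35']
```

Under this reading of the policy, a cluster still on 1.34 would fall outside the patch window once 1.37 is released.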

performance benchmarks

Weaviate publishes benchmark results for its search modes. In 2025, the team benchmarked hybrid search against pure vector search and pure keyword search across 12 information retrieval datasets from BEIR, LoTTE, and BRIGHT, as well as WixQA and EnronQA.

Key findings from these benchmarks include:

Search mode	Relative performance	Notes
Pure vector search	Baseline	Uses dense embeddings only
Pure BM25 keyword search	Varies by dataset	Better for exact keyword matching; weaker for semantic queries
Hybrid search (BM25 + vector)	+42% NDCG@10 vs pure vector	Combining both methods yields the strongest results across most datasets
Search Mode (auto-optimized)	Best overall	Weaviate's automatic search mode selection further improves results [49]
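
The NDCG@10 metric used in the table rewards rankings that place the most relevant documents near the top of the first ten results. The sketch below shows the standard definition with linear gain and a log2 rank discount. It is an illustration only and is not taken from Weaviate's benchmark code: for simplicity the ideal ordering is derived from the same result list rather than from the full set of judged documents, and the two scores passed to `relative_gain` are made-up numbers chosen to show what a 42% relative improvement means.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """Discounted cumulative gain over the first k results (linear gain)."""
    # Rank 1 is divided by log2(2) = 1, rank 2 by log2(3), and so on.
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """DCG divided by the DCG of the same labels sorted best-first."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


def relative_gain(baseline: float, candidate: float) -> float:
    """Relative improvement of candidate over baseline, e.g. 0.42 for +42%."""
    return (candidate - baseline) / baseline


# Graded relevance labels of one result list, in ranked order (hypothetical).
print(round(ndcg_at_k([3, 2, 0, 1]), 3))
# Hypothetical NDCG@10 scores for pure vector search and hybrid search.
print(f"{relative_gain(0.50, 0.71):+.0%}")  # +42%
```

A relative gain of 42% therefore means the hybrid score is 1.42 times the pure vector score, not that it is 42 percentage points higher.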

In terms of raw throughput, optimized Weaviate deployments have been benchmarked at approximately 10,000 to 15,000 queries per second for pure vector search workloads. Latency numbers depend heavily on dataset size, dimensionality, and hardware configuration [50]. For ingestion, the move from REST to gRPC roughly halved the time to insert the 1 million records of the DBpedia benchmark dataset, from over 42 minutes to about 23 minutes on the same hardware [35].
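
The ingestion figures above translate into per-second rates with simple arithmetic. The following sketch only restates the reported numbers; the helper name `objects_per_second` is hypothetical, and because the REST run is described as taking "over 42 minutes", the REST rate computed here is an upper bound.

```python
def objects_per_second(records: int, minutes: float) -> float:
    """Average insert rate for a bulk import that took `minutes` minutes."""
    return records / (minutes * 60)


RECORDS = 1_000_000  # size of the DBpedia benchmark import cited above

rest_rate = objects_per_second(RECORDS, 42)  # about 397 objects/s (upper bound)
grpc_rate = objects_per_second(RECORDS, 23)  # about 725 objects/s
speedup = grpc_rate / rest_rate              # about 1.83x

print(f"REST: {rest_rate:.0f}/s, gRPC: {grpc_rate:.0f}/s, speedup: {speedup:.2f}x")
```

A speedup of roughly 1.8x is consistent with the description of the import time being "roughly halved".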

comparison with other vector databases

Weaviate is most often compared with Pinecone, Qdrant, Chroma, and Milvus. The competitive picture is shaped by license model (open source versus proprietary), deployment options, hybrid search capabilities, and the level of integration with embedding and generative providers [50][51].

versus Pinecone

Feature	Weaviate	Pinecone
Source model	Open-source (BSD-3-Clause)	Closed-source, managed SaaS
Self-hosting	Yes (Docker, Kubernetes, BYOC)	No (cloud-only, or BYOC on Dedicated plan)
Hybrid search	Native BM25 + vector in one API call	Requires separate sparse vector index
API style	REST, gRPC, GraphQL	REST, gRPC
Auto-vectorization	Yes, via modules and Weaviate Embeddings	Yes, via Pinecone Inference
Multi-tenancy	Native, one shard per tenant	Via namespaces
Generative search	Built-in RAG modules	Via Pinecone Assistant
Primary language	Go	Not disclosed (managed service)

Pinecone tends to win on setup simplicity and consistent low-latency performance for pure vector search workloads. Weaviate tends to win on flexibility, hybrid search quality, and cost control through self-hosting options. For workloads where hybrid search is central to the application, Weaviate's native BM25 + vector fusion in a single API call is more efficient than Pinecone's approach of maintaining separate sparse and dense indexes [50].
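
To make the idea of fusing BM25 and vector results concrete, the sketch below combines two score lists with min-max normalization and a weighting parameter `alpha`. It is an illustrative, assumption-laden example in plain Python, not Weaviate's implementation: the function names and sample scores are hypothetical, vector scores are assumed to be similarities (higher is better), and the convention that `alpha = 1` means pure vector search and `alpha = 0` means pure keyword search follows Weaviate's hybrid search parameter of the same name.

```python
from typing import Dict


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so BM25 and vector scores become comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse(bm25: Dict[str, float], vector: Dict[str, float], alpha: float = 0.5) -> Dict[str, float]:
    """Weighted sum of normalized scores, returned best-first."""
    kw, vec = min_max(bm25), min_max(vector)
    fused = {
        # A document missing from one result list contributes 0 for that list.
        doc: (1 - alpha) * kw.get(doc, 0.0) + alpha * vec.get(doc, 0.0)
        for doc in set(kw) | set(vec)
    }
    return dict(sorted(fused.items(), key=lambda item: item[1], reverse=True))


# Made-up scores for four documents.
bm25_scores = {"a": 12.0, "b": 3.0, "c": 0.5}
vector_scores = {"b": 0.91, "c": 0.88, "d": 0.40}

ranked = fuse(bm25_scores, vector_scores, alpha=0.5)
print(list(ranked))  # ['b', 'a', 'c', 'd']
```

Document "b" ranks first because it scores reasonably in both lists, whereas "a" is found only by keyword search and "d" only by vector search.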

Benchmarks published in 2025 report optimized Weaviate deployments at 10,000 to 15,000 queries per second, while Pinecone's Dedicated Read Nodes reached 5,700 QPS with a P99 latency of 60 ms on 1.4 billion vectors. These figures are not directly comparable, because hardware, dataset sizes, and query patterns differ across the benchmarks [50].

versus Qdrant

Feature	Weaviate	Qdrant
Source model	Open-source (BSD-3-Clause)	Open-source (Apache 2.0)
Primary language	Go	Rust
Hybrid search	Native (BM25 + vector)	Native (sparse + dense)
GraphQL	Yes	No
Modular embedding/generation	Extensive module catalog	Smaller, focused integration list
Sharding model	One shard per tenant or hash-based	Hash-based
Filtering optimization	Inverted index plus dynamic filter strategy	Custom filterable HNSW

Qdrant emphasizes raw performance, especially around payload-heavy filtered search, and is often praised for the low memory footprint of its Rust implementation. Weaviate emphasizes a richer feature surface (modules, agents, Weaviate Embeddings, generative search). Both are viable open-source options at production scale, and many teams pick between them based on whether they prefer Go or Rust on their infrastructure team [50][51].
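
The two sharding models named in the table can be contrasted with a short routing sketch. This is a conceptual illustration only: the function names are hypothetical, and SHA-256 is used as a stand-in hash, which is not the hash function of either Weaviate or Qdrant.

```python
import hashlib


def hash_based_shard(object_id: str, shard_count: int) -> int:
    """Hash-based routing: an object's ID alone determines its shard."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    digest = hashlib.sha256(object_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and map it onto the shards.
    return int.from_bytes(digest[:8], "big") % shard_count


def tenant_shard(tenant_name: str) -> str:
    """Tenant-based routing: every tenant owns exactly one dedicated shard."""
    return f"tenant-{tenant_name}"


# Hash-based: objects spread across a fixed number of shards.
print(hash_based_shard("object-123", shard_count=4))
# Tenant-based: all objects of one tenant live together and stay isolated.
print(tenant_shard("acme"))  # tenant-acme
```

With tenant-based routing, a query for one tenant touches a single shard, which is the property that lets a cluster host many small, isolated tenants.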

versus Milvus

Milvus, maintained by Zilliz under the Apache 2.0 license, is another open-source vector database focused on raw performance and a broad menu of index types (HNSW, IVF, DiskANN, GPU indexes). Compared with Milvus, Weaviate ships with a thinner index menu (HNSW and flat) but a richer module catalog and integrated generative search. Milvus tends to perform better on extremely large pure-vector workloads with custom index choices, while Weaviate tends to be easier to integrate with embedding and LLM providers out of the box [51].

versus Chroma and pgvector

Chroma is a developer-friendly embedding database aimed mostly at prototyping; it does not target the same scale or feature set as Weaviate. pgvector is a PostgreSQL extension that adds vector indexing to an existing relational database; it is a strong choice when an application already uses PostgreSQL and needs only modest vector-search functionality. Weaviate occupies a middle ground: more capable than Chroma, more specialized than pgvector, and open-source unlike Pinecone [51].

customers and case studies

Weaviate publishes case studies on its website covering production deployments. Notable users include [33][37][52][53]:

Company	Use case	Notes
Stack Overflow	Hybrid search for OverflowAI features on Stack Overflow's question and answer corpus	Required an open-source vector database that could run on existing Azure infrastructure
Instabase	Document intelligence for mortgage, insurance, and other regulated industries	Processes more than 500,000 documents per day across more than 50,000 tenants in a single Weaviate cluster, ingesting more than 450 distinct data types
Stack AI	Enterprise no-code AI platform with retrieval-augmented assistants	Highlighted in Weaviate's case study library
Morningstar	Financial research and analyst tooling	Cited as a customer in Weaviate's commercial materials
Cisco	Internal knowledge and search applications	Cited as a customer
Bunq	European mobile-first bank using Weaviate for AI-driven search	Cited as a customer
Red Hat	Internal search and document retrieval	Cited as a customer in joint AWS materials
NetApp	Data infrastructure for enterprise AI workloads	Cited as a customer in joint AWS materials
MetaBuddy	AI coaching platform	Reported a 60 percent reduction in trainer analysis time and triple the user engagement after migrating to Weaviate

In April 2025, Weaviate was named an AWS Partner of the Year in the EMEA region for its work in generative AI infrastructure on AWS Marketplace [54].

ecosystem and community

The Weaviate open-source project has accumulated over 16,000 GitHub stars and more than 5 million container downloads. The company reports a community of more than 50,000 AI builders and over 2,000 production users [4][5][16].

Weaviate provides official client libraries for Python, JavaScript/TypeScript, Go, Java, and C#/.NET. The Python client is the most popular, with active development tracked in its changelog [55].

Client library	Package	Status
Python	`weaviate-client` (PyPI)	Most popular; supports sync and async operations on top of gRPC
JavaScript/TypeScript	`weaviate-client` (npm)	Used in Node.js and browser applications
Go	`weaviate-go-client`	Native Go client for Go applications
Java	`io.weaviate:client` (Maven)	Java client for JVM-based applications
C#/.NET	`Weaviate.Client` (NuGet)	Official client added in 2024

Integrations exist with LangChain, LlamaIndex, Semantic Kernel, Haystack, Dify, Embedchain, and other AI orchestration frameworks. These integrations allow developers to use Weaviate as the vector store component in larger machine learning pipelines.

The Weaviate community includes an active forum, a Slack workspace with thousands of members, and regular community calls. The company reports more than 200 messages per week across its open Slack and Discourse forum. The Weaviate Hero program recognizes members who contribute through speaking, blogging, code, and community moderation [56][57].

The company also runs Weaviate Academy, a free educational platform with structured courses on using the database, from zero-to-MVP guides to advanced performance optimization tutorials, and the Weaviate World Tour series of in-person workshops, conferences, and hackathons across Europe and North America [29][57].

current state (2025-2026)

As of May 2026, Weaviate continues to be developed as both an open-source project and a commercial cloud service. The company has focused on several areas: improving multi-tenancy efficiency with features like tenant offloading to S3, GPU-accelerated index building through the NVIDIA partnership, native multi-vector embeddings with MUVERA, more efficient quantization through RQ, and the rollout of agentic services such as the Query Agent, Transformation Agent, and Personalization Agent [22][24][32][43].

The vector database market has become increasingly competitive, with Weaviate, Qdrant, Milvus, Chroma, and Pinecone all vying for developer adoption, and incumbents such as Oracle, Microsoft Azure (with Azure AI Search), MongoDB, and PostgreSQL (via pgvector) adding vector capabilities to their existing databases. Weaviate differentiates itself through its combination of open-source availability, native hybrid search, automatic vectorization modules, and the GraphQL API. The open-core business model (free self-hosted, paid cloud) gives it flexibility that purely managed competitors like Pinecone cannot match, while the managed cloud option reduces friction for teams that do not want to operate their own infrastructure [50][51].

Weaviate's stated direction for 2026 emphasizes deeper agentic capabilities, including reasoning workflows, broader multimodal support, model evaluation tools, and shared memory across agents. CEO Bob van Luijt has described "agentic architectures" and "generative feedback loops" as the next inflection point in data management, comparable in his view to the arrival of the public cloud or the modern web [58][59].

Weaviate's focus on developer experience, with features like built-in generative search, automatic vectorization, integrated agents, and BYOC deployments, reflects a broader trend in the vector database space toward reducing the amount of glue code required to build AI applications. Rather than forcing developers to stitch together separate embedding services, vector stores, and LLM orchestration layers, Weaviate aims to handle much of this within the database itself.

references

[1] SeMI Technologies - Crunchbase Company Profile and Funding - Crunchbase
[2] SeMI Technologies becomes Weaviate - PR Newswire, 18 January 2023
[3] The History of Weaviate - Weaviate Blog, January 2021
[4] weaviate/weaviate on GitHub - GitHub
[5] The AI database developers love - Weaviate
[6] The History of the Weaviate Vector Search Engine - Medium / SeMI Technologies
[7] Etienne Dilocker - Co-founder and CTO - Weaviate Careers
[8] Etienne Dilocker on GitHub - GitHub
[9] Release Notes - Weaviate Documentation
[10] Weaviate - Crunchbase Company Profile and Funding - Crunchbase
[11] Weaviate Raises $50 Million Series B Funding - PR Newswire, 21 April 2023
[12] SeMI Technologies $16M Series A Round Highlights a New Wave of AI-first Database Tech - PR Newswire, February 2022
[13] SeMI Technologies' search engine opens up new ways to query your data - TechCrunch, 22 February 2022
[14] Index Ventures Leads $50 Million Investment in AI Startup Weaviate - The Information, April 2023
[15] How Weaviate hit $12.3M revenue with a 104 person team in 2024 - GetLatka
[16] About Us - Weaviate
[17] Replication Architecture - Weaviate Documentation
[18] Weaviate 1.25 Release - Weaviate Blog
[19] Weaviate Vector Index documentation - Weaviate Documentation
[20] Weaviate 1.31 Release - Weaviate Blog
[21] Compression (Vector Quantization) - Weaviate Documentation
[22] Weaviate 1.32 Release - Weaviate Blog
[23] Weaviate 1.33 Release - Weaviate Blog
[24] Unleashing AI Factories: Weaviate and NVIDIA Turbocharge Vector Search - Weaviate Blog
[25] Hybrid Search - Weaviate Documentation
[26] Hybrid Search Explained - Weaviate Blog
[27] Weaviate 1.30 Release: Runtime config, BlockMax WAND BM25 - GitHub Releases
[28] Weaviate GraphQL: Python Hybrid Search Implementations 2026 - Johal.in
[29] Weaviate Modules documentation - Weaviate Documentation
[30] Generative Search documentation - Weaviate Documentation
[31] How to define a schema - Weaviate Academy
[32] Rethinking Vector Search at Scale: Weaviate's Multi-Tenancy - Weaviate Blog
[33] Case Study - Instabase - Weaviate
[34] GraphQL API documentation - Weaviate Documentation
[35] gRPC-Driven Performance Improvements in Weaviate - Weaviate Blog
[36] Setting up RBAC in Weaviate - Weaviate Documentation
[37] Weaviate 1.34 Release - Weaviate Blog
[38] BYOC - Bring Your Own Cloud - Weaviate
[39] Weaviate Cloud documentation - Weaviate Documentation
[40] Weaviate Pricing in 2026 - Particula Tech
[41] Introducing Weaviate Embeddings - Weaviate Blog
[42] Weaviate Launches Flexible Embedding Service for AI Development - GlobeNewswire, 3 December 2024
[43] Welcome to the Next Era of Data and AI: Meet Weaviate Agents - Weaviate Blog
[44] Weaviate Goes Full Stack With Launch of Weaviate Agents for AI Development - GlobeNewswire, 4 March 2025
[45] Accelerating Data Workflows with Query Agent, now GA - Weaviate Blog
[46] Weaviate Launches Agent Skills to Empower AI Coding Agents - GlobeNewswire, 21 February 2026
[47] weaviate/Verba on GitHub - GitHub
[48] Verba: Building an Open Source, Modular RAG Application - Weaviate Blog
[49] Search Mode Benchmarking - Weaviate Blog
[50] Pinecone vs Weaviate 2026: Engineered Decision Guide - RankSquire
[51] Best Vector Databases in 2026: A Complete Comparison Guide - Firecrawl
[52] From prototype to production: Vector databases in generative AI applications - Stack Overflow Blog
[53] Case Studies - Weaviate
[54] Announcing the Regional 2025 AWS Partners of the Year for Europe, Middle East, and Africa - AWS Partner Network Blog
[55] Weaviate Python Client Changelog - Read the Docs
[56] Weaviate Hero - Weaviate Blog
[57] Weaviate community - Weaviate
[58] Bob van Luijt on AI-Native Apps, Open Source, and Innovation - Analytics Vidhya
[59] Weaviate in 2025: Reliable Foundations for Agentic Systems - Weaviate Blog
