Chroma

Chroma is an open-source embedding database designed for artificial intelligence applications. It stores vector embeddings alongside their associated documents and metadata, then allows querying by nearest-neighbor similarity rather than traditional substring matching. Chroma was founded in 2022 by Jeff Huber and Anton Troynikov in San Francisco, and has become one of the most popular vector databases for prototyping retrieval-augmented generation (RAG) applications, largely because of its emphasis on simplicity: a working instance can be set up with pip install chromadb and a few lines of Python [1].

The project is distributed under the Apache 2.0 License and is developed primarily in Rust, Python, TypeScript, and Go. By May 2026 the chroma-core/chroma repository on GitHub had passed 27,900 stars and 2,200 forks, and the package was being downloaded more than 15 million times per month across the Python and JavaScript ecosystems [2][3]. The company behind the project, Chroma Inc., has raised approximately $20 million across a pre-seed and seed round, and operates a managed service called Chroma Cloud which reached general availability in 2025 [4][5].

history and founding

founders

Jeff Huber and Anton Troynikov founded Chroma in 2022 and incorporated the company in San Francisco. The pair had each spent the prior decade applying machine learning to physical-world products and had independently arrived at the same observation: while machine learning demos were easy to build, taking anything with a model in it from prototype to production felt, in Huber's words, "more like alchemy" than engineering [6].

Huber, who serves as chief executive officer, had previously co-founded Standard Cyborg in 2014, a Y Combinator Winter 2015 company that built a 3D scanning and dense-reconstruction pipeline for the depth sensors in Apple's iPhones. Standard Cyborg's software was used in custom prosthetics, footwear sizing, and consumer applications using the TrueDepth camera that powers Face ID. Huber studied at North Carolina State University and was born with fibular hemimelia, an experience that shaped his early interest in prosthetics and 3D scanning [7].

Troynikov, who originally served as chief technology officer, came from a robotics background. Earlier in his career he worked on perception, computer vision, and large-scale 3D mapping at companies including Nuro, the autonomous-delivery startup, and at Meta's Facebook Reality Labs, where he was a research engineer focused on AR and VR systems. He studied at the Technical University of Munich and has written extensively about latent-space representations and retrieval over the course of building Chroma. Troynikov stepped back from a full-time role in 2024 and now serves as an advisor to the company while running American Terawatt, a venture focused on industrial electric grid infrastructure [8][9].

Since Troynikov's transition, Hammad Bashir, a founding engineer who joined Chroma early, has assumed the chief technology officer role. Bashir holds a BS in EECS from the University of California, Berkeley and previously held engineering positions at Uber, Snap, Tome, and Ubiquity6 (acquired by Discord). He leads a small engineering team that includes Ben Eggers, Liquan Pei, and Weili Gu, all working on Chroma's distributed retrieval stack [10].

founding thesis

The founders built Chroma around a specific thesis: that existing vector databases were too complex for the typical AI developer who just needed to store embeddings and retrieve them quickly. While competitors like Pinecone, Weaviate, and Milvus targeted enterprise-scale deployments with distributed architectures, Chroma targeted the developer who was building a prototype on a laptop and wanted something that worked immediately without configuration [11].

The first public release of the open-source library landed on GitHub on October 22, 2022. In December 2022, Huber spent several weeks reaching out to developers who were posting on Twitter about LangChain and embeddings, conducting what he later described as an unintentional product-discovery exercise with a small group of influential early adopters. A more polished launch followed on Valentine's Day in February 2023 (delayed by one day due to last-minute bug fixes) [6][12].

This approach resonated with the AI developer community. By early 2023 the project had been downloaded more than 35,000 times in its first two months, and within a year monthly Python downloads were measured in the millions. By 2026, the project had accumulated more than 27,900 GitHub stars, a Discord community with more than 10,000 members, and combined PyPI and npm downloads in excess of 15 million per month [1][3][13].

funding

Chroma Inc. has raised approximately $20 million across two disclosed rounds, a small total relative to peers in the same category. The company has chosen to operate with a small headcount and a long runway, consistent with its emphasis on engineering quality over rapid commercial expansion [4].

Round	Date	Amount	Lead investors	Notable participants
Pre-seed	May 2022	Approximately $2.3M	AIX Ventures (Anthony Goldbloom, Kaggle), Bloomberg Beta (James Cham), AI Grant (Nat Friedman, Daniel Gross)	Multiple angels
Seed	April 2023	$18M	Quiet Capital (Astasia Myers)	Naval Ravikant; Max and Jack Altman; Jordan Tigani (MotherDuck); Guillermo Rauch (Vercel); Akshay Kothari (Notion); Amjad Masad (Replit); Spencer Kimball (CockroachDB); founders from MongoDB, ScienceIO, Gumroad, Scale, Hugging Face, and Jasper

The seed round was announced publicly on April 6, 2023. Quiet Capital's Astasia Myers led the round, with the long list of operator-angels reflecting the fact that early enthusiasm for Chroma came largely from founders of other developer-tools and infrastructure companies. According to interviews from that period, the round was designed to fund work on a distributed open-source system, on relevance scoring, and eventually on a hosted serverless service [4][14].

Compared with competitors, Chroma's fundraising has been modest. Pinecone raised more than $138 million by mid-2023, including a $100M Series B at a $750M valuation. Weaviate raised a $50M Series B in April 2023, bringing its total funding above $67M. Qdrant closed a $50M Series A in March 2026. As of May 2026, Chroma had not announced a Series A or Series B, although Huber has alluded in podcast appearances to extending the company's runway through commercial revenue rather than venture rounds [15][16].

design philosophy

Chroma's design philosophy centers on minimizing the distance between "I want to try vector search" and "I have vector search working." The project's tagline has evolved from "the AI-native open-source embedding database" to "the open-source data infrastructure for AI," reflecting its positioning as infrastructure purpose-built for AI workflows rather than a general-purpose database with vector capabilities bolted on.

Three design principles guide the project:

Simplicity over configurability. Chroma uses sensible defaults and requires minimal configuration. There are no index types to choose, no cluster topologies to design, and no sharding strategies to decide on for the common case. Huber describes the goal as "zero config, zero knobs to tune" [6].
Local-first development. Chroma runs in-process by default. A developer can import the library, create a collection, add documents, and query them, all without starting a separate server process.
Progressive complexity. As needs grow, developers can move from in-memory to persistent storage, from embedded to client-server mode, and from self-hosted to Chroma Cloud, without rewriting application code.

Chroma calls its commercial approach "buyer-based open source." The principle, stated on the project's about page, is that any feature that an individual developer would find useful will remain open source forever; only features oriented toward operating Chroma at organizational scale (single sign-on, audit logs, fleet management, BYOC deployments) become part of the commercial offering [10].

key features

in-memory and persistent modes

Chroma supports two storage modes out of the box. The default in-memory mode stores all data in RAM and loses it when the process exits, which is useful for experimentation and testing. Persistent mode writes data to disk, using SQLite for metadata and a file-based vector index, allowing data to survive restarts [17].

Switching between modes requires changing a single line of code:

# In-memory (default)
import chromadb
client = chromadb.Client()

# Persistent
client = chromadb.PersistentClient(path="/path/to/data")

client types and deployment modes

Chroma supports four client types, each suited to different deployment scenarios:

Client type	Class	Description	Best for
Ephemeral	`chromadb.Client()`	In-memory, non-persistent	Testing and development
Persistent	`chromadb.PersistentClient()`	SQLite + local filesystem	Single-process applications, prototyping
HTTP	`chromadb.HttpClient()`	Connects to a remote Chroma server	Production multi-client access
Cloud	`chromadb.CloudClient()`	Connects to Chroma Cloud	Managed serverless production deployments

The HTTP client is recommended for self-hosted production deployments because it enables scalability, high availability, and multi-client access. The Cloud client, introduced alongside Chroma Cloud's general availability in August 2025, points to the managed serverless service and uses the same query and ingestion APIs as the local clients [5][18].

language clients

Chroma provides official client libraries for the two languages most commonly used in AI development, with several community-supported clients on top:

Client	Installation	Description
Python	`pip install chromadb`	Full-featured; can run embedded or as client
JavaScript / TypeScript	`npm install chromadb`	Full client for the Chroma server API
Ruby	`gem install chroma-db`	Community-supported client
PHP	Composer package	Community-supported client
Java	Maven package	Community-supported client
Go	`chroma-go` and other packages	Community-supported clients

The Python client is by far the most popular and supports both embedded mode (running the database in the same process as the application) and client mode (connecting to a separate Chroma server over HTTP). The JavaScript client connects to a running Chroma server and is commonly used in Node.js and browser-based applications. The Chroma 1.0 announcement in April 2025 stated that the Rust core would eventually power native bindings for JavaScript, Ruby, and Swift, plus a WebAssembly build for browser-only deployments [17][19].

embedding functions

Chroma can automatically generate embeddings for documents at insert time using a configured embedding function. By default, it uses the all-MiniLM-L6-v2 model from Sentence Transformers, which runs locally on the user's machine. Users can plug in alternative embedding providers without changing application logic [17].

Chroma supports the following embedding function providers:

Provider	Python	TypeScript	Key notes
Sentence Transformers	Yes	Yes	Default provider; runs locally; uses `all-MiniLM-L6-v2`
OpenAI	Yes	Yes	Uses OpenAI's embedding API
Cohere	Yes	Yes	Includes multilingual embedding models
Google Generative AI	Yes	Yes	Uses Google's embedding models
Hugging Face	Yes	No	Uses Hugging Face's hosted inference API
Hugging Face embedding server	Yes	Yes	Connects to a self-hosted Hugging Face embedding server
Jina AI	Yes	Yes	Jina AI's embedding models
Mistral	Yes	Yes	Mistral AI's embedding models
Together AI	Yes	Yes	Uses Together AI's hosted models
Cloudflare Workers AI	Yes	Yes	Runs on Cloudflare's edge infrastructure
Voyage AI	Yes	Yes	Voyage AI retrieval-tuned models
Morph	Yes	Yes	Morph AI's embedding service

Developers can also create custom embedding functions by implementing the EmbeddingFunction interface. For TypeScript, individual npm packages are available per provider (for example @chroma-core/openai), plus an @chroma-core/all package that installs all providers [20].
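
As a rough illustration of the custom-embedding path, the sketch below shows the callable shape such a function takes: a list of strings in, one fixed-length vector per string out. It uses only the standard library. The class name `HashingEmbedding`, the 64-dimension default, and the token-hashing scheme are assumptions made for this example; in real use the class would subclass chromadb's `EmbeddingFunction` and be passed as `embedding_function=` when creating a collection.

```python
import hashlib
import math
from typing import List


class HashingEmbedding:
    """Toy embedding (hypothetical): hashes each token into one of `dim` buckets."""

    def __init__(self, dim: int = 64) -> None:
        self.dim = dim

    def __call__(self, input: List[str]) -> List[List[float]]:
        vectors = []
        for text in input:
            vec = [0.0] * self.dim
            for token in text.lower().split():
                digest = hashlib.sha256(token.encode("utf-8")).digest()
                bucket = int.from_bytes(digest[:4], "big") % self.dim
                vec[bucket] += 1.0
            # Normalize to unit length; fall back to 1.0 so empty text does not divide by zero.
            norm = math.sqrt(sum(x * x for x in vec)) or 1.0
            vectors.append([x / norm for x in vec])
        return vectors


embed = HashingEmbedding(dim=64)
vectors = embed(["sort a list in python", "sort a list in python", "bake bread"])
```

A hashing scheme like this carries no semantic information, so it is only useful for tests that need deterministic vectors without downloading a model.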

collection management

Chroma organizes data into collections, which are analogous to tables in a relational database. Each collection has its own embedding function and distance metric. Collections can be created, listed, deleted, and forked through the API. This simple organizational model avoids the complexity of indexes, namespaces, and shards found in more enterprise-oriented vector databases.

Collection operations include:

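import os
import chromadb
from chromadb.utils import embedding_functions

client = chromadb.Client()

# Embedding function for the collection (API key read from the environment)
openai_ef = embedding_functions.OpenAIEmbeddingFunction(
    api_key=os.environ["OPENAI_API_KEY"],
    model_name="text-embedding-3-small",
)

# Create or get a collection
collection = client.get_or_create_collection(
    name="my_docs",
    embedding_function=openai_ef,
    metadata={"hnsw:space": "cosine"}
)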

# Add documents
collection.add(
    documents=["doc text 1", "doc text 2"],
    metadatas=[{"source": "wiki"}, {"source": "blog"}],
    ids=["id1", "id2"]
)

# Query
results = collection.query(
    query_texts=["search query"],
    n_results=5
)

# Update documents
collection.update(
    ids=["id1"],
    documents=["updated doc text"]
)

# Delete by ID (a metadata filter can be passed with where= instead)
collection.delete(ids=["id2"])

# Count documents
count = collection.count()

# List all collections
collections = client.list_collections()

Collections do not require a predefined schema, allowing developers to start storing data immediately. Metadata keys and types can vary across documents within the same collection.

metadata filtering

Every document stored in Chroma can carry arbitrary key-value metadata. At query time, users can apply filters on this metadata alongside the similarity search. For example, a query might search for the most semantically similar code snippets but restrict results to a particular programming language or repository [17].

results = collection.query(
    query_texts=["How do I sort a list?"],
    n_results=5,
    where={"language": "python"}
)

Chroma supports $eq, $ne, $gt, $gte, $lt, $lte, $in, and $nin operators for metadata filtering, along with $and and $or for combining conditions.
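
To make the operator semantics concrete, here is a small pure-Python evaluator for filters of this shape. It is an illustrative sketch, not Chroma's implementation: the function name `matches` is hypothetical, and treating a record that lacks the filtered key as a non-match is an assumption of this example.

```python
from typing import Any, Dict

# Comparison operators named in the text, mapped to plain Python checks.
_OPS = {
    "$eq": lambda value, arg: value == arg,
    "$ne": lambda value, arg: value != arg,
    "$gt": lambda value, arg: value > arg,
    "$gte": lambda value, arg: value >= arg,
    "$lt": lambda value, arg: value < arg,
    "$lte": lambda value, arg: value <= arg,
    "$in": lambda value, arg: value in arg,
    "$nin": lambda value, arg: value not in arg,
}


def matches(metadata: Dict[str, Any], where: Dict[str, Any]) -> bool:
    """Return True if `metadata` satisfies the `where` filter (illustrative only)."""
    for key, condition in where.items():
        if key == "$and":
            if not all(matches(metadata, sub) for sub in condition):
                return False
        elif key == "$or":
            if not any(matches(metadata, sub) for sub in condition):
                return False
        else:
            if key not in metadata:
                return False  # assumption: a missing key never matches
            # A bare value such as {"language": "python"} is shorthand for $eq.
            if not isinstance(condition, dict):
                condition = {"$eq": condition}
            value = metadata[key]
            if not all(_OPS[op](value, arg) for op, arg in condition.items()):
                return False
    return True


doc = {"language": "python", "stars": 120}
```

For instance, `matches(doc, {"$and": [{"language": {"$in": ["python", "go"]}}, {"stars": {"$gte": 100}}]})` combines a set-membership test with a numeric comparison.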

Additionally, Chroma supports document-level filtering with the where_document parameter, which allows filtering by the content of the document itself using $contains and $not_contains operators:

results = collection.query(
    query_texts=["sorting algorithms"],
    n_results=5,
    where={"language": "python"},
    where_document={"$contains": "quicksort"}
)

sparse vectors and full-text search

In October 2025 Chroma added first-class support for sparse vector search, including BM25 and SPLADE representations. Combined with the existing dense vector index, this gives Chroma collections hybrid search capabilities comparable to those long offered by Weaviate and Pinecone [21].
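
For readers unfamiliar with sparse representations, the following sketch shows one common way to express BM25 as sparse vectors: each document becomes a term-to-weight dictionary, and a query is scored by summing the weights of the terms it shares with the document. This is a textbook formulation written for illustration, not Chroma's implementation; the function names and the parameter values `k1=1.2` and `b=0.75` are assumptions.

```python
import math
from collections import Counter
from typing import Dict, List


def bm25_sparse_vectors(docs: List[List[str]], k1: float = 1.2, b: float = 0.75) -> List[Dict[str, float]]:
    """Turn tokenized documents into sparse {term: BM25 weight} vectors."""
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n
    doc_freq = Counter(term for d in docs for term in set(d))
    vectors = []
    for d in docs:
        vec = {}
        for term, freq in Counter(d).items():
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Term frequency saturates and is normalized by document length.
            denom = freq + k1 * (1 - b + b * len(d) / avg_len)
            vec[term] = idf * freq * (k1 + 1) / denom
        vectors.append(vec)
    return vectors


def sparse_dot(query_terms: List[str], doc_vec: Dict[str, float]) -> float:
    """Score a query against one sparse document vector."""
    return sum(doc_vec.get(term, 0.0) for term in query_terms)


docs = [
    ["vector", "database", "for", "embeddings"],
    ["relational", "database", "with", "tables"],
    ["sourdough", "bread", "recipe"],
]
vectors = bm25_sparse_vectors(docs)
scores = [sparse_dot(["vector", "database"], v) for v in vectors]
ranking = sorted(range(len(docs)), key=lambda i: -scores[i])
```

In a hybrid setup, a ranking like this would be combined with the ranking from the dense index; the fusion step is not shown here.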

Chroma Cloud also exposes trigram-based full-text search and regular-expression search, which the company has highlighted as features oriented toward retrieval for code-search agents and other tasks where developers want substring or pattern matching alongside semantic similarity [22].
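
The idea behind trigram-based substring search can be sketched in a few lines: index every three-character window of each document, intersect the posting lists for the trigrams of the search string, then confirm each candidate with a real substring check. The class below is a minimal in-memory illustration under those assumptions; `TrigramIndex` and its methods are hypothetical names, not part of Chroma's API.

```python
from collections import defaultdict
from typing import Dict, List, Set


def trigrams(text: str) -> Set[str]:
    """All three-character windows of the text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


class TrigramIndex:
    def __init__(self) -> None:
        self.docs: Dict[str, str] = {}
        self.postings: Dict[str, Set[str]] = defaultdict(set)

    def add(self, doc_id: str, text: str) -> None:
        self.docs[doc_id] = text
        for gram in trigrams(text):
            self.postings[gram].add(doc_id)

    def search(self, needle: str) -> List[str]:
        grams = trigrams(needle)
        if not grams:
            # Needles shorter than three characters fall back to a full scan.
            candidates = set(self.docs)
        else:
            candidates = set.intersection(*(self.postings.get(g, set()) for g in grams))
        # Sharing every trigram does not guarantee a contiguous match, so verify.
        return sorted(d for d in candidates if needle in self.docs[d])


index = TrigramIndex()
index.add("a", "def quicksort(items):")
index.add("b", "def merge_sort(items):")
index.add("c", "quick brown fox")
```

The verification step matters because two documents can contain the same trigrams in a different order.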

collection forking

In 2025, Chroma Cloud introduced collection forking, a copy-on-write feature that allows a collection of any size to be branched into a new collection almost instantly. New writes to either branch allocate new storage blocks while unchanged data remains shared, so users only pay for the incremental data they actually change. The primary use cases are dataset versioning, A/B experimentation on retrieval quality, and incremental re-indexing during a data refresh [23].
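
The copy-on-write behavior described above can be pictured with a toy model in which a collection is just a list of references to immutable blocks in shared storage. Forking copies the reference list, and later writes on either side append new blocks without touching the shared ones. The classes `BlockStore` and `ForkableCollection` are hypothetical and model only appends, not updates or deletes.

```python
from typing import List, Optional


class BlockStore:
    """Shared storage; a block is never modified after it is written."""

    def __init__(self) -> None:
        self.blocks: List[List[str]] = []

    def write(self, records: List[str]) -> int:
        self.blocks.append(list(records))
        return len(self.blocks) - 1  # id of the new block


class ForkableCollection:
    def __init__(self, store: BlockStore, block_ids: Optional[List[int]] = None) -> None:
        self.store = store
        self.block_ids = list(block_ids or [])

    def add(self, records: List[str]) -> None:
        # New data always goes into a fresh block owned by this branch.
        self.block_ids.append(self.store.write(records))

    def fork(self) -> "ForkableCollection":
        # Copy only the block references, not the block contents.
        return ForkableCollection(self.store, self.block_ids)

    def records(self) -> List[str]:
        return [r for b in self.block_ids for r in self.store.blocks[b]]


store = BlockStore()
main = ForkableCollection(store)
main.add(["a", "b"])
main.add(["c"])
branch = main.fork()
branch.add(["d"])  # visible only in the fork
main.add(["e"])    # visible only in the original
```

In this model the cost of a fork is proportional to the number of block references rather than the amount of data, which is the property that makes branching large collections cheap.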

multi-modal embeddings

Chroma supports multi-modal embeddings through integration with models like OpenCLIP, which embed both text and images into a shared semantic space. This enables applications to perform unified searches across different data types within the same collection.

For example, a collection using an OpenCLIP embedding function can accept both text documents and image URIs. A text query would return relevant results regardless of whether they are text documents or images, because both are represented in the same vector space. This capability is useful for applications like visual search, content moderation, and multi-modal RAG systems where context may include both textual and visual information [20].

persistence and storage architecture

Chroma's local storage architecture consists of two main components:

Metadata store (SQLite). All document metadata, collection configurations, and document IDs are stored in a SQLite database. For persistent mode, this is written to a chroma.sqlite3 file in the configured data directory.
Vector index (HNSW). Vector embeddings are stored in a custom Hierarchical Navigable Small World (HNSW) implementation. The HNSW index provides approximate nearest neighbor search with configurable trade-offs between speed and recall.

The directory structure for persistent storage is organized as:

/path/to/data/
  chroma.sqlite3          # Metadata store
  <collection-uuid>/      # One directory per collection
    data_level0.bin       # HNSW graph data
    header.bin            # Index header
    length.bin            # Vector lengths

For Chroma Cloud, data is stored in object storage with automatic tiering and caching. The cloud storage system implements a multi-tier caching architecture with admission control to minimize object storage API costs and query latency [24].

HNSW configuration

Chroma exposes several HNSW parameters through collection metadata:

Parameter	Default	Description
`hnsw:space`	`l2`	Distance function (l2, cosine, ip)
`hnsw:construction_ef`	100	Size of candidate list during index construction
`hnsw:search_ef`	10	Size of candidate list during search
`hnsw:M`	16	Maximum number of connections per node
`hnsw:num_threads`	Number of CPU cores	Number of threads for index operations
`hnsw:resize_factor`	1.2	Factor by which the index grows when capacity is exceeded
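
To give a feel for what the `M` and `search_ef` style parameters control, the sketch below implements a heavily simplified, single-layer navigable-small-world search in pure Python. It is not Chroma's index: there is no layer hierarchy and no neighbor pruning, and the graph is built by brute force, so nothing here corresponds to `hnsw:construction_ef`. The function names and the test data are assumptions of the example.

```python
import heapq
import random
from typing import Dict, List, Tuple

Vector = Tuple[float, ...]


def sq_l2(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_graph(points: List[Vector], m: int) -> Dict[int, List[int]]:
    """Insert points one by one, linking each to its m nearest earlier points."""
    graph: Dict[int, List[int]] = {0: []}
    for i in range(1, len(points)):
        nearest = sorted(range(i), key=lambda j: sq_l2(points[i], points[j]))[:m]
        graph[i] = list(nearest)
        for j in nearest:
            graph[j].append(i)  # links are bidirectional
    return graph


def search(points: List[Vector], graph: Dict[int, List[int]], query: Vector,
           k: int, ef: int, entry: int = 0) -> List[int]:
    """Best-first search keeping at most ef results, then returning the k closest."""
    visited = {entry}
    d0 = sq_l2(query, points[entry])
    candidates = [(d0, entry)]  # min-heap: closest unexplored node first
    best = [(-d0, entry)]       # max-heap via negation: the ef closest nodes seen
    while candidates:
        dist, node = heapq.heappop(candidates)
        if len(best) >= ef and dist > -best[0][0]:
            break  # no remaining candidate can improve the result set
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            d = sq_l2(query, points[nb])
            if len(best) < ef or d < -best[0][0]:
                heapq.heappush(candidates, (d, nb))
                heapq.heappush(best, (-d, nb))
                if len(best) > ef:
                    heapq.heappop(best)
    ordered = sorted((-neg_d, node) for neg_d, node in best)
    return [node for _, node in ordered[:k]]


random.seed(0)
points = [(random.random(), random.random()) for _ in range(200)]
graph = build_graph(points, m=8)
query = (0.5, 0.5)
exact = min(range(len(points)), key=lambda i: sq_l2(query, points[i]))
found = search(points, graph, query, k=1, ef=len(points))
```

With `ef` equal to the number of points the search visits the whole connected graph and is exact; smaller values of `ef` visit fewer nodes and may miss the true nearest neighbor, which is the speed-versus-recall trade-off these parameters expose.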

distance functions

Chroma supports three distance metrics for similarity search, configured per collection at creation time:

Metric	Configuration value	Formula	Range	Use case
Squared L2 (Euclidean)	`l2`	Sum of squared differences	[0, infinity)	Default; general-purpose distance
Cosine distance	`cosine`	1 - cosine similarity	[0, 2]	Normalized embeddings; most text embedding models
Inner product	`ip`	1 - dot product	(-infinity, infinity)	When vector magnitude carries meaning

The distance metric is set via collection metadata at creation time:

collection = client.create_collection(
    name="my_collection",
    metadata={"hnsw:space": "cosine"}  # or "l2" or "ip"
)

For most text embedding models (OpenAI, Cohere, Sentence Transformers), cosine distance is the recommended choice because these models produce normalized embeddings where the angle between vectors is more meaningful than the raw distance [17].
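
The three metrics are easy to compare by hand. The sketch below computes each one in plain Python; the function names are illustrative, and inner-product distance is taken here as 1 minus the dot product, following the convention used by hnswlib, which is an assumption of this example.

```python
import math
from typing import Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def squared_l2(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of squared differences (the "l2" setting)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 minus cosine similarity (the "cosine" setting)."""
    return 1.0 - dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def ip_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 minus the dot product (assumed convention for the "ip" setting)."""
    return 1.0 - dot(a, b)


a = (1.0, 0.0)
c = (0.6, 0.8)  # both vectors have unit length
l2_value = squared_l2(a, c)
cos_value = cosine_distance(a, c)
ip_value = ip_distance(a, c)
```

For unit-length vectors the metrics agree on ordering: squared L2 equals twice the cosine distance, and the inner-product distance equals the cosine distance. The choice only changes rankings when embeddings are not normalized.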

distributed architecture and wal3

The distributed deployment of Chroma, which underpins Chroma Cloud, separates storage and compute through object storage as a shared layer. The architecture has three main service tiers [24][25]:

Gateways handle cluster routing and query orchestration.
Query nodes serve indices read from object storage, with SSD and in-memory caching.
Compactor nodes asynchronously build and persist indices to object storage.

A distributed write-ahead log called wal3 provides durability for incoming writes before compaction. wal3 was introduced in 2025 and uses Amazon S3's conditional write feature, which became generally available in November 2024, to implement a lock-free append protocol. Each collection has an independent log that is implemented as a linked list of file fragments backed by S3 manifest files; a content checksum called a setsum allows the implementation to verify log integrity in constant time during append and trim operations. The system was written in Rust and is open source under the same Apache 2.0 license as the rest of the project [25].
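
The two ideas in this paragraph, appending through conditional writes and checking integrity with an order-independent checksum, can be sketched against an in-memory stand-in for an object store. This is a simplified model and not wal3 itself: fragments are addressed by sequence number rather than through manifests, the checksum is a plain sum of hashes rather than the real setsum layout, and every class and function name is hypothetical.

```python
import hashlib
from typing import Dict, Iterable

MODULUS = 2 ** 64


class ObjectStore:
    """In-memory stand-in for an object store with put-if-absent semantics."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}

    def put_if_absent(self, key: str, body: bytes) -> bool:
        if key in self.objects:
            return False  # plays the role of a failed conditional write
        self.objects[key] = body
        return True


class Writer:
    """Appends fragments without locks: on a conflict, try the next slot."""

    def __init__(self, store: ObjectStore, log_name: str) -> None:
        self.store = store
        self.log_name = log_name
        self.next_seq = 0

    def append(self, body: bytes) -> int:
        while True:
            seq = self.next_seq
            self.next_seq += 1
            if self.store.put_if_absent(f"{self.log_name}/{seq:08d}", body):
                return seq


def item_hash(body: bytes) -> int:
    return int.from_bytes(hashlib.sha256(body).digest()[:8], "big")


def checksum(items: Iterable[bytes]) -> int:
    """Order-independent checksum: the sum of item hashes modulo 2**64."""
    total = 0
    for body in items:
        total = (total + item_hash(body)) % MODULUS
    return total


def checksum_remove(total: int, body: bytes) -> int:
    """Constant-time update when one item is trimmed from the log."""
    return (total - item_hash(body)) % MODULUS


store = ObjectStore()
w1, w2 = Writer(store, "log"), Writer(store, "log")
slots = [w1.append(b"a"), w2.append(b"b"), w1.append(b"c")]
```

Because the checksum is additive, appending or trimming a fragment adjusts it with one addition or subtraction, and the result can be compared against a checksum recomputed from the surviving fragments.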

Chroma's engineering team has argued that object storage is roughly an order of magnitude cheaper than a comparable replicated SSD configuration while still offering more than 1 GB/s of parallel throughput. The trade-off is higher per-request latency, in the range of 35 to 100 milliseconds, which they argue is tolerable for retrieval workloads when combined with aggressive caching [24].

popular for prototyping RAG applications

Chroma has become the default vector database for developers building their first retrieval-augmented generation applications. Several factors contribute to this:

First, the setup friction is minimal. A developer can go from zero to a working RAG prototype in under 20 lines of Python. There is no server to start, no API key to obtain, and no cloud account to create.

Second, Chroma integrates with most widely used AI frameworks:

Framework	Integration type
LangChain	Native vector store (`Chroma` class)
LlamaIndex	Vector store connector
Semantic Kernel	Memory store provider
Haystack	Document store
CrewAI	Knowledge base
DSPy	Retrieval module
AutoGen	Memory provider
LangGraph	Persistent memory

Third, the in-memory mode means there is no operational overhead during development. Developers can iterate on their embedding strategy, chunking approach, and retrieval parameters without waiting for data to be uploaded to a remote service.
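
The "under 20 lines" claim can be made concrete with a minimal retrieval step for a RAG prototype. The sketch assumes the `chromadb` package is installed and that the default embedding model can be downloaded on first use; the helper names `build_prompt` and `retrieve`, the collection name `rag_demo`, and the prompt wording are all choices made for this example. The final call to a language model is left out.

```python
from typing import List


def build_prompt(question: str, passages: List[str]) -> str:
    """Combine retrieved passages and the question into one prompt string."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


def retrieve(question: str, docs: List[str], n_results: int = 2) -> List[str]:
    """Index docs in an in-memory Chroma collection and return the closest ones."""
    import chromadb  # imported lazily so build_prompt works without chromadb installed

    client = chromadb.Client()  # in-memory instance, nothing written to disk
    collection = client.get_or_create_collection(name="rag_demo")
    collection.add(documents=docs, ids=[f"doc-{i}" for i in range(len(docs))])
    results = collection.query(query_texts=[question], n_results=n_results)
    return results["documents"][0]  # passages for the first (only) query
```

A typical call would be `build_prompt(question, retrieve(question, docs))`, with the resulting string passed to whichever model client the application uses.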

The trade-off is that Chroma's simple architecture can become a limitation at production scale. For applications handling hundreds of millions of vectors, distributed databases like Milvus, Weaviate, or Pinecone are better suited. Many teams follow a path of prototyping with Chroma, then migrating to a more scalable solution for production, or now to Chroma Cloud once they outgrow the embedded process model [11].

deployment patterns

Chroma supports three primary deployment patterns, each suited to different scale and operational requirements:

Pattern	Description	Scale	Recommended for
Local / embedded	Library runs in the application process	Small; under 1M records	Prototyping, testing, single-user applications
Single-node server	Standalone Chroma server accessed via HTTP	Medium; up to ~10M records	Small production workloads, multi-client access
Distributed	Multi-service deployment on Kubernetes or via Chroma Cloud	Large; millions of collections	Large-scale production with high availability

All three patterns use the same core codebase and expose the same API, so code written for the embedded pattern works against a remote server, a Kubernetes deployment, or Chroma Cloud once the line that constructs the client is changed. This is a direct result of Chroma's progressive complexity philosophy [24].
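
One way to take advantage of the shared API is to isolate client construction in a single function and select the deployment pattern from configuration. The sketch below assumes the `chromadb` package is installed; the environment variable names `CHROMA_HOST`, `CHROMA_PORT`, and `CHROMA_PATH` and both helper names are hypothetical choices for this example, not settings that Chroma itself reads.

```python
import os
from typing import Mapping


def choose_client_kind(env: Mapping[str, str]) -> str:
    """Pick a deployment pattern from configuration (hypothetical variable names)."""
    if env.get("CHROMA_HOST"):
        return "http"
    if env.get("CHROMA_PATH"):
        return "persistent"
    return "ephemeral"


def make_client(env: Mapping[str, str] = os.environ):
    """Build the matching client; the rest of the application uses it unchanged."""
    import chromadb  # imported lazily so choose_client_kind has no dependencies

    kind = choose_client_kind(env)
    if kind == "http":
        port = int(env.get("CHROMA_PORT", "8000"))
        return chromadb.HttpClient(host=env["CHROMA_HOST"], port=port)
    if kind == "persistent":
        return chromadb.PersistentClient(path=env["CHROMA_PATH"])
    return chromadb.Client()
```

Code that calls `make_client()` and then works with collections does not need to know which pattern is in use.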

For Docker deployments, Chroma provides an official Docker image:

docker run -p 8000:8000 chromadb/chroma

For Kubernetes deployments, the community maintains Helm charts, and Chroma's distributed mode splits the system into multiple services that can be scaled independently.

Chroma Cloud

Chroma Cloud is the company's hosted service. Following an extended private alpha during 2024, the service entered public beta in early 2025 and reached general availability in August 2025 alongside the v1.4.x line of releases. It provides serverless vector, hybrid, and full-text search as a managed service [5][26].

Feature	Description
Setup time	Under 30 seconds to create a database
Free credits	$5 of free credits for new users; an additional $100 in credits has been offered to new accounts during launch promotions
Storage	Object storage with automatic data tiering and caching
Replication	Multi-region replication
Management	Zero-ops; no infrastructure to manage
Security	SOC 2 Type II certification; customer-managed encryption keys (January 2026)
Recall in production	Greater than 90 percent recall reported in published benchmarks
Search APIs	Vector, sparse vector, full-text (trigram), regex, and metadata filtering
Forking	Copy-on-write collection forking for dataset versioning

Chroma Cloud runs in two regions as of May 2026: AWS US East (N. Virginia, aws-us-east-1) and Google Cloud Europe West 1 (Belgium, gcp-europe-west1). Each database stays entirely within the region the customer chooses. Some surface-level features such as Chroma Sync, the CLI tools, and the Search Agent are available only in the US region as of mid-2026 [18].

In February 2026, Chroma added a Bring Your Own Cloud (BYOC) option that allows enterprise customers to run Chroma's distributed deployment inside their own AWS or GCP account while still being managed by Chroma's control plane. The pricing model for the multi-tenant service is consumption-based, with storage charged at roughly $0.02 per gigabyte per month on object storage and writes and queries metered per operation [27].

Chroma Sync and Web Sync

Chroma also offers two managed ingestion products. Chroma Sync, launched in October 2025, automatically chunks, embeds, and indexes the contents of GitHub repositories, with re-syncs accelerated by collection forking so that only changed files are re-processed. Chroma Web Sync, launched in December 2025, performs a similar pipeline for arbitrary web URLs, including crawling, scraping, chunking, and embedding [21].

performance characteristics

Chroma's performance profile reflects its design priorities. For small to medium datasets (under 10 million vectors), Chroma delivers competitive query latency, especially when running in embedded mode where there is no network overhead.

The Chroma 1.0 release on April 3, 2025, was a significant performance milestone. By reimplementing the core data path in Rust and exposing it to Python through PyO3 bindings, the project eliminated Python's Global Interpreter Lock as a bottleneck for write and query throughput. Chroma's own benchmarks reported three to five times faster writes and three to five times faster queries compared with the previous Python core, leading the company to advertise the release as "4x faster" overall [19].

In August 2025, an optimization that switched on-the-wire vector representations to base64 encoding achieved a 70 percent increase in data throughput, further improving performance for bulk data operations [3].
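
The size difference behind an optimization of this kind can be seen with the standard library alone. The sketch below compares a JSON list of floats with a base64 string of packed 32-bit floats. The little-endian float32 layout and the 384-dimension test vector are assumptions for illustration; the source does not describe Chroma's exact wire format, and the sketch makes no claim about the reported throughput figure.

```python
import base64
import json
import random
import struct
from typing import List


def encode_json(vec: List[float]) -> str:
    """Text encoding: every float is written out as decimal digits."""
    return json.dumps(vec)


def encode_b64(vec: List[float]) -> str:
    """Pack as little-endian float32 values, then base64-encode the bytes."""
    packed = struct.pack(f"<{len(vec)}f", *vec)
    return base64.b64encode(packed).decode("ascii")


def decode_b64(text: str) -> List[float]:
    raw = base64.b64decode(text)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))


random.seed(0)
vec = [random.uniform(-1.0, 1.0) for _ in range(384)]
as_json = encode_json(vec)
as_b64 = encode_b64(vec)
restored = decode_b64(as_b64)
```

Each float32 costs four bytes before base64 expansion, so the encoded string has a fixed length per dimension, whereas the JSON form spends a variable number of characters on each value. The round trip loses only the precision dropped when narrowing to 32 bits.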

For latency-sensitive applications, the embedded persistent client offers the lowest latency since it avoids network round trips entirely. The HTTP client adds network latency but enables multi-client access and horizontal scaling. Independent benchmarking through 2025 and 2026 has consistently shown that Chroma performs well at moderate scale (around one to ten million vectors) but loses ground to Pinecone and Weaviate beyond that range; Chroma Cloud has narrowed the gap, with the company publishing 20 ms p50 latency numbers for smaller workloads [28].

version history

Version	Date	Notes
Initial public release	October 22, 2022	Open-source Python library; in-memory only
0.3.x	February 2023	Polished launch; persistent SQLite metadata store
0.4.x	July 19, 2023	Improved API; HTTP client mode
0.5.x	2024	Distributed Chroma preview, multi-modal support, server stabilization
1.0.0 (pre-release)	April 3, 2025	Rust core; 3-5x writes and queries; full API compatibility
1.0.x	mid-2025	Stabilization of Rust core; PyO3 Python bindings
1.4.x	August 2025	Chroma Cloud general availability; hybrid search
1.5.x	March-May 2026	Latest stable line; latest tag 1.5.9 (May 5, 2026)

Releases follow a weekly cadence on Mondays, with the latest stable tag at the time of writing being 1.5.9 from May 2026 [3].

context engineering and Chroma research

In 2025 Jeff Huber became increasingly vocal about a concept he calls "context engineering" as distinct from prompt engineering. In his framing, the quality of large language model outputs depends less on how prompts are worded and more on what context (retrieved documents, user history, structured data) is provided to the model. Chroma's product roadmap reflects this perspective: features like Sync, Web Sync, and forking are designed to make it easier to build and maintain the context that gets fed to language models, rather than just storing and retrieving raw vectors [29].

The company has also published a series of research notes under the Chroma Research banner. Reports to date include:

Report	Date	Topic
Embedding adapters	May 2024	Lightweight task-specific adaptations of off-the-shelf embedding models
Evaluating chunking strategies for retrieval	July 2024	Compared chunking heuristics on retrieval quality
Generative benchmarking	April 2025	Synthesizing evaluation pairs for retrieval with LLMs
Context Rot	July 2025	Showed that 18 frontier models all lose accuracy as input length grows
Context-1 technical report	March 2026	Follow-up framework for context engineering

The Context Rot report, authored by Kelly Hong, Anton Troynikov, and Jeff Huber, drew significant attention in summer 2025. It tested 18 frontier models, including GPT-4.1, Claude Opus 4, and Gemini 2.5, and reported that every one of them showed some performance degradation as input context length increased, even on tasks that were trivial at small lengths. The report's authors used the result to argue against "just stuff everything in the context window" approaches and in favor of investing in retrieval and ranking quality [30].

This framing positions Chroma not just as a vector database but as context infrastructure for AI applications: a layer that sits between raw data sources and language models, responsible for ensuring the model has access to the right information at inference time.

comparison with alternatives

Chroma occupies a distinct position in the vector database market, optimized for developer experience and rapid prototyping rather than enterprise scale, although Chroma Cloud and the distributed deployment options have begun to bridge that gap.

Feature	Chroma	Pinecone	Weaviate	Qdrant	Milvus	FAISS
License	Apache 2.0	Proprietary	BSD-3-Clause	Apache 2.0	Apache 2.0	MIT
Primary language	Rust / Python	Closed-source service	Go	Rust	Go / C++	C++
Setup complexity	Minimal (`pip install`)	Low (cloud API)	Moderate (Docker / K8s)	Moderate (Docker / K8s)	Moderate to high	Library only
In-memory mode	Yes	No	No	Yes (local mode of the Python client)	No	Yes
Hybrid search	Yes (since October 2025)	Yes (sparse-dense)	Yes (BM25 + vector)	Yes	Yes	No
Managed cloud	Chroma Cloud	Pinecone (only option)	Weaviate Cloud	Qdrant Cloud	Zilliz Cloud	None
Best for	Prototyping; small to medium production	Enterprise managed service	Open source at scale	High-performance open source	Very large scale	Library-level use
GitHub stars	27,900+	n/a (closed source)	14,000+	22,000+	32,000+	33,000+
Multi-modal support	Yes (OpenCLIP)	No native	Yes (CLIP, ImageBind modules)	Yes (multi-vector)	Yes	No

The main advantage Chroma holds over its competitors is the speed of getting started. No other vector database matches the simplicity of pip install chromadb followed by a few lines of Python to have a working vector store. The main historical limitation has been scalability: Chroma's single-process architecture meant it could not distribute data across multiple nodes the way Milvus, Weaviate, or Qdrant can, although the distributed deployment pattern and Chroma Cloud have begun to address this gap [11][28].

For developers and teams choosing between these options, the decision often comes down to where they are in the development lifecycle. Chroma is the most common choice for exploration, prototyping, and applications with modest data volumes. As the dataset grows or production requirements become more demanding, teams typically evaluate Pinecone (for managed simplicity), Weaviate (for open-source flexibility with hybrid search), Qdrant (for raw performance), or Milvus (for very large-scale deployments). FAISS, Meta's open-source library, is more often used as a low-level component than as a standalone database; Chroma itself shipped early versions that wrapped FAISS for the index before moving to a custom HNSW implementation in Rust.

ecosystem and notable use cases

Chroma's GitHub statistics offer one view of its reach. As of May 2026 the project had more than 27,900 stars, more than 2,200 forks, and was a dependency in more than 90,000 open-source codebases. Combined PyPI and npm downloads exceeded 15 million per month, and Chroma had served more than 60 million cumulative downloads since its initial release [3][6].

Chroma is the default or primary vector store recommended in many introductory RAG tutorials, including those published by LangChain, LlamaIndex, and DataCamp, as well as DeepLearning.AI's short courses and the official documentation of several LLM providers. The langchain-chroma package, the LangChain community's most popular vector store integration, sees more than two million monthly downloads in its own right.

The Chroma team has begun publishing customer case studies highlighting commercial usage. Examples include Mintlify, which uses Chroma to power retrieval over its customers' developer documentation, and Propel AI, which uses it as the retrieval layer for code-review agents. Other commercial users referenced on the company's website include Capital One and UnitedHealthcare, although the depth of integration in those organizations has not been publicly disclosed [1].

The company also maintains the Chroma Model Context Protocol (MCP) server, an implementation of the MCP standard that exposes Chroma collections as a tool that an LLM agent can read from and write to directly. This positions Chroma as a memory and retrieval backend for agents built on Anthropic's Claude tools, OpenAI's Responses API, and other agent frameworks that consume MCP servers.

Package Search MCP

In 2026 Chroma introduced Package Search MCP, a hosted MCP server backed by Chroma Cloud that indexes the source code of popular open-source packages from package registries such as PyPI and npm. Coding agents can query the server to retrieve the actual source of a function or symbol they are using, rather than relying on training-time memorization. The product targets a specific failure mode in code-generation agents, where models confidently invent APIs that do not exist or have changed between versions; by giving the agent a retrieval call to the real source, the agent can ground its output in current code. This use case is also a strong fit for the company's broader "context engineering" thesis [21].

typical RAG pipeline with Chroma

A simple RAG pipeline using Chroma and an LLM provider typically looks like the following in Python:

import chromadb
from chromadb.utils.embedding_functions import OpenAIEmbeddingFunction
from openai import OpenAI

client = chromadb.PersistentClient(path="./chroma_db")
openai_ef = OpenAIEmbeddingFunction(
    api_key="sk-...",
    model_name="text-embedding-3-small",
)

collection = client.get_or_create_collection(
    name="docs",
    embedding_function=openai_ef,
    metadata={"hnsw:space": "cosine"},
)

# 1. Ingest
collection.add(
    ids=["d1", "d2", "d3"],
    documents=[
        "Chroma is an open source embedding database.",
        "Pinecone is a managed vector database service.",
        "Weaviate is an open source vector database.",
    ],
    metadatas=[
        {"source": "wiki", "vendor": "chroma"},
        {"source": "wiki", "vendor": "pinecone"},
        {"source": "wiki", "vendor": "weaviate"},
    ],
)

# 2. Retrieve
results = collection.query(
    query_texts=["What is Chroma?"],
    n_results=2,
    where={"source": "wiki"},
)

context = "\n".join(results["documents"][0])

# 3. Generate
llm = OpenAI(api_key="sk-...")
response = llm.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Answer using the provided context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: What is Chroma?"},
    ],
)

print(response.choices[0].message.content)

This pattern shows up in some form in nearly every Chroma tutorial: a PersistentClient for storage, a configured embedding function, a single add call for ingestion, and a query call that combines vector search with metadata filtering before the retrieved documents are passed to a model. Switching this same code from local prototyping to Chroma Cloud requires changing only the client constructor.
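To make the retrieve step concrete, the following is a minimal pure-Python sketch of what a filtered nearest-neighbor query computes: keep only the records whose metadata matches the where condition, then rank them by cosine distance to the query embedding. It is an illustration under stated assumptions, not Chroma's implementation. Chroma uses an approximate HNSW index rather than a brute-force scan, and the record layout, the filtered_query helper, and the two-dimensional embeddings are all hypothetical.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; smaller values mean more similar vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def filtered_query(records, query_embedding, n_results, where):
    """Brute-force stand-in for a vector query with an exact-match metadata filter.

    `records` is a list of dicts with "id", "embedding", and "metadata" keys
    (a hypothetical layout chosen for this sketch).
    """
    # Keep only the records whose metadata satisfies every condition in `where`.
    candidates = [
        r for r in records
        if all(r["metadata"].get(key) == value for key, value in where.items())
    ]
    # Rank the surviving records from nearest to farthest.
    ranked = sorted(
        candidates,
        key=lambda r: cosine_distance(r["embedding"], query_embedding),
    )
    return [r["id"] for r in ranked[:n_results]]


# Toy 2-dimensional embeddings, invented for illustration only.
records = [
    {"id": "d1", "embedding": [1.0, 0.0], "metadata": {"source": "wiki"}},
    {"id": "d2", "embedding": [0.0, 1.0], "metadata": {"source": "wiki"}},
    {"id": "d3", "embedding": [0.9, 0.1], "metadata": {"source": "blog"}},
]

top_ids = filtered_query(records, [1.0, 0.1], n_results=2, where={"source": "wiki"})
print(top_ids)
```

In this toy data d3 is close to the query but is excluded by the filter, so only d1 and d2 are ranked. In the pipeline above the same idea is applied once per query text, which is why the retrieved documents are read from results["documents"][0], the list belonging to the first (and only) query.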

organization and culture

The company behind Chroma, Chroma Inc., is headquartered in San Francisco's Mission District. Public estimates put headcount in the range of 20 to 30 full-time employees as of mid-2026, considerably smaller than competitors such as Pinecone and Weaviate, both of which have raised significantly more capital. Job listings on the company's site advertise roles in Rust systems engineering, distributed systems, applied research on retrieval, developer relations, and product design, with most positions concentrated in San Francisco [10].

In interviews, Huber has repeatedly emphasized a culture of slow, deliberate hiring focused on people who care about engineering craft. He has cited deterministic testing, TLA+ modeling for distributed protocols, and consensus algorithms as areas where he wants engineers with deep skill, reflecting a level of operational rigor not always associated with early-stage AI infrastructure startups. Chroma also runs the SF Systems Group, an informal reading group for engineers interested in systems papers and distributed systems theory.

On the open-source governance side, the project is hosted under the chroma-core GitHub organization. Pull requests are reviewed by the core team, and a public RFC process is used for larger architectural changes. Releases are cut on Mondays, and the project follows semantic versioning, with v1.x guaranteeing API stability for the Python and JavaScript clients [3].

current state (2025-2026)

By May 2026, Chroma had expanded well beyond its origins as a lightweight embedding store. The addition of sparse vector search, cloud hosting, data sync features, customer-managed encryption, BYOC deployments, and a Rust-powered distributed core shows a trajectory toward a more complete data platform for AI applications.

The project maintains its developer-first identity. The chromadb PyPI package continues to see millions of monthly downloads, and Chroma remains the default vector database in many AI tutorials, courses, and quickstart guides. Its presence in the LangChain and LlamaIndex ecosystems means new AI developers often encounter Chroma before any other vector database [1].

The competitive challenge for Chroma is bridging the gap between prototyping tool and production infrastructure. Chroma Cloud is the company's answer to this challenge, but it faces competition from well-funded managed services (Pinecone) and mature open-source projects (Weaviate, Qdrant, Milvus) that already have production-grade distributed architectures. Whether Chroma can retain its users as they move from prototype to production will likely determine the company's long-term trajectory. As of mid-2026 Chroma had not raised a Series A, with Huber suggesting in podcast interviews that the company would prefer to extend its runway through commercial revenue from Chroma Cloud rather than through additional venture capital [6][16].

references

[1] Chroma - the open-source data infrastructure for AI - Chroma
[2] ChromaDB - Wikipedia
[3] chroma-core/chroma on GitHub - GitHub
[4] Chroma raises $18M seed round - Chroma
[5] Show HN: Chroma Cloud - serverless search database for AI - Hacker News
[6] "RAG is Dead, Context Engineering is King" with Jeff Huber of Chroma - Latent Space Podcast
[7] Interview: Jeff Huber, Co-Founder and CEO of Standard Cyborg - 3DHeals
[8] Anton Troynikov personal site - Troynikov.io
[9] Anton Troynikov - American Terawatt LinkedIn profile - LinkedIn
[10] About Chroma - Chroma Documentation
[11] Pinecone vs Weaviate vs Chroma: Complete Vector Database Comparison - Aloa
[12] Chroma's Jeff Huber on Vector Databases and Getting AI into Production - Madrona Venture Group
[13] Chroma (vector database) - Grokipedia
[14] Chroma raises $18M for AI-Powered Database - SiliconANGLE
[15] Pinecone drops $100M investment on $750M valuation - TechCrunch
[16] Chroma 2026 Company Profile, Team, Funding & Competitors - Tracxn
[17] Learn How to Use Chroma DB: A Step-by-Step Guide - DataCamp
[18] Chroma Cloud Getting Started - Chroma Documentation
[19] Chroma is now 4x faster (Chroma 1.0) - Chroma
[20] Embedding Functions - Chroma Documentation
[21] Chroma Updates - Chroma
[22] Chroma Cloud feature documentation - Chroma Documentation
[23] Collection Forking - Chroma Documentation
[24] Architecture Overview - Chroma Documentation
[25] wal3: A Write-Ahead Log for Chroma, Built on Object Storage - Chroma Engineering Blog
[26] Retrieval powered by object storage - Chroma Engineering Blog
[27] Pinecone Price Increase: Is Chroma Cloud the Best Alternative? - Max Rohde
[28] Top 10 Vector Databases in 2026 - Medium
[29] Enhancing AI Models with Chroma: A Conversation with CEO Jeff Huber - Microsoft Semantic Kernel Blog
[30] Context Rot: How Increasing Input Tokens Impacts LLM Performance - Chroma Research
[31] Weaviate Raises $50 Million Series B Funding - PR Newswire
[32] Hammad Bashir - LinkedIn
