Chroma
Last reviewed
May 9, 2026
Sources
No citations yet
Review status
Needs citations
Revision
v6 · 6,345 words
Chroma is an open-source embedding database designed for artificial intelligence applications. It stores vector embeddings alongside their associated documents and metadata, then allows querying by nearest-neighbor similarity rather than traditional substring matching. Chroma was founded in 2022 by Jeff Huber and Anton Troynikov in San Francisco, and has become one of the most popular vector databases for prototyping retrieval-augmented generation (RAG) applications, largely because of its emphasis on simplicity: a working instance can be set up with pip install chromadb and a few lines of Python [1].
The project is distributed under the Apache 2.0 License and is developed primarily in Rust, Python, TypeScript, and Go. By May 2026 the chroma-core/chroma repository on GitHub had passed 27,900 stars and 2,200 forks, and the package was being downloaded more than 15 million times per month across the Python and JavaScript ecosystems [2][3]. The company behind the project, Chroma Inc., has raised approximately $20 million across a pre-seed and seed round, and operates a managed service called Chroma Cloud which reached general availability in 2025 [4][5].
Jeff Huber and Anton Troynikov founded Chroma in 2022 and incorporated the company in San Francisco. The pair had each spent the prior decade applying machine learning to physical-world products and had independently arrived at the same observation: while machine learning demos were easy to build, taking anything with a model in it from prototype to production felt, in Huber's words, "more like alchemy" than engineering [6].
Huber, who serves as chief executive officer, had previously co-founded Standard Cyborg in 2014, a Y Combinator Winter 2015 company that built a 3D scanning and dense-reconstruction pipeline for the depth sensors in Apple's iPhones. Standard Cyborg's software was used in custom prosthetics, footwear sizing, and consumer applications using the TrueDepth camera that powers FaceID. Huber studied at North Carolina State University and was born with fibular hemimelia, an experience that shaped his early interest in prosthetics and 3D scanning [7].
Troynikov, who originally served as chief technology officer, came from a robotics background. Earlier in his career he worked on perception, computer vision, and large-scale 3D mapping at companies including Nuro, the autonomous-delivery startup, and at Meta's Facebook Reality Labs, where he was a research engineer focused on AR and VR systems. He studied at the Technical University of Munich and has written extensively about latent-space representations and retrieval over the course of building Chroma. Troynikov stepped back from a full-time role in 2024 and now serves as an advisor to the company while running American Terawatt, a venture focused on industrial electric grid infrastructure [8][9].
After Troynikov stepped back, Hammad Bashir, a founding engineer, assumed the chief technology officer role. Bashir holds a BS in EECS from the University of California, Berkeley and previously held engineering positions at Snap, Tome, Uber, and Ubiquity6 (acquired by Discord). He leads a small engineering team that includes Ben Eggers, Liquan Pei, and Weili Gu, all working on Chroma's distributed retrieval stack [10].
The founders built Chroma around a specific thesis: that existing vector databases were too complex for the typical AI developer who just needed to store embeddings and retrieve them quickly. While competitors like Pinecone, Weaviate, and Milvus targeted enterprise-scale deployments with distributed architectures, Chroma targeted the developer who was building a prototype on a laptop and wanted something that worked immediately without configuration [11].
The first public release of the open-source library landed on GitHub on October 22, 2022. In December 2022, Huber spent several weeks reaching out to developers who were posting on Twitter about LangChain and embeddings, conducting what he later described as an unintentional product-discovery exercise with a small group of influential early adopters. A more polished launch followed on Valentine's Day in February 2023 (delayed by one day due to last-minute bug fixes) [6][12].
This approach resonated with the AI developer community. By early 2023 the project had been downloaded more than 35,000 times in its first two months, and within a year monthly Python downloads were measured in the millions. By 2026, the project had accumulated more than 27,900 GitHub stars, a Discord community with more than 10,000 members, and combined PyPI and npm downloads in excess of 15 million per month [1][3][13].
Chroma Inc. has raised approximately $20 million across two disclosed rounds, a small total relative to peers in the same category. The company has chosen to operate with a small headcount and a long runway, consistent with its emphasis on engineering quality over rapid commercial expansion [4].
| Round | Date | Amount | Lead investors | Notable participants |
|---|---|---|---|---|
| Pre-seed | May 2022 | Approximately $2.3M | AIX Ventures (Anthony Goldbloom, Kaggle), Bloomberg Beta (James Cham), AI Grant (Nat Friedman, Daniel Gross) | Multiple angels |
| Seed | April 2023 | $18M | Quiet Capital (Astasia Myers) | Naval Ravikant; Max and Jack Altman; Jordan Tigani (MotherDuck); Guillermo Rauch (Vercel); Akshay Kothari (Notion); Amjad Masad (Replit); Spencer Kimball (CockroachDB); founders from MongoDB, ScienceIO, Gumroad, Scale, Hugging Face, and Jasper |
The seed round was announced publicly on April 6, 2023. Quiet Capital's Astasia Myers led the round, with the long list of operator-angels reflecting the fact that early enthusiasm for Chroma came largely from founders of other developer-tools and infrastructure companies. According to interviews from that period, the round was designed to fund work on a distributed open-source system, on relevance scoring, and eventually on a hosted serverless service [4][14].
Compared with competitors, Chroma's fundraising has been modest. Pinecone raised more than $138 million by mid-2023, including a $100M Series B at a $750M valuation. Weaviate raised a $50M Series B in April 2023, bringing its total funding above $67M. Qdrant closed a $50M Series A in March 2026. As of May 2026, Chroma had not announced a Series A or Series B, although Huber has alluded in podcast appearances to extending the company's runway through commercial revenue rather than venture rounds [15][16].
Chroma's design philosophy centers on minimizing the distance between "I want to try vector search" and "I have vector search working." The project's tagline has evolved from "the AI-native open-source embedding database" to "the open-source data infrastructure for AI," reflecting its positioning as infrastructure purpose-built for AI workflows rather than a general-purpose database with vector capabilities bolted on.
Three design principles guide the project:
Chroma calls its commercial approach "buyer-based open source." The principle, stated on the project's about page, is that any feature that an individual developer would find useful will remain open source forever; only features oriented toward operating Chroma at organizational scale (single sign-on, audit logs, fleet management, BYOC deployments) become part of the commercial offering [10].
Chroma supports two storage modes out of the box. The default in-memory mode stores all data in RAM and loses it when the process exits, which is useful for experimentation and testing. Persistent mode writes data to disk, using SQLite for metadata and a file-based vector index, allowing data to survive restarts [17].
Switching between modes requires changing a single line of code:
```python
import chromadb

# In-memory (default)
client = chromadb.Client()

# Persistent
client = chromadb.PersistentClient(path="/path/to/data")
```
Chroma supports four client types, each suited to different deployment scenarios:
| Client type | Class | Description | Best for |
|---|---|---|---|
| Ephemeral | chromadb.Client() | In-memory, non-persistent | Testing and development |
| Persistent | chromadb.PersistentClient() | SQLite + local filesystem | Single-process applications, prototyping |
| HTTP | chromadb.HttpClient() | Connects to a remote Chroma server | Production multi-client access |
| Cloud | chromadb.CloudClient() | Connects to Chroma Cloud | Managed serverless production deployments |
The HTTP client is recommended for self-hosted production deployments because it enables scalability, high availability, and multi-client access. The Cloud client, introduced alongside Chroma Cloud's general availability in August 2025, points to the managed serverless service and uses the same query and ingestion APIs as the local clients [5][18].
Chroma provides official client libraries for the two languages most commonly used in AI development, with several community-supported clients on top:
| Client | Installation | Description |
|---|---|---|
| Python | pip install chromadb | Full-featured; can run embedded or as client |
| JavaScript / TypeScript | npm install chromadb | Full client for the Chroma server API |
| Ruby | gem install chromadb | Community-supported client |
| PHP | Composer package | Community-supported client |
| Java | Maven package | Community-supported client |
| Go | chromem-go and other packages | Community-supported clients |
The Python client is by far the most popular and supports both embedded mode (running the database in the same process as the application) and client mode (connecting to a separate Chroma server over HTTP). The JavaScript client connects to a running Chroma server and is commonly used in Node.js and browser-based applications. The Chroma 1.0 announcement in April 2025 stated that the Rust core would eventually power native bindings for JavaScript, Ruby, and Swift, plus a WebAssembly build for browser-only deployments [17][19].
Chroma can automatically generate embeddings for documents at insert time using a configured embedding function. By default, it uses the all-MiniLM-L6-v2 model from Sentence Transformers, which runs locally on the user's machine. Users can plug in alternative embedding providers without changing application logic [17].
Chroma supports the following embedding function providers:
| Provider | Python | TypeScript | Key notes |
|---|---|---|---|
| Sentence Transformers | Yes | Yes | Default provider; runs locally; uses all-MiniLM-L6-v2 |
| OpenAI | Yes | Yes | Uses OpenAI's embedding API |
| Cohere | Yes | Yes | Includes multilingual embedding models |
| Google Generative AI | Yes | Yes | Uses Google's embedding models |
| Hugging Face | Yes | No | Uses Hugging Face's hosted inference API |
| Hugging Face embedding server | Yes | Yes | Connects to a self-hosted Hugging Face embedding server |
| Jina AI | Yes | Yes | Jina AI's embedding models |
| Mistral | Yes | Yes | Mistral AI's embedding models |
| Together AI | Yes | Yes | Uses Together AI's hosted models |
| Cloudflare Workers AI | Yes | Yes | Runs on Cloudflare's edge infrastructure |
| Voyage AI | Yes | Yes | Voyage AI retrieval-tuned models |
| Morph | Yes | Yes | Morph AI's embedding service |
Developers can also create custom embedding functions by implementing the EmbeddingFunction interface. For TypeScript, individual npm packages are available per provider (for example @chroma-core/openai), plus an @chroma-core/all package that installs all providers [20].
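As a rough illustration of what a custom embedding function involves, the sketch below defines a callable that maps a list of documents to a list of fixed-length vectors. The class name `HashingEmbeddingFunction` and its token-hashing scheme are hypothetical and are not part of Chroma. The sketch uses only the standard library, and in real use the class would typically subclass the `EmbeddingFunction` interface from the `chromadb` package rather than stand alone.

```python
import hashlib
import math
from typing import List


class HashingEmbeddingFunction:
    """Toy embedding function: hashes tokens into a fixed-size vector.

    Hypothetical example. With chromadb installed, this class would
    typically subclass chromadb's EmbeddingFunction interface.
    """

    def __init__(self, dim: int = 64) -> None:
        self.dim = dim

    def __call__(self, input: List[str]) -> List[List[float]]:
        # One vector per input document, in the same order.
        return [self._embed(text) for text in input]

    def _embed(self, text: str) -> List[float]:
        vec = [0.0] * self.dim
        for token in text.lower().split():
            digest = hashlib.md5(token.encode("utf-8")).digest()
            bucket = int.from_bytes(digest[:4], "big") % self.dim
            vec[bucket] += 1.0
        norm = math.sqrt(sum(x * x for x in vec))
        # Normalize to unit length; leave an all-zero vector untouched.
        return [x / norm for x in vec] if norm else vec


ef = HashingEmbeddingFunction(dim=64)
vectors = ef(["Chroma stores embeddings", "chroma stores embeddings"])
print(len(vectors), len(vectors[0]))
```

A hashing scheme like this captures no semantics and is useful only for showing the shape of the interface: a list of strings in, a list of equal-length float vectors out.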
Chroma organizes data into collections, which are analogous to tables in a relational database. Each collection has its own embedding function and distance metric. Collections can be created, listed, deleted, and forked through the API. This simple organizational model avoids the complexity of indexes, namespaces, and shards found in more enterprise-oriented vector databases.
Collection operations include:
```python
# Create or get a collection
# (client is a Chroma client; openai_ef is an embedding function defined elsewhere)
collection = client.get_or_create_collection(
    name="my_docs",
    embedding_function=openai_ef,
    metadata={"hnsw:space": "cosine"}
)

# Add documents
collection.add(
    documents=["doc text 1", "doc text 2"],
    metadatas=[{"source": "wiki"}, {"source": "blog"}],
    ids=["id1", "id2"]
)

# Query
results = collection.query(
    query_texts=["search query"],
    n_results=5
)

# Update documents
collection.update(
    ids=["id1"],
    documents=["updated doc text"]
)

# Delete by ID or metadata filter
collection.delete(ids=["id2"])

# Count documents
count = collection.count()

# List all collections
collections = client.list_collections()
```
Collections do not require a predefined schema, allowing developers to start storing data immediately. Metadata keys and types can vary across documents within the same collection.
Every document stored in Chroma can carry arbitrary key-value metadata. At query time, users can apply filters on this metadata alongside the similarity search. For example, a query might search for the most semantically similar code snippets but restrict results to a particular programming language or repository [17].
```python
results = collection.query(
    query_texts=["How do I sort a list?"],
    n_results=5,
    where={"language": "python"}
)
```
Chroma supports $eq, $ne, $gt, $gte, $lt, $lte, $in, and $nin operators for metadata filtering, along with $and and $or for combining conditions.
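To make the semantics of these operators concrete, the following sketch evaluates a `where`-style filter against a single metadata dictionary in plain Python. It is an illustration only and not Chroma's implementation: the function name `matches` is hypothetical, a bare value is treated as shorthand for `$eq`, and a missing key is assumed never to match.

```python
from typing import Any, Dict

# Comparison operators named in the text.
_COMPARATORS = {
    "$eq": lambda a, b: a == b,
    "$ne": lambda a, b: a != b,
    "$gt": lambda a, b: a > b,
    "$gte": lambda a, b: a >= b,
    "$lt": lambda a, b: a < b,
    "$lte": lambda a, b: a <= b,
    "$in": lambda a, b: a in b,
    "$nin": lambda a, b: a not in b,
}


def matches(metadata: Dict[str, Any], where: Dict[str, Any]) -> bool:
    """Return True if `metadata` satisfies the `where` filter."""
    for key, condition in where.items():
        if key == "$and":
            if not all(matches(metadata, sub) for sub in condition):
                return False
        elif key == "$or":
            if not any(matches(metadata, sub) for sub in condition):
                return False
        else:
            # Assumption of this sketch: a missing key never matches.
            if key not in metadata:
                return False
            # A bare value is treated as shorthand for {"$eq": value}.
            if not isinstance(condition, dict):
                condition = {"$eq": condition}
            for op, operand in condition.items():
                if not _COMPARATORS[op](metadata[key], operand):
                    return False
    return True


doc = {"language": "python", "stars": 120}
where = {"$and": [{"language": {"$in": ["python", "rust"]}},
                  {"stars": {"$gte": 100}}]}
print(matches(doc, where))  # True
```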
Additionally, Chroma supports document-level filtering with the where_document parameter, which allows filtering by the content of the document itself using $contains and $not_contains operators:
```python
results = collection.query(
    query_texts=["sorting algorithms"],
    n_results=5,
    where={"language": "python"},
    where_document={"$contains": "quicksort"}
)
```
In October 2025 Chroma added first-class support for sparse vector search, including BM25 and SPLADE representations. Combined with the existing dense vector index, this gives Chroma collections hybrid search capabilities comparable to those long offered by Weaviate and Pinecone [21].
Chroma Cloud also exposes trigram-based full-text search and regular-expression search, which the company has highlighted as features oriented toward retrieval for code-search agents and other tasks where developers want substring or pattern matching alongside semantic similarity [22].
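The general idea behind trigram-based substring search can be sketched in a few lines. This is a textbook illustration under stated assumptions, not a description of how Chroma Cloud implements the feature; the `TrigramIndex` class and its methods are hypothetical names. Each document is indexed under every three-character window it contains, a query is reduced to its own trigrams, and the candidates that share all of them are then verified with an exact substring check.

```python
from collections import defaultdict
from typing import Dict, List, Set


def trigrams(text: str) -> Set[str]:
    """All three-character windows of `text`."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


class TrigramIndex:
    def __init__(self) -> None:
        self.docs: Dict[str, str] = {}
        self.postings: Dict[str, Set[str]] = defaultdict(set)

    def add(self, doc_id: str, text: str) -> None:
        self.docs[doc_id] = text
        for gram in trigrams(text):
            self.postings[gram].add(doc_id)

    def search(self, needle: str) -> List[str]:
        grams = trigrams(needle)
        if not grams:
            # Needle shorter than three characters: fall back to a full scan.
            candidates = set(self.docs)
        else:
            # A match must contain every trigram of the needle.
            candidates = set.intersection(
                *(self.postings.get(g, set()) for g in grams)
            )
        # Sharing all trigrams is necessary but not sufficient, so verify.
        return sorted(d for d in candidates if needle in self.docs[d])


index = TrigramIndex()
index.add("a", "def quicksort(items): ...")
index.add("b", "def mergesort(items): ...")
print(index.search("quicksort"))  # ['a']
print(index.search("sort("))      # ['a', 'b']
```

The final verification step matters because two documents can share every trigram of a query without containing the query as one contiguous string.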
In 2025, Chroma Cloud introduced collection forking, a copy-on-write feature that allows a collection of any size to be branched into a new collection almost instantly. New writes to either branch allocate new storage blocks while unchanged data remains shared, so users only pay for the incremental data they actually change. The primary use cases are dataset versioning, A/B experimentation on retrieval quality, and incremental re-indexing during a data refresh [23].
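Copy-on-write forking can be pictured with a small in-memory model. The sketch below is an assumption-laden illustration, not Chroma's storage engine: `BlockStore` and `Collection` are hypothetical classes, and blocks are modeled as immutable tuples. A fork copies only the list of block references, and a later write to either branch allocates a new block without touching the shared ones.

```python
from typing import Dict, List, Optional, Tuple


class BlockStore:
    """Immutable blocks addressed by an integer id."""

    def __init__(self) -> None:
        self.blocks: Dict[int, Tuple[str, ...]] = {}
        self._next_id = 0

    def put(self, records: Tuple[str, ...]) -> int:
        block_id = self._next_id
        self._next_id += 1
        self.blocks[block_id] = records
        return block_id


class Collection:
    def __init__(self, store: BlockStore,
                 block_ids: Optional[List[int]] = None) -> None:
        self.store = store
        self.block_ids = list(block_ids or [])

    def add(self, records: List[str]) -> None:
        # Every write allocates a new block; existing blocks never change.
        self.block_ids.append(self.store.put(tuple(records)))

    def fork(self) -> "Collection":
        # Copies only the list of block references, not the blocks.
        return Collection(self.store, self.block_ids)

    def records(self) -> List[str]:
        return [r for b in self.block_ids for r in self.store.blocks[b]]


store = BlockStore()
main = Collection(store)
main.add(["doc1", "doc2"])
branch = main.fork()
branch.add(["doc3"])

print(main.records())     # ['doc1', 'doc2']
print(branch.records())   # ['doc1', 'doc2', 'doc3']
print(len(store.blocks))  # 2
```

Two blocks exist after the fork and the extra write: one shared by both branches and one owned by the fork alone, which mirrors the "pay only for what changes" property described above.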
Chroma supports multi-modal embeddings through integration with models like OpenCLIP, which embed both text and images into a shared semantic space. This enables applications to perform unified searches across different data types within the same collection.
For example, a collection using an OpenCLIP embedding function can accept both text documents and image URIs. A text query would return relevant results regardless of whether they are text documents or images, because both are represented in the same vector space. This capability is useful for applications like visual search, content moderation, and multi-modal RAG systems where context may include both textual and visual information [20].
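Chroma's local storage architecture consists of two main components: a SQLite metadata store, kept in a chroma.sqlite3 file in the configured data directory, and a file-based HNSW vector index, kept in one directory per collection.
The directory structure for persistent storage is organized as: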
```
/path/to/data/
    chroma.sqlite3          # Metadata store
    <collection-uuid>/      # One directory per collection
        data_level0.bin     # HNSW graph data
        header.bin          # Index header
        length.bin          # Vector lengths
```
For Chroma Cloud, data is stored in object storage with automatic tiering and caching. The cloud storage system implements a multi-tier caching architecture with admission control to minimize object storage API costs and query latency [24].
Chroma exposes several HNSW parameters through collection metadata:
| Parameter | Default | Description |
|---|---|---|
| hnsw:space | l2 | Distance function (l2, cosine, ip) |
| hnsw:construction_ef | 100 | Size of candidate list during index construction |
| hnsw:search_ef | 10 | Size of candidate list during search |
| hnsw:M | 16 | Maximum number of connections per node |
| hnsw:num_threads | 4 | Number of threads for index operations |
| hnsw:resize_factor | 1.2 | Factor by which the index grows when capacity is exceeded |
Chroma supports three distance metrics for similarity search, configured per collection at creation time:
| Metric | Configuration value | Formula | Range | Use case |
|---|---|---|---|---|
| Squared L2 (Euclidean) | l2 | Sum of squared differences | [0, infinity) | Default; general-purpose distance |
| Cosine distance | cosine | 1 - cosine similarity | [0, 2] | Normalized embeddings; most text embedding models |
| Inner product | ip | 1 - dot product | (-infinity, infinity) | When vector magnitude carries meaning |
The distance metric is set via collection metadata at creation time:
```python
collection = client.create_collection(
    name="my_collection",
    metadata={"hnsw:space": "cosine"}  # or "l2" or "ip"
)
```
For most text embedding models (OpenAI, Cohere, Sentence Transformers), cosine distance is the recommended choice because these models produce normalized embeddings where the angle between vectors is more meaningful than the raw distance [17].
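The relationship between the three metrics is easier to see with numbers. The sketch below computes each distance in plain Python for two unit-length vectors. The function names are illustrative, and the inner-product distance is taken to be one minus the dot product, the convention used by hnswlib; treat that as an assumption of this example. For unit-length vectors, squared L2 distance equals twice the cosine distance, so both metrics rank neighbors identically.

```python
import math
from typing import Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def squared_l2(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    return 1.0 - dot(a, b) / (norm_a * norm_b)


def inner_product_distance(a: Sequence[float], b: Sequence[float]) -> float:
    # Assumed convention: 1 - dot product (as in hnswlib).
    return 1.0 - dot(a, b)


# Two unit-length vectors.
a = [0.6, 0.8]
b = [1.0, 0.0]

print(round(squared_l2(a, b), 6))              # 0.8
print(round(cosine_distance(a, b), 6))         # 0.4
print(round(inner_product_distance(a, b), 6))  # 0.4
```

On unnormalized vectors the three metrics diverge, which is why the choice matters when magnitude carries meaning.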
The distributed deployment of Chroma, which underpins Chroma Cloud, separates storage and compute through object storage as a shared layer. The architecture has three main service tiers [24][25]:
A distributed write-ahead log called wal3 provides durability for incoming writes before compaction. wal3 was introduced in 2025 and uses Amazon S3's conditional write feature, which became generally available in November 2024, to implement a lock-free append protocol. Each collection has an independent log that is implemented as a linked list of file fragments backed by S3 manifest files; a content checksum called a setsum allows the implementation to verify log integrity in constant time during append and trim operations. The system was written in Rust and is open source under the same Apache 2.0 license as the rest of the project [25].
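Two of the ideas in this paragraph, lock-free appends through conditional writes and an order-independent checksum, can be sketched with an in-memory stand-in for object storage. This is a heavily simplified model under stated assumptions, not the wal3 protocol: `ObjectStore`, `append`, `item_hash`, and `checksum` are hypothetical names, manifests and fragment linking are omitted, and the checksum is a single-column additive hash used only to show why adding or removing records takes constant time.

```python
import hashlib
from typing import Dict, Iterable, Optional

# Prime modulus for the toy checksum (an assumption of this sketch).
MODULUS = 2**61 - 1


class ObjectStore:
    """In-memory stand-in for a bucket that supports put-if-absent."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put_if_absent(self, key: str, value: bytes) -> bool:
        # Mimics a conditional PUT that fails if the key already exists.
        if key in self._objects:
            return False
        self._objects[key] = value
        return True

    def get(self, key: str) -> Optional[bytes]:
        return self._objects.get(key)


def append(store: ObjectStore, log: str, start_seq: int, payload: bytes) -> int:
    """Claim the first free sequence number at or after start_seq."""
    seq = start_seq
    # No lock is taken: losing a race just means retrying at the next slot.
    while not store.put_if_absent(f"{log}/{seq:08d}", payload):
        seq += 1
    return seq


def item_hash(data: bytes) -> int:
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big") % MODULUS


def checksum(records: Iterable[bytes]) -> int:
    """Order-independent checksum: the sum of per-record hashes."""
    total = 0
    for record in records:
        total = (total + item_hash(record)) % MODULUS
    return total


store = ObjectStore()
# Both writers believe the tail of the log is at sequence 0.
seq_a = append(store, "collection-1", 0, b"write from A")
seq_b = append(store, "collection-1", 0, b"write from B")
print(seq_a, seq_b)  # 0 1

records = [b"r1", b"r2", b"r3"]
full = checksum(records)
# Trimming the first record: subtract its hash instead of rescanning the log.
after_trim = (full - item_hash(records[0])) % MODULUS
print(after_trim == checksum(records[1:]))  # True
```

Because the checksum is additive, an append adds one hash and a trim subtracts the hashes of the removed records, so integrity can be checked without rereading the whole log.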
Chroma's engineering team has argued that object storage is roughly an order of magnitude cheaper than a comparable replicated SSD configuration while still offering more than 1 GB/s of parallel throughput. The trade-off is higher per-request latency, in the range of 35 to 100 milliseconds, which they argue is tolerable for retrieval workloads when combined with aggressive caching [24].
Chroma has become the default vector database for developers building their first retrieval-augmented generation applications. Several factors contribute to this:
First, the setup friction is minimal. A developer can go from zero to a working RAG prototype in under 20 lines of Python. There is no server to start, no API key to obtain, and no cloud account to create.
Second, Chroma integrates with most widely used AI frameworks:
| Framework | Integration type |
|---|---|
| LangChain | Native vector store (Chroma class) |
| LlamaIndex | Vector store connector |
| Semantic Kernel | Memory store provider |
| Haystack | Document store |
| CrewAI | Knowledge base |
| DSPy | Retrieval module |
| AutoGen | Memory provider |
| LangGraph | Persistent memory |
Third, the in-memory mode means there is no operational overhead during development. Developers can iterate on their embedding strategy, chunking approach, and retrieval parameters without waiting for data to be uploaded to a remote service.
The trade-off is that Chroma's simple architecture can become a limitation at production scale. For applications handling hundreds of millions of vectors, distributed databases like Milvus, Weaviate, or Pinecone are better suited. Many teams prototype with Chroma and then either migrate to a more scalable system for production or, more recently, move to Chroma Cloud once they outgrow the embedded process model [11].
Chroma supports three primary deployment patterns, each suited to different scale and operational requirements:
| Pattern | Description | Scale | Recommended for |
|---|---|---|---|
| Local / embedded | Library runs in the application process | Small; under 1M records | Prototyping, testing, single-user applications |
| Single-node server | Standalone Chroma server accessed via HTTP | Medium; up to ~10M records | Small production workloads, multi-client access |
| Distributed | Multi-service deployment on Kubernetes or via Chroma Cloud | Large; millions of collections | Large-scale production with high availability |
All three patterns use the same core codebase and expose the same API, so code written for the embedded pattern works without modification against a remote server, a Kubernetes deployment, or Chroma Cloud. This is a direct result of Chroma's progressive complexity philosophy [24].
For Docker deployments, Chroma provides an official Docker image:
```shell
docker run -p 8000:8000 chromadb/chroma
```
For Kubernetes deployments, the community maintains Helm charts, and Chroma's distributed mode splits the system into multiple services that can be scaled independently.
Chroma Cloud is the company's hosted service. Following an extended private alpha during 2024, the service entered public beta in early 2025 and reached general availability in August 2025 alongside the v1.4.x line of releases. It provides serverless vector, hybrid, and full-text search as a managed service [5][26].
| Feature | Description |
|---|---|
| Setup time | Under 30 seconds to create a database |
| Free credits | $5 of free credits for new users; an additional $100 in credits has been offered to new accounts during launch promotions |
| Storage | Object storage with automatic data tiering and caching |
| Replication | Multi-region replication |
| Management | Zero-ops; no infrastructure to manage |
| Security | SOC 2 Type II certification; customer-managed encryption keys (January 2026) |
| Recall in production | Greater than 90 percent recall reported in published benchmarks |
| Search APIs | Vector, sparse vector, full-text (trigram), regex, and metadata filtering |
| Forking | Copy-on-write collection forking for dataset versioning |
Chroma Cloud runs in two regions as of May 2026: AWS US East (N. Virginia, aws-us-east-1) and Google Cloud Europe West 1 (Belgium, gcp-europe-west1). Each database stays entirely within the region the customer chooses. Some surface-level features such as Chroma Sync, the CLI tools, and the Search Agent are available only in the US region as of mid-2026 [18].
In February 2026, Chroma added a Bring Your Own Cloud (BYOC) option that allows enterprise customers to run Chroma's distributed deployment inside their own AWS or GCP account while still being managed by Chroma's control plane. The pricing model for the multi-tenant service is consumption-based, with storage charged at roughly $0.02 per gigabyte per month on object storage and writes and queries metered per operation [27].
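As a back-of-the-envelope illustration of the storage component of that pricing, the sketch below estimates the monthly cost of raw vector data. It rests on several assumptions not stated in the source: vectors are stored as 32-bit floats, a gigabyte is 10^9 bytes, and documents, metadata, index overhead, and per-operation write and query charges are all ignored. The function name is hypothetical.

```python
BYTES_PER_FLOAT32 = 4
PRICE_PER_GB_MONTH = 0.02  # approximate storage price quoted in the text


def monthly_storage_cost(num_vectors: int, dimensions: int,
                         price_per_gb: float = PRICE_PER_GB_MONTH) -> float:
    """Estimated monthly cost in dollars for raw float32 vector data only."""
    raw_bytes = num_vectors * dimensions * BYTES_PER_FLOAT32
    gigabytes = raw_bytes / 1e9  # decimal gigabytes (assumption)
    return gigabytes * price_per_gb


# One million 1536-dimensional vectors: about 6.144 GB of raw floats.
cost = monthly_storage_cost(1_000_000, 1536)
print(f"${cost:.4f} per month")  # $0.1229 per month
```

Real bills would be higher once documents, metadata, and metered operations are included; the point is only that raw vector storage on object storage is a small line item at this scale.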
Chroma also offers two managed ingestion products. Chroma Sync, launched in October 2025, automatically chunks, embeds, and indexes the contents of GitHub repositories, with re-syncs accelerated by collection forking so that only changed files are re-processed. Chroma Web Sync, launched in December 2025, performs a similar pipeline for arbitrary web URLs, including crawling, scraping, chunking, and embedding [21].
Chroma's performance profile reflects its design priorities. For small to medium datasets (under 10 million vectors), Chroma delivers competitive query latency, especially when running in embedded mode where there is no network overhead.
The Chroma 1.0 release on April 3, 2025, was a significant performance milestone. By reimplementing the core data path in Rust and exposing it to Python through PyO3 bindings, the project eliminated Python's Global Interpreter Lock as a bottleneck for write and query throughput. Chroma's own benchmarks reported three to five times faster writes and three to five times faster queries compared with the previous Python core, leading the company to advertise the release as "4x faster" overall [19].
In August 2025, an optimization that switched on-the-wire vector representations to base64 encoding achieved a 70 percent increase in data throughput, further improving performance for bulk data operations [3].
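The intuition behind this change can be shown by comparing two ways of serializing the same vector. The sketch below is illustrative only: the source does not specify Chroma's actual wire format, so the little-endian float32 packing and the helper names `encode_vector` and `decode_vector` are assumptions, and the size ratio it prints is not the 70 percent throughput figure quoted above.

```python
import base64
import json
import random
import struct
from typing import List


def encode_vector(vec: List[float]) -> str:
    """Pack floats as little-endian float32 and base64-encode the bytes."""
    packed = struct.pack(f"<{len(vec)}f", *vec)
    return base64.b64encode(packed).decode("ascii")


def decode_vector(text: str) -> List[float]:
    packed = base64.b64decode(text)
    return list(struct.unpack(f"<{len(packed) // 4}f", packed))


rng = random.Random(0)
vector = [rng.uniform(-1.0, 1.0) for _ in range(384)]

as_json_list = json.dumps(vector)               # one decimal string per float
as_base64 = json.dumps(encode_vector(vector))   # one compact string

print(len(as_json_list), len(as_base64))

# Round trip is exact up to float32 precision.
restored = decode_vector(encode_vector(vector))
print(max(abs(x - y) for x, y in zip(vector, restored)) < 1e-6)  # True
```

A float32 takes four bytes, which base64 expands to a little over five characters, whereas a decimal rendering of the same number in a JSON array often needs fifteen or more characters plus separators.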
For latency-sensitive applications, the embedded persistent client offers the lowest latency since it avoids network round trips entirely. The HTTP client adds network latency but enables multi-client access and horizontal scaling. Independent benchmarking through 2025 and 2026 has consistently shown that Chroma performs well at moderate scale (around one to ten million vectors) but loses ground to Pinecone and Weaviate beyond that range; Chroma Cloud has narrowed the gap, with the company publishing 20 ms p50 latency numbers for smaller workloads [28].
| Version | Date | Notes |
|---|---|---|
| Initial public release | October 22, 2022 | Open-source Python library; in-memory only |
| 0.3.x | February 2023 | Polished launch; persistent SQLite metadata store |
| 0.4.x | July 19, 2023 | Improved API; HTTP client mode |
| 0.5.x | 2024 | Distributed Chroma preview, multi-modal support, server stabilization |
| 1.0.0 (pre-release) | April 3, 2025 | Rust core; 3-5x writes and queries; full API compatibility |
| 1.0.x | mid-2025 | Stabilization of Rust core; PyO3 Python bindings |
| 1.4.x | August 2025 | Chroma Cloud general availability; hybrid search |
| 1.5.x | March-May 2026 | Latest stable line; latest tag 1.5.9 (May 5, 2026) |
Releases follow a weekly cadence on Mondays, with the latest stable tag at the time of writing being 1.5.9 from May 2026 [3].
In 2025 Jeff Huber became increasingly vocal about a concept he calls "context engineering" as distinct from prompt engineering. In his framing, the quality of large language model outputs depends less on how prompts are worded and more on what context (retrieved documents, user history, structured data) is provided to the model. Chroma's product roadmap reflects this perspective: features like Sync, Web Sync, and forking are designed to make it easier to build and maintain the context that gets fed to language models, rather than just storing and retrieving raw vectors [29].
The company has also published a series of research notes under the Chroma Research banner. Recurring themes include:
| Report | Date | Topic |
|---|---|---|
| Evaluating chunking strategies for retrieval | July 2024 | Compared chunking heuristics on retrieval quality |
| Embedding adapters | May 2024 | Lightweight task-specific adaptations of off-the-shelf embedding models |
| Generative benchmarking | April 2025 | Synthesizing evaluation pairs for retrieval with LLMs |
| Context Rot | July 2025 | Showed that 18 frontier models all lose accuracy as input length grows |
| Context-1 technical report | March 2026 | Follow-up framework for context engineering |
The Context Rot report, authored by Kelly Hong, Anton Troynikov, and Jeff Huber, drew significant attention in summer 2025. It tested 18 frontier models, including GPT-4.1, Claude Opus 4, and Gemini 2.5, and reported that every one of them showed some performance degradation as input context length increased, even on tasks that were trivial at small lengths. The report's authors used the result to argue against "just stuff everything in the context window" approaches and in favor of investing in retrieval and ranking quality [30].
This framing positions Chroma not just as a vector database but as context infrastructure for AI applications: a layer that sits between raw data sources and language models, responsible for ensuring the model has access to the right information at inference time.
Chroma occupies a distinct position in the vector database market, optimized for developer experience and rapid prototyping rather than enterprise scale, although Chroma Cloud and the distributed deployment options have begun to bridge that gap.
| Feature | Chroma | Pinecone | Weaviate | Qdrant | Milvus | FAISS |
|---|---|---|---|---|---|---|
| License | Apache 2.0 | Proprietary | BSD-3-Clause | Apache 2.0 | Apache 2.0 | MIT |
| Primary language | Rust / Python | Closed-source service | Go | Rust | Go / C++ | C++ |
| Setup complexity | Minimal (pip install) | Low (cloud API) | Moderate (Docker / K8s) | Moderate (Docker / K8s) | Moderate to high | Library only |
| In-memory mode | Yes | No | No | No | No | Yes |
| Hybrid search | Yes (since October 2025) | Yes (sparse-dense) | Yes (BM25 + vector) | Yes | Yes | No |
| Managed cloud | Chroma Cloud | Pinecone (only option) | Weaviate Cloud | Qdrant Cloud | Zilliz Cloud | None |
| Best for | Prototyping; small to medium production | Enterprise managed service | Open source at scale | High-performance open source | Very large scale | Library-level use |
| GitHub stars | 27,900+ | n/a (closed source) | 14,000+ | 22,000+ | 32,000+ | 33,000+ |
| Multi-modal support | Yes (OpenCLIP) | No native | Yes (CLIP, ImageBind modules) | Yes (multi-vector) | Yes | No |
The main advantage Chroma holds over its competitors is the speed of getting started. No other vector database matches the simplicity of pip install chromadb followed by a few lines of Python to have a working vector store. The main historical limitation has been scalability: Chroma's single-process architecture meant it could not distribute data across multiple nodes the way Milvus, Weaviate, or Qdrant can, although the distributed deployment pattern and Chroma Cloud have begun to address this gap [11][28].
For developers and teams choosing between these options, the decision often comes down to where they are in the development lifecycle. Chroma is the most common choice for exploration, prototyping, and applications with modest data volumes. As the dataset grows or production requirements become more demanding, teams typically evaluate Pinecone (for managed simplicity), Weaviate (for open-source flexibility with hybrid search), Qdrant (for raw performance), or Milvus (for very large-scale deployments). FAISS, Meta's open-source library, is more often used as a low-level component than as a standalone database; Chroma itself shipped early versions that wrapped FAISS for the index before moving to a custom HNSW implementation in Rust.
Chroma's GitHub statistics offer one view of its reach. As of May 2026 the project had more than 27,900 stars, more than 2,200 forks, and was a dependency in more than 90,000 open-source codebases. Combined PyPI and npm downloads exceeded 15 million per month, and Chroma had served more than 60 million cumulative downloads since its initial release [3][6].
Chroma is the default or primary vector store recommended in many introductory RAG tutorials, including those published by LangChain, LlamaIndex, DataCamp, DeepLearning.AI's short course materials, and the official documentation of multiple LLM providers. The langchain-chroma package, the LangChain community's most popular vector store integration, sees more than two million monthly downloads in its own right.
The Chroma team has begun publishing customer case studies highlighting commercial usage. Examples include Mintlify, which uses Chroma to power retrieval over its customers' developer documentation, and Propel AI, which uses it as the retrieval layer for code-review agents. Other commercial users referenced on the company's website include Capital One and UnitedHealthcare, although the depth of integration in those organizations has not been publicly disclosed [1].
The company also maintains the Chroma Model Context Protocol (MCP) server, an implementation of the MCP standard that exposes Chroma collections as a tool that an LLM agent can read from and write to directly. This positions Chroma as a memory and retrieval backend for agents built on Anthropic's Claude tools, OpenAI's Responses API, and other agent frameworks that consume MCP servers.
In 2026 Chroma introduced Package Search MCP, a hosted MCP server backed by Chroma Cloud that indexes the source code of popular open-source packages from registries such as PyPI and npm. Coding agents can query the server to retrieve the actual source of a function or symbol they are using, rather than relying on what the model memorized during training. The product targets a specific failure mode in code-generation agents, where models confidently invent APIs that do not exist or that have changed between versions. Giving the agent a retrieval call against the real source lets it ground its output in current code. The use case also fits the company's broader "context engineering" thesis [21].
A simple RAG pipeline using Chroma and an LLM provider typically looks like the following in Python:
import chromadb
from chromadb.utils.embedding_functions import OpenAIEmbeddingFunction
from openai import OpenAI

client = chromadb.PersistentClient(path="./chroma_db")
openai_ef = OpenAIEmbeddingFunction(
    api_key="sk-...",
    model_name="text-embedding-3-small",
)
collection = client.get_or_create_collection(
    name="docs",
    embedding_function=openai_ef,
    metadata={"hnsw:space": "cosine"},
)

# 1. Ingest
collection.add(
    ids=["d1", "d2", "d3"],
    documents=[
        "Chroma is an open source embedding database.",
        "Pinecone is a managed vector database service.",
        "Weaviate is an open source vector database.",
    ],
    metadatas=[
        {"source": "wiki", "vendor": "chroma"},
        {"source": "wiki", "vendor": "pinecone"},
        {"source": "wiki", "vendor": "weaviate"},
    ],
)

# 2. Retrieve
results = collection.query(
    query_texts=["What is Chroma?"],
    n_results=2,
    where={"source": "wiki"},
)
context = "\n".join(results["documents"][0])

# 3. Generate
llm = OpenAI(api_key="sk-...")
response = llm.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Answer using the provided context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: What is Chroma?"},
    ],
)
print(response.choices[0].message.content)
This pattern shows up in some form in nearly every Chroma tutorial: a PersistentClient for storage, a configured embedding function, a single add call for ingestion, and a query call that combines vector search with metadata filtering before the retrieved documents are passed to a model. Switching this same code from local prototyping to Chroma Cloud requires changing only the client constructor.
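The constructor swap described above can be sketched as a small helper that decides, from configuration, whether to build a local or a hosted client. This is an illustrative sketch rather than an official pattern: the helper names client_spec and make_client, the CHROMA_* environment variable names, and the default path are assumptions made for the example. Only chromadb.PersistentClient (with path) and chromadb.CloudClient (with api_key, tenant, and database) are taken from the library.

```python
import os
from typing import Any, Mapping


def client_spec(env: Mapping[str, str]) -> tuple[str, dict[str, Any]]:
    """Choose a chromadb client constructor name and its keyword arguments.

    The CHROMA_* variable names are assumptions made for this sketch.
    """
    api_key = env.get("CHROMA_API_KEY")
    if api_key:
        # Hosted: chromadb.CloudClient takes an API key plus tenant and database.
        return "CloudClient", {
            "api_key": api_key,
            "tenant": env.get("CHROMA_TENANT"),
            "database": env.get("CHROMA_DATABASE"),
        }
    # Local: chromadb.PersistentClient stores its data on disk under `path`.
    return "PersistentClient", {"path": env.get("CHROMA_PATH", "./chroma_db")}


def make_client(env: Mapping[str, str] = os.environ):
    """Build the client; the collection code that follows stays unchanged."""
    import chromadb  # imported lazily so client_spec works without the package

    name, kwargs = client_spec(env)
    return getattr(chromadb, name)(**kwargs)
```

With a helper like this, the pipeline would call make_client() in place of the hard-coded PersistentClient line, and the get_or_create_collection, add, and query calls would not need to change. Reading credentials from the environment also avoids embedding API keys in source files.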
The company behind Chroma, Chroma Inc., is headquartered in San Francisco's Mission District. Public estimates put headcount in the range of 20 to 30 full-time employees as of mid-2026, considerably smaller than competitors such as Pinecone and Weaviate, both of which have raised significantly more capital. Job listings on the company's site advertise roles in Rust systems engineering, distributed systems, applied research on retrieval, developer relations, and product design, with most positions concentrated in San Francisco [10].
In interviews, Huber has repeatedly emphasized a culture of slow, deliberate hiring focused on people who care about engineering craft. He has cited deterministic testing, TLA+ modeling for distributed protocols, and consensus algorithms as areas where he wants engineers with deep skill, reflecting a level of operational rigor not always associated with early-stage AI infrastructure startups. Chroma also operates the SF Systems Group reading group, an informal community for engineers interested in systems papers and distributed systems theory.
On the open-source governance side, the project is hosted under the chroma-core GitHub organization. Pull requests are reviewed by the core team, and a public RFC process is used for larger architectural changes. Releases are cut on Mondays, and the project follows semantic versioning, with v1.x guaranteeing API stability for the Python and JavaScript clients [3].
By May 2026, Chroma had expanded well beyond its origins as a lightweight embedding store. The addition of sparse vector search, cloud hosting, data sync features, customer-managed encryption, BYOC deployments, and a Rust-powered distributed core shows a trajectory toward a more complete data platform for AI applications.
The project maintains its developer-first identity. The chromadb PyPI package continues to see millions of monthly downloads, and Chroma remains the default vector database in many AI tutorials, courses, and quickstart guides. Its presence in the LangChain and LlamaIndex ecosystems means new AI developers often encounter Chroma before any other vector database [1].
The competitive challenge for Chroma is bridging the gap between prototyping tool and production infrastructure. Chroma Cloud is the company's answer to this challenge, but it faces competition from well-funded managed services (Pinecone) and mature open-source projects (Weaviate, Qdrant, Milvus) that already have production-grade distributed architectures. Whether Chroma can retain its users as they move from prototype to production will likely determine the company's long-term trajectory. As of mid-2026 Chroma had not raised a Series A, with Huber suggesting in podcast interviews that the company would prefer to extend its runway through commercial revenue from Chroma Cloud rather than through additional venture capital [6][16].