Chroma is an open-source embedding database designed for artificial intelligence applications. It stores vector embeddings alongside their associated documents and metadata, then allows querying by nearest-neighbor similarity rather than traditional substring matching. Chroma was founded in 2022 by Jeff Huber and Anton Troynikov in San Francisco, and has become one of the most popular vector databases for prototyping retrieval-augmented generation (RAG) applications, largely because of its emphasis on simplicity: a working instance can be set up with pip install chromadb and a few lines of Python [1].
Jeff Huber and Anton Troynikov founded Chroma in 2022. Huber, who serves as CEO, previously worked in applied AI and had experience building production machine learning systems. Troynikov, the CTO, brought expertise in systems architecture and distributed computing [2].
The founders built Chroma around a specific thesis: that existing vector databases were too complex for the typical AI developer who just needed to store embeddings and retrieve them quickly. While competitors like Pinecone, Weaviate, and Milvus targeted enterprise-scale deployments with distributed architectures, Chroma targeted the developer who was building a prototype on a laptop and wanted something that worked immediately without configuration [3].
This approach resonated with the AI developer community. By 2025, the project had accumulated over 25,000 GitHub stars, a Discord community with more than 10,000 members, and monthly PyPI downloads exceeding 5 million [1].
Chroma has raised approximately $18 million in seed funding [4].
| Round | Date | Amount | Lead Investor |
|---|---|---|---|
| Pre-seed | May 2022 | Undisclosed | Various angels |
| Seed | April 2023 | $18M | Quiet Capital (Astasia Myers) |
The seed round included angel investors Naval Ravikant, Max and Jack Altman, Jordan Tigani (Motherduck), Guillermo Rauch (Vercel), Akshay Kothari (Notion), Amjad Masad (Replit), and Spencer Kimball (CockroachDB), among others. The participation of founders from well-known developer tools companies reflected confidence in Chroma's developer-first approach [4].
Compared to competitors, Chroma's fundraising has been modest. Pinecone raised $138 million (including a $100M Series B at a $750M valuation), and Weaviate raised over $67 million. Chroma has chosen to operate with a smaller team and leaner burn rate, consistent with its focus on simplicity [5][6].
Chroma's design philosophy centers on minimizing the distance between "I want to try vector search" and "I have vector search working." The project's tagline, "the AI-native open-source embedding database," reflects its positioning as infrastructure purpose-built for AI workflows rather than a general-purpose database with vector capabilities bolted on.
Three design principles guide the project:
Simplicity over configurability. Chroma uses sensible defaults and requires minimal configuration. There are no index types to choose, no cluster topologies to design, and no sharding strategies to decide on for the common case.
Local-first development. Chroma runs in-process by default. A developer can import the library, create a collection, add documents, and query them, all without starting a separate server process.
Progressive complexity. As needs grow, developers can move from in-memory to persistent storage, from embedded to client-server mode, and from self-hosted to Chroma Cloud, without rewriting application code.
Chroma supports two storage modes out of the box. The default in-memory mode stores all data in RAM and loses it when the process exits, which is useful for experimentation and testing. Persistent mode writes data to disk (using SQLite for metadata and a file-based vector index), allowing data to survive restarts [7].
Switching between modes requires changing a single line of code:
```python
# In-memory (default)
import chromadb
client = chromadb.Client()

# Persistent
client = chromadb.PersistentClient(path="/path/to/data")
```
Chroma supports four client types, each suited to different deployment scenarios:
| Client type | Class | Description | Best for |
|---|---|---|---|
| Ephemeral | chromadb.Client() | In-memory, non-persistent | Testing and development |
| Persistent | chromadb.PersistentClient() | SQLite + local filesystem | Single-process applications, prototyping |
| HTTP | chromadb.HttpClient() | Connects to a remote Chroma server | Production multi-client access |
| Rust | chromadb.RustClient() | Rust-based implementation via PyO3 bindings | Performance-critical workloads |
The HTTP client is recommended for production deployments because it enables scalability, high availability, and multi-client access. The Rust client, introduced in 2025 as part of Chroma's core rewrite, provides performance improvements by bypassing Python's Global Interpreter Lock [15].
Chroma provides official client libraries for Python and JavaScript/TypeScript, the two languages most commonly used in AI development, along with several community-supported clients.
| Client | Installation | Description |
|---|---|---|
| Python | pip install chromadb | Full-featured, can run embedded or as client |
| JavaScript/TypeScript | npm install chromadb | Client for the Chroma server API |
| Ruby | gem install chromadb | Community-supported client |
| PHP | Composer package | Community-supported client |
| Java | Maven package | Community-supported client |
The Python client is by far the most popular and supports both embedded mode (running the database in the same process as the application) and client mode (connecting to a separate Chroma server over HTTP). The JavaScript client connects to a running Chroma server and is commonly used in Node.js and browser-based applications [7].
Chroma can automatically generate embeddings for documents at insert time using a configured embedding function. By default, it uses the all-MiniLM-L6-v2 model from Sentence Transformers, which runs locally on the user's machine. Users can plug in alternative embedding providers without changing application logic [7].
Chroma supports the following embedding function providers:
| Provider | Python | TypeScript | Key notes |
|---|---|---|---|
| Sentence Transformers | Yes | Yes | Default provider; runs locally; uses all-MiniLM-L6-v2 |
| OpenAI | Yes | Yes | Uses OpenAI's embedding API |
| Cohere | Yes | Yes | Includes multilingual embedding models |
| Google Generative AI | Yes | Yes | Uses Google's embedding models |
| Hugging Face | Yes | No | Uses Hugging Face's hosted inference API |
| Hugging Face Embedding Server | Yes | Yes | Connects to a self-hosted Hugging Face embedding server |
| Jina AI | Yes | Yes | Jina AI's embedding models |
| Mistral | Yes | Yes | Mistral AI's embedding models |
| Together AI | Yes | Yes | Uses Together AI's hosted models |
| Cloudflare Workers AI | Yes | Yes | Runs on Cloudflare's edge infrastructure |
| Morph | Yes | Yes | Morph AI's embedding service |
Developers can also create custom embedding functions by implementing the EmbeddingFunction interface. For TypeScript, individual npm packages are available per provider (e.g., @chroma-core/openai), plus an @chroma-core/all package that installs all providers [16].
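As an illustration of the interface shape, the sketch below implements a toy embedding function as a plain callable that maps a list of documents to fixed-length unit vectors. In a real application the class would subclass chromadb's EmbeddingFunction and call an actual model; the hash-based scheme here is purely a stand-in to keep the example self-contained.

```python
import hashlib
import math

class ToyEmbeddingFunction:
    """Illustrative stand-in for a Chroma embedding function.

    Real implementations subclass chromadb's EmbeddingFunction and call
    a model; this one derives a deterministic unit vector from a hash of
    each document, just to show the call shape: a list of documents in,
    a list of equal-length float vectors out.
    """

    def __init__(self, dim: int = 8):
        self.dim = dim

    def __call__(self, input: list[str]) -> list[list[float]]:
        embeddings = []
        for doc in input:
            digest = hashlib.sha256(doc.encode("utf-8")).digest()
            vec = [digest[i % len(digest)] / 255.0 for i in range(self.dim)]
            norm = math.sqrt(sum(x * x for x in vec)) or 1.0
            embeddings.append([x / norm for x in vec])
        return embeddings


ef = ToyEmbeddingFunction(dim=4)
vectors = ef(["hello", "world"])
```

An instance like this would be passed as the embedding_function argument when creating a collection, after which add and query calls embed text automatically.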
Chroma organizes data into collections, which are analogous to tables in a relational database. Each collection has its own embedding function and distance metric. Collections can be created, listed, and deleted through the API. This simple organizational model avoids the complexity of indexes, namespaces, and shards found in more enterprise-oriented vector databases.
Collection operations include:
```python
# Create or get a collection
collection = client.get_or_create_collection(
    name="my_docs",
    embedding_function=openai_ef,
    metadata={"hnsw:space": "cosine"}
)

# Add documents
collection.add(
    documents=["doc text 1", "doc text 2"],
    metadatas=[{"source": "wiki"}, {"source": "blog"}],
    ids=["id1", "id2"]
)

# Query
results = collection.query(
    query_texts=["search query"],
    n_results=5
)

# Update documents
collection.update(
    ids=["id1"],
    documents=["updated doc text"]
)

# Delete by ID or metadata filter
collection.delete(ids=["id2"])

# Count documents
count = collection.count()

# List all collections
collections = client.list_collections()
```
Collections do not require a predefined schema, allowing developers to start storing data immediately. Metadata keys and types can vary across documents within the same collection.
Every document stored in Chroma can carry arbitrary key-value metadata. At query time, users can apply filters on this metadata alongside the similarity search. For example, a query might search for the most semantically similar code snippets but restrict results to a particular programming language or repository [7].
```python
results = collection.query(
    query_texts=["How do I sort a list?"],
    n_results=5,
    where={"language": "python"}
)
```
Chroma supports $eq, $ne, $gt, $gte, $lt, $lte, $in, and $nin operators for metadata filtering, along with $and and $or for combining conditions.
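To make the operator semantics concrete, here is a small pure-Python evaluator that mimics how a where filter selects metadata records. This is an illustration of the filter language's behavior, not Chroma's actual implementation.

```python
def matches(metadata: dict, where: dict) -> bool:
    """Evaluate a Chroma-style where filter against one metadata dict.

    Supports $and/$or combinators and the comparison operators;
    a bare {key: value} clause is shorthand for {key: {"$eq": value}}.
    """
    ops = {
        "$eq": lambda a, b: a == b,
        "$ne": lambda a, b: a != b,
        "$gt": lambda a, b: a > b,
        "$gte": lambda a, b: a >= b,
        "$lt": lambda a, b: a < b,
        "$lte": lambda a, b: a <= b,
        "$in": lambda a, b: a in b,
        "$nin": lambda a, b: a not in b,
    }
    for key, cond in where.items():
        if key == "$and":
            if not all(matches(metadata, clause) for clause in cond):
                return False
        elif key == "$or":
            if not any(matches(metadata, clause) for clause in cond):
                return False
        elif isinstance(cond, dict):
            # One operator per field clause, e.g. {"year": {"$gte": 2024}}
            op, operand = next(iter(cond.items()))
            if key not in metadata or not ops[op](metadata[key], operand):
                return False
        else:
            # Shorthand equality: {"language": "python"}
            if metadata.get(key) != cond:
                return False
    return True


records = [
    {"language": "python", "year": 2024},
    {"language": "rust", "year": 2025},
]
flt = {"$and": [{"language": "python"}, {"year": {"$gte": 2024}}]}
hits = [r for r in records if matches(r, flt)]
```

Combined with the similarity search, a filter like this narrows the candidate set before the nearest neighbors are returned.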
Additionally, Chroma supports document-level filtering with the where_document parameter, which allows filtering by the content of the document itself using $contains and $not_contains operators:
```python
results = collection.query(
    query_texts=["sorting algorithms"],
    n_results=5,
    where={"language": "python"},
    where_document={"$contains": "quicksort"}
)
```
Chroma supports three distance metrics for similarity search, configured per collection at creation time:
| Metric | Configuration value | Formula | Range | Use case |
|---|---|---|---|---|
| Squared L2 (Euclidean) | l2 | Sum of squared differences | [0, infinity) | Default; general-purpose distance |
| Cosine distance | cosine | 1 - cosine similarity | [0, 2] | Normalized embeddings; most text embedding models |
| Inner product | ip | Negative dot product | (-infinity, infinity) | When vector magnitude carries meaning |
The distance metric is set via collection metadata at creation time:
```python
collection = client.create_collection(
    name="my_collection",
    metadata={"hnsw:space": "cosine"}  # or "l2" or "ip"
)
```
For most text embedding models (OpenAI, Cohere, Sentence Transformers), cosine distance is the recommended choice because these models produce normalized embeddings where the angle between vectors is more meaningful than the raw distance [7].
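The three metrics can be computed directly from their definitions; the short sketch below reproduces the formulas from the table on a pair of toy vectors.

```python
import math

def squared_l2(a, b):
    """Squared Euclidean distance: sum of squared differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cosine_distance(a, b):
    """1 minus cosine similarity: 0 for identical directions, 2 for opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def inner_product_distance(a, b):
    """Negative dot product, as used by the "ip" space."""
    return -sum(x * y for x, y in zip(a, b))

# Two orthogonal unit vectors: maximally dissimilar under cosine
# among non-negative embeddings.
a, b = [1.0, 0.0], [0.0, 1.0]
```

Note that for unit-length (normalized) vectors, all three metrics produce the same ranking, which is why the choice matters most when embeddings are not normalized.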
Chroma supports multi-modal embeddings through integration with models like OpenCLIP, which embed both text and images into a shared semantic space. This enables applications to perform unified searches across different data types within the same collection.
For example, a collection using an OpenCLIP embedding function can accept both text documents and image URIs. A text query would return relevant results regardless of whether they are text documents or images, because both are represented in the same vector space. This capability is useful for applications like visual search, content moderation, and multi-modal RAG systems where context may include both textual and visual information [16].
Chroma's storage architecture consists of two main components:

- Metadata store: a SQLite database, kept as a chroma.sqlite3 file in the configured data directory.
- Vector index: a file-based HNSW index, kept in one subdirectory per collection.

The directory structure for persistent storage is organized as:

```
/path/to/data/
    chroma.sqlite3          # Metadata store
    <collection-uuid>/      # One directory per collection
        data_level0.bin     # HNSW graph data
        header.bin          # Index header
        length.bin          # Vector lengths
```
For Chroma Cloud, data is stored in object storage with automatic tiering and caching. The cloud storage system implements a multi-tier caching architecture with admission control to minimize object storage API costs and query latency [15].
Chroma exposes several HNSW parameters through collection metadata:
| Parameter | Default | Description |
|---|---|---|
| hnsw:space | l2 | Distance function (l2, cosine, ip) |
| hnsw:construction_ef | 100 | Size of candidate list during index construction |
| hnsw:search_ef | 10 | Size of candidate list during search |
| hnsw:M | 16 | Maximum number of connections per node |
| hnsw:num_threads | 4 | Number of threads for index operations |
| hnsw:resize_factor | 1.2 | Factor by which the index grows when capacity is exceeded |
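Tuning these parameters uses the same collection metadata mechanism as the distance function. The sketch below just assembles such a metadata dict; the specific values are illustrative, not recommendations, and the right settings depend on dataset size and the recall/latency tradeoff desired.

```python
# Illustrative HNSW tuning values, supplied as collection metadata.
hnsw_metadata = {
    "hnsw:space": "cosine",       # distance function
    "hnsw:construction_ef": 200,  # larger -> better index quality, slower builds
    "hnsw:search_ef": 50,         # larger -> better recall, slower queries
    "hnsw:M": 32,                 # more links per node -> better recall, more memory
}

# Passed at creation time, e.g.:
# collection = client.create_collection(name="tuned", metadata=hnsw_metadata)
```

Raising hnsw:search_ef is typically the first lever to try when recall is too low, since it affects only queries and requires no rebuild.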
Chroma has become the go-to vector database for developers building their first RAG applications. Several factors contribute to this.
First, the setup friction is minimal. A developer can go from zero to a working RAG prototype in under 20 lines of Python. There is no server to start, no API key to obtain, and no cloud account to create.
Second, Chroma integrates with every major AI framework.
| Framework | Integration type |
|---|---|
| LangChain | Native vector store (Chroma class) |
| LlamaIndex | Vector store connector |
| Semantic Kernel | Memory store provider |
| Haystack | Document store |
| CrewAI | Knowledge base |
Third, the in-memory mode means there is no operational overhead during development. Developers can iterate on their embedding strategy, chunking approach, and retrieval parameters without waiting for data to be uploaded to a remote service.
The tradeoff is that Chroma's simple architecture can become a limitation at production scale. For applications handling hundreds of millions of vectors, distributed databases like Milvus, Weaviate, or Pinecone are better suited. Many teams follow a path of prototyping with Chroma, then migrating to a more scalable solution for production [8].
Chroma supports three primary deployment patterns, each suited to different scale and operational requirements:
| Pattern | Description | Scale | Recommended for |
|---|---|---|---|
| Local / Embedded | Library runs in the application process | Small; under 1M records | Prototyping, testing, single-user applications |
| Single-node server | Standalone Chroma server accessed via HTTP | Medium; up to ~10M records | Small production workloads, multi-client access |
| Distributed | Multi-service deployment on Kubernetes | Large; millions of collections | Large-scale production with high availability |
All three patterns use the same core codebase and expose the same API, so code written for the embedded pattern works without modification against a remote server or distributed deployment. This is a direct result of Chroma's progressive complexity philosophy [15].
For Docker deployments, Chroma provides an official Docker image:
```shell
docker run -p 8000:8000 chromadb/chroma
```
For Kubernetes deployments, the community maintains Helm charts, and Chroma's distributed mode splits the system into multiple services that can be scaled independently.
Chroma Cloud is the company's hosted service, launched to bridge the gap between local prototyping and production deployment. It provides serverless vector, hybrid, and full-text search as a managed service.
| Feature | Description |
|---|---|
| Setup time | Under 30 seconds to create a database |
| Free credits | $5 of free credits for new users |
| Storage | Object storage with automatic data tiering and caching |
| Replication | Multi-region replication |
| Management | Zero-ops, no infrastructure to manage |
| Encryption | Customer-managed encryption keys (January 2026) |
Chroma Cloud allows developers to use the same API they used during local development, just by changing the client connection string. This continuity from local prototype to cloud deployment is a direct result of Chroma's progressive complexity philosophy [1].
Chroma's performance profile reflects its design priorities. For small to medium datasets (under 10 million vectors), Chroma delivers competitive query latency, especially when running in embedded mode where there is no network overhead.
The 2025 Rust-core rewrite was a significant performance milestone. By reimplementing the core data path in Rust and exposing it to Python through PyO3 bindings, Chroma eliminated Python's Global Interpreter Lock (GIL) bottleneck. This enabled true multi-threaded execution, delivering performance improvements of up to 4x for both writes and queries [15].
In August 2025, an optimization using base64 encoding for vector data in transit achieved a 70% increase in data throughput, further improving performance for bulk data operations [9].
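The intuition behind the base64 change can be seen with a quick size comparison: packing a vector's float32 values as raw bytes and base64-encoding the result is far more compact than a JSON array of decimal floats. This is a general illustration of the technique, not a reproduction of Chroma's wire format.

```python
import base64
import json
import random
import struct

random.seed(0)
vector = [random.uniform(-1, 1) for _ in range(384)]  # all-MiniLM-L6-v2 dimensionality

# JSON array of decimal floats: human-readable but verbose
# (~20 bytes per value at full precision).
json_bytes = json.dumps(vector).encode("utf-8")

# Raw little-endian float32 bytes, base64-encoded for safe
# transport inside JSON/HTTP payloads.
raw = struct.pack(f"<{len(vector)}f", *vector)
b64_bytes = base64.b64encode(raw)

# 384 floats * 4 bytes = 1536 raw bytes -> 2048 base64 bytes,
# versus several kilobytes for the JSON representation.
```

The tradeoff is a loss of precision from float64 to float32, which is negligible for embedding vectors, in exchange for a payload several times smaller.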
For latency-sensitive applications, the embedded persistent client offers the lowest latency since it avoids network round trips entirely. The HTTP client adds network latency but enables multi-client access and horizontal scaling.
Chroma has added several features that take it beyond its origins as a lightweight embedding store.
| Feature | Date | Description |
|---|---|---|
| Sparse vector search | November 2025 | First-class support for BM25 and SPLADE vectors |
| Chroma Sync | October 2025 | Automatically chunk, embed, and index GitHub repositories |
| Chroma Web Sync | December 2025 | Automatically crawl, scrape, chunk, and embed web pages |
| Customer-managed encryption | January 2026 | Encrypt data with user-provided keys |
| 70% data throughput increase | August 2025 | Performance improvement via base64 vector encoding |
| Rust core rewrite | 2025 | Up to 4x performance improvement for writes and queries |
The addition of sparse vector search in November 2025 was notable because it moved Chroma closer to feature parity with Weaviate and Pinecone on hybrid search. With sparse vectors, Chroma can now combine semantic similarity with keyword-level matching, a capability that was previously only available in more full-featured vector databases [9].
Chroma Sync and Web Sync represent a different strategic direction: reducing the data pipeline work required before vectors can be stored. Instead of requiring developers to write their own crawling, chunking, and embedding code, Chroma handles the entire pipeline from source (GitHub repo or web URL) to indexed vectors [9].
Chroma occupies a distinct position in the vector database market, optimized for developer experience and rapid prototyping rather than enterprise scale.
| Feature | Chroma | Pinecone | Weaviate | Qdrant |
|---|---|---|---|---|
| License | Apache 2.0 | Proprietary | BSD-3-Clause | Apache 2.0 |
| Primary language | Python/Rust | Managed service | Go | Rust |
| Setup complexity | Minimal (pip install) | Low (cloud API) | Moderate (Docker/K8s) | Moderate (Docker/K8s) |
| In-memory mode | Yes | No | No | No |
| Hybrid search | Yes (since Nov 2025) | Yes (sparse-dense) | Yes (BM25 + vector) | Yes |
| Managed cloud | Chroma Cloud | Pinecone (only option) | Weaviate Cloud | Qdrant Cloud |
| Best for | Prototyping, small-medium scale | Enterprise managed service | Flexible open-source at scale | High-performance open-source |
| GitHub stars | 25,000+ | N/A (closed source) | 14,000+ | 22,000+ |
| Multi-modal support | Yes (via OpenCLIP) | No | Yes (via CLIP/ImageBind modules) | Yes (multi-vector) |
The main advantage Chroma holds over its competitors is the speed of getting started. No other vector database matches the simplicity of pip install chromadb followed by three lines of Python to have a working vector store. The main limitation is scalability: Chroma's single-process architecture means it cannot distribute data across multiple nodes the way Milvus, Weaviate, or Qdrant can, though the distributed deployment pattern introduced for larger workloads is beginning to address this gap [8].
For developers and teams choosing between these options, the decision often comes down to where they are in the development lifecycle. Chroma is the best choice for exploration, prototyping, and applications with modest data volumes. As the dataset grows or production requirements become more demanding, teams typically evaluate Pinecone (for managed simplicity), Weaviate (for open-source flexibility with hybrid search), or Qdrant (for raw performance).
Jeff Huber has been vocal about the concept of "context engineering" as distinct from prompt engineering. In his view, the quality of large language model outputs depends less on how prompts are worded and more on what context (retrieved documents, user history, structured data) is provided to the model. Chroma's development roadmap reflects this perspective: features like Sync and Web Sync are designed to make it easier to build and maintain the context that gets fed to language models, rather than just storing and retrieving raw vectors [10].
This framing positions Chroma not just as a vector database but as context infrastructure for AI applications, a layer that sits between raw data sources and language models, responsible for ensuring the model has access to the right information at inference time.
By early 2026, Chroma has expanded well beyond its origins as a lightweight embedding store. The addition of sparse vector search, cloud hosting, data sync features, and customer-managed encryption shows a trajectory toward becoming a more complete data platform for AI applications.
The project maintains its developer-first identity. The chromadb PyPI package continues to see over 5 million monthly downloads, and Chroma remains the default vector database in many AI tutorials, courses, and quickstart guides. Its presence in the LangChain and LlamaIndex ecosystems means new AI developers often encounter Chroma before any other vector database [1].
The competitive challenge for Chroma is bridging the gap between prototyping tool and production infrastructure. Chroma Cloud is the company's answer to this challenge, but it faces competition from well-funded managed services (Pinecone) and mature open-source projects (Weaviate, Qdrant, Milvus) that already have production-grade distributed architectures. Whether Chroma can retain its users as they move from prototype to production will likely determine the company's long-term trajectory.