Haystack is an open-source AI orchestration framework developed by deepset, a Berlin-based company, for building production-ready natural language processing (NLP), retrieval-augmented generation (RAG), and AI applications in Python. Licensed under Apache 2.0, Haystack lets developers compose modular pipelines from interchangeable components such as retrievers, generators, embedders, and document stores, connecting to model providers like OpenAI, Anthropic, Cohere, Hugging Face, and local inference engines. The framework has accumulated over 24,000 stars on GitHub and powers applications at organizations including Airbus, Siemens, and The Economist.
Milos Rusic, Malte Pietsch, and Timo Möller co-founded deepset in Berlin, Germany in June 2018. Rusic and Pietsch had met while studying at the Technical University of Munich, and Pietsch and Möller both came from Plista, an adtech startup where they worked on AI-powered ad creation. The three founders were inspired by the Transformer architecture that Google had introduced in 2017, and they bootstrapped the company by training custom NLP models for enterprise clients, tailoring BERT language models to domain-specific tasks.
In July 2019, deepset released FARM (Fast & easy transfer learning for NLP), an open-source library that simplified transfer learning with Transformer models. FARM provided tools for domain adaptation and fine-tuning with features like gradient accumulation, cross-validation, and mixed-precision training. Four months later, in November 2019, deepset released the first version of Haystack, designed as a higher-level framework that combined document retrieval with question answering capabilities. FARM's core modeling features were eventually integrated into Haystack, and FARM's standalone development was discontinued in November 2021.
deepset has raised approximately $45.6 million across three funding rounds:
| Round | Date | Amount | Lead Investor | Other Investors |
|---|---|---|---|---|
| Pre-Seed | March 2021 | $1.6 million | System.One | Lunar Ventures |
| Series A | April 2022 | $14 million | GV (Google Ventures) | System.One, Lunar Ventures, Harpoon Ventures |
| Series B | August 2023 | $30 million | Balderton Capital | GV, System.One, Lunar Ventures, Harpoon Ventures |
The Series B round was announced on August 9, 2023, and came amid growing enterprise demand for large language model (LLM) tooling.
The original Haystack architecture (versions 1.x) was built around a pipeline-based NLP system where data flowed through a sequence of nodes. Each node performed a specific task: preprocessing documents, retrieving relevant passages, reading text to extract answers, or summarizing content. Pipelines in version 1.x were implemented as directed acyclic graphs (DAGs), meaning data moved in one direction from start to finish without looping back.
The three main building blocks in Haystack 1.x were:
Document Stores stored documents and their vector representations. Supported backends included Elasticsearch, OpenSearch, FAISS, and SQL databases. Document Stores could be used as the last node in an indexing pipeline or passed directly to a Retriever.
Retrievers filtered large document collections down to a manageable set of candidates relevant to a given query. Different Retriever implementations handled sparse retrieval (BM25), dense retrieval (using embeddings), and hybrid approaches.
Readers performed extractive question answering by scanning the documents returned by a Retriever and identifying the exact span of text that answered the question. Readers typically used fine-tuned Transformer models.
This Retriever-Reader pattern became Haystack's signature approach to question answering and remained the dominant paradigm throughout the 1.x release line. Versions 1.x were distributed through the farm-haystack Python package. Haystack 1.x reached end-of-life on March 11, 2025, and no longer receives updates.
As the AI field shifted from extractive QA toward generative approaches powered by LLMs, the Haystack team concluded that incremental changes to the 1.x architecture would not be enough. The original node-based system made assumptions about data flow that did not accommodate new patterns like prompt engineering, multi-step generation, or agentic loops. deepset announced the Haystack 2.0 redesign in mid-2023, released a beta in December 2023, and shipped the stable 2.0.0 release on March 11, 2024, distributed through the new haystack-ai package on PyPI.
Haystack 2.0 is built on two foundational abstractions: the Component protocol and the Pipeline object.
The Component protocol defines a standard API that any Python class must follow to participate in a pipeline. Every component implements a run() method that accepts typed inputs and returns typed outputs. Components explicitly declare the names and types of their inputs and outputs, which allows the pipeline to validate connections before execution and generate clear error messages when something is misconfigured.
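The contract described above can be illustrated with a stdlib-only sketch. The class name, the `output_types` attribute, and `validate_connection` are all made up for illustration; Haystack's real protocol uses the `@component` decorator, but the idea is the same: declared input/output types let a pipeline reject bad connections before anything runs.

```python
# Stdlib-only sketch of the idea behind Haystack's Component protocol:
# a component declares typed inputs and outputs so a pipeline can validate
# connections before execution. Names here are illustrative, not the real API.
from typing import get_type_hints

class Upcaser:
    # Declared output names and types, analogous to declared output types
    # on a real Haystack component.
    output_types = {"text": str}

    def run(self, text: str) -> dict:
        return {"text": text.upper()}

def validate_connection(sender, out_name, receiver, in_name):
    """Check the sender's declared output type against the receiver's input annotation."""
    out_type = sender.output_types[out_name]
    in_type = get_type_hints(receiver.run)[in_name]
    if out_type is not in_type:
        raise TypeError(f"{out_name}: {out_type} does not match {in_name}: {in_type}")

a, b = Upcaser(), Upcaser()
validate_connection(a, "text", b, "text")   # passes: str -> str
print(b.run(**a.run(text="haystack")))      # {'text': 'HAYSTACK'}
```

A mismatched connection (say, a component expecting `list` receiving `str`) would raise before execution, which is what produces the clear error messages described above.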
The Pipeline object is a directed multigraph that connects components together and handles execution. Unlike the DAG-only approach of 1.x, Haystack 2.0 pipelines support cycles. Combined with router components and conditional logic, this means pipelines can branch, merge, and loop, enabling patterns like iterative refinement, self-correction, and autonomous agent behavior. The Pipeline also handles serialization (to and from YAML), connection validation, and runtime introspection.
The 2.0 redesign was guided by four principles.
Haystack 2.0 organizes its components into several categories.
A DocumentStore provides persistent storage for Document objects along with their metadata and vector representations. All DocumentStore implementations must expose four methods: count_documents(), filter_documents(), write_documents(), and delete_documents(). Haystack ships with an InMemoryDocumentStore for prototyping and testing. Production deployments typically use one of the many third-party DocumentStore integrations:
| DocumentStore | Type | Maintained By |
|---|---|---|
| Elasticsearch | Search engine | deepset |
| OpenSearch | Search engine | deepset |
| Pinecone | Vector database | deepset |
| Weaviate | Vector database | deepset |
| Qdrant | Vector database | Qdrant |
| Chroma | Vector database | deepset |
| MongoDB Atlas | Document database | MongoDB |
| pgvector | PostgreSQL extension | deepset |
| AstraDB | Cloud database | DataStax |
| Neo4j | Graph database | Neo4j |
| Marqo | Tensor search | Community |
| Milvus | Vector database | Community |
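The four-method DocumentStore contract can be sketched with a toy in-memory store. This is a stand-in for illustration only, not Haystack's `InMemoryDocumentStore`; the `Document` shape and the exact-match filter logic are simplified assumptions.

```python
# Minimal in-memory sketch of the four-method DocumentStore contract:
# count_documents, filter_documents, write_documents, delete_documents.
from dataclasses import dataclass, field

@dataclass
class Document:
    id: str
    content: str
    meta: dict = field(default_factory=dict)

class ToyDocumentStore:
    def __init__(self):
        self._docs = {}

    def count_documents(self) -> int:
        return len(self._docs)

    def write_documents(self, documents) -> int:
        for doc in documents:
            self._docs[doc.id] = doc
        return len(documents)

    def filter_documents(self, filters=None):
        docs = list(self._docs.values())
        if not filters:
            return docs
        # Toy filter: exact match on every metadata key.
        return [d for d in docs if all(d.meta.get(k) == v for k, v in filters.items())]

    def delete_documents(self, document_ids) -> None:
        for doc_id in document_ids:
            self._docs.pop(doc_id, None)

store = ToyDocumentStore()
store.write_documents([Document("1", "RAG intro", {"lang": "en"}),
                       Document("2", "Einführung", {"lang": "de"})])
print(store.count_documents())                                  # 2
print([d.id for d in store.filter_documents({"lang": "en"})])   # ['1']
```

Because every backend exposes this same surface, the components downstream of a store never need to know which database is behind it.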
Retrievers query a DocumentStore and return the most relevant documents for a given input. In Haystack 2.0, each Retriever is specialized for its corresponding DocumentStore rather than implementing a generic interface. This means an ElasticsearchBM25Retriever handles all the specifics of querying Elasticsearch, while a QdrantEmbeddingRetriever handles Qdrant's particular API. This design gives each Retriever full access to the advanced features of its backend without being constrained by a lowest-common-denominator interface.
Haystack 2.0 also introduced multi-query retrieval components: MultiQueryTextRetriever runs multiple text queries in parallel, and MultiQueryEmbeddingRetriever does the same with embedding-based search. These components pair with a QueryExpander that generates semantically similar variations of the original query, improving recall for short or ambiguous inputs.
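The multi-query pattern can be sketched in a few lines. Scoring here is plain term overlap rather than real BM25, and both function names are illustrative, not Haystack components; the point is only how running several query variations and merging their results improves recall.

```python
# Toy keyword retriever plus a multi-query union, sketching the pattern
# behind QueryExpander + MultiQueryTextRetriever.
def retrieve(docs, query, top_k=3):
    terms = set(query.lower().split())
    scored = [(len(terms & set(d.lower().split())), d) for d in docs]
    return [d for s, d in sorted(scored, key=lambda x: -x[0]) if s > 0][:top_k]

def multi_query_retrieve(docs, queries, top_k=3):
    # Union the per-query results, preserving first-seen order.
    seen, merged = set(), []
    for q in queries:
        for d in retrieve(docs, q, top_k):
            if d not in seen:
                seen.add(d)
                merged.append(d)
    return merged

DOCS = ["haystack builds rag pipelines",
        "pipelines connect components",
        "retrievers query document stores"]
# Pretend a query expander turned one user question into two variations:
results = multi_query_retrieve(DOCS, ["rag pipelines", "query document stores"])
print(results)
```

The second query variation surfaces a document the first one misses entirely, which is exactly the recall gain the expansion step is after.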
Generators produce text given a prompt. They are the primary interface to LLMs within Haystack. The framework distinguishes between two types:
- Generators (e.g., OpenAIGenerator, HuggingFaceLocalGenerator) accept a plain text prompt and return generated text.
- Chat Generators (e.g., OpenAIChatGenerator, AnthropicChatGenerator) work with a list of chat messages and support features like tool calling and system prompts.

Generators exist for every major model provider, and switching from one to another requires changing only the Generator component in the pipeline while leaving the rest intact.
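Why swapping is cheap can be shown with two toy "providers" that share one output shape. These classes are illustrative stand-ins, not real model clients, and the `{"replies": [...]}` dictionary here only mirrors the spirit of Haystack's Generator output rather than reproducing it exactly.

```python
# Sketch of Generator interchangeability: both toy providers expose the same
# run(prompt) -> {"replies": [...]} shape, so the surrounding pipeline code
# is unchanged when one replaces the other.
class ToyEchoGenerator:
    def run(self, prompt: str) -> dict:
        return {"replies": [f"echo: {prompt}"]}

class ToyShoutGenerator:
    def run(self, prompt: str) -> dict:
        return {"replies": [prompt.upper()]}

def answer(generator, prompt):
    # Pipeline code depends only on the shared output shape, not the provider.
    return generator.run(prompt)["replies"][0]

print(answer(ToyEchoGenerator(), "hello"))   # echo: hello
print(answer(ToyShoutGenerator(), "hello"))  # HELLO
```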
Embedders encode data (text, images, or other modalities) into vector representations. Haystack 2.0 separated embedding into its own component category, decoupled from Retrievers. There are two main types:
- Document Embedders (e.g., SentenceTransformersDocumentEmbedder) compute embeddings for Document objects during indexing.
- Text Embedders (e.g., OpenAITextEmbedder) compute embeddings for query strings at query time.

This separation means the same Retriever can be used with different embedding models simply by swapping the Embedder component.
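The split, and why both halves must share one model, can be sketched with a toy bag-of-words "embedding". Everything here is an illustrative stand-in: the shared vocabulary plays the role of the shared embedding model, and the class names are made up.

```python
# Toy illustration of the Document Embedder / Text Embedder split: both
# must use the same embedding function so query and document vectors
# live in the same space.
import math

VOCAB = ["bark", "cats", "dogs", "purr"]   # shared "model" vocabulary

def embed(text):
    # Shared by both embedder types, just as both must share one model.
    tokens = text.lower().split()
    vec = [float(tokens.count(w)) for w in VOCAB]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

class ToyDocumentEmbedder:
    def run(self, documents):              # indexing time: embed documents
        return {"documents": [(d, embed(d)) for d in documents]}

class ToyTextEmbedder:
    def run(self, text):                   # query time: embed the query string
        return {"embedding": embed(text)}

indexed = ToyDocumentEmbedder().run(["cats purr", "dogs bark"])["documents"]
query_vec = ToyTextEmbedder().run("purr cats")["embedding"]
best = max(indexed, key=lambda pair: sum(a * b for a, b in zip(pair[1], query_vec)))
print(best[0])  # cats purr
```

Swapping the embedding model means replacing `embed` on both sides at once; the retrieval logic itself never changes.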
Converters transform files of various formats into Haystack Document objects. Built-in converters handle plain text, HTML, PDF, DOCX, CSV, and other common formats. For example, HTMLToDocument parses HTML files, PyPDFToDocument extracts text from PDFs, and TextFileToDocument reads plain text files. These converters are typically the first step in an indexing pipeline.
Routers direct data flow within a pipeline based on conditions. The ConditionalRouter evaluates user-defined conditions and sends data to different branches accordingly. The LLMMessagesRouter uses pattern matching on LLM output to route messages. Routers enable conditional branching and, when combined with cycles in the pipeline graph, support iterative agent-like behavior.
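A router feeding a cycle is easiest to see in miniature. The functions below are illustrative stand-ins, not Haystack components: a scripted "generator" keeps refining its output, and a condition routes the result either back into the loop or out of it.

```python
# Toy conditional-routing loop: route "retry" output back to the generator
# until the condition is met or an iteration cap is reached, mirroring the
# ConditionalRouter-plus-cycle pattern described above.
def toy_generate(draft):
    return draft + "!"           # stand-in for an LLM refinement step

def route(text, min_len):
    return "done" if len(text) >= min_len else "retry"

def refinement_loop(seed, min_len=8, max_iters=10):
    text = seed
    for _ in range(max_iters):   # cycles are bounded in practice
        text = toy_generate(text)
        if route(text, min_len) == "done":
            break
    return text

print(refinement_loop("hi"))     # hi!!!!!!
```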
The PromptBuilder component uses the Jinja2 template engine to construct prompts dynamically. Given a template with variables, it fills in values at runtime and produces the final text sent to a Generator. Templates can contain conditional logic, loops, and filters, giving developers fine-grained control over prompt construction without writing custom code.
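The template-as-data idea can be sketched without Jinja2. Haystack's PromptBuilder uses Jinja2 templates; the stand-in below uses `str.format` and a hand-rolled loop to show the same flow, with the template text and variable names invented for the example.

```python
# Sketch of template-driven prompt construction: the template is data,
# and variables are filled in at runtime before the text reaches a Generator.
TEMPLATE = ("Answer the question using only the context below.\n"
            "Context:\n{context}\n"
            "Question: {question}\n"
            "Answer:")

def build_prompt(documents, question):
    # The loop a Jinja2 template would express as {% for doc in documents %}.
    context = "\n".join(f"- {doc}" for doc in documents)
    return TEMPLATE.format(context=context, question=question)

prompt = build_prompt(["Haystack is an orchestration framework.",
                       "Pipelines are graphs of components."],
                      "What is Haystack?")
print(prompt)
```

Real Jinja2 templates add conditionals and filters on top of this, but the runtime flow, template in, filled prompt out, is the same.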
The most common use case for Haystack is building RAG systems, which combine information retrieval with text generation to answer questions grounded in specific documents.
An indexing pipeline prepares documents for retrieval. A typical setup includes a Converter that turns source files into Document objects, a DocumentSplitter that chunks them, a Document Embedder that computes vector representations, and a DocumentWriter that stores the results in a DocumentStore.
A query pipeline answers user questions by retrieving relevant documents and generating a response. A basic RAG query pipeline connects three components: a Retriever that fetches relevant documents, a PromptBuilder that inserts them into a prompt template, and a Generator that produces the final answer.
More advanced configurations add a Ranker between the Retriever and PromptBuilder to re-score and filter retrieved documents, or include a DocumentJoiner to merge results from multiple Retrievers (for example, combining BM25 keyword search with semantic embedding search in a hybrid retrieval setup).
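The retrieve-then-build-then-generate flow can be traced end to end with stdlib stand-ins. Every function below is a toy substitute for the corresponding component (term-overlap scoring instead of a real retriever, a scripted function instead of an LLM), so only the wiring is meant to be taken literally.

```python
# End-to-end toy RAG query flow: retrieve -> build prompt -> generate,
# wired like the three-component pipeline described above.
DOCS = ["Haystack is maintained by deepset.",
        "Pipelines connect components into graphs.",
        "BM25 is a sparse retrieval method."]

def retrieve(query, top_k=2):
    terms = set(query.lower().split())
    return sorted(DOCS, key=lambda d: -len(terms & set(d.lower().strip(".").split())))[:top_k]

def build_prompt(documents, question):
    return "Context:\n" + "\n".join(documents) + f"\nQuestion: {question}"

def toy_generator(prompt):
    # Stand-in for an LLM call: just return the first context line.
    return {"replies": [prompt.splitlines()[1]]}

question = "Who maintains Haystack"   # kept punctuation-free for the toy tokenizer
rag_answer = toy_generator(build_prompt(retrieve(question), question))["replies"][0]
print(rag_answer)  # Haystack is maintained by deepset.
```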
Haystack added first-class support for AI agents through the Agent component. The Agent implements a loop-based system that uses a ChatGenerator and a set of tools to solve complex queries iteratively. At each step, the Agent analyzes the current state, decides whether to call a tool, processes the tool's output, and determines whether to continue or return a final answer.
Tools can be created in three ways:
- The Tool class, which wraps a callable with a name and description.
- The ComponentTool class, which wraps any Haystack component as a tool.
- The @tool decorator, which converts a plain Python function into a tool using its function name and docstring.

The ToolInvoker component parses tool calls from the LLM's output and executes them with the correct arguments. A Toolset groups multiple tools together for the Agent to use.
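The decorator-and-invoker flow can be sketched end to end. Everything here is an illustrative stand-in: the registry, the decorator, and a scripted function in place of a real ChatGenerator that first emits a tool call and then a final answer.

```python
# Toy agent loop with a decorator-registered tool: the "LLM" is scripted,
# and the tool-call parsing/execution step plays the role of a tool invoker.
TOOLS = {}

def tool(fn):
    # Register under the function's own name; description from its docstring.
    TOOLS[fn.__name__] = {"fn": fn, "description": fn.__doc__}
    return fn

@tool
def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b

def scripted_llm(history):
    # Stand-in for a chat model: call the tool once, then answer.
    if not any(m["role"] == "tool" for m in history):
        return {"tool_call": {"name": "add", "args": {"a": 2, "b": 3}}}
    return {"final": f"The sum is {history[-1]['content']}"}

def agent_loop(question, max_steps=5):
    history = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        reply = scripted_llm(history)
        if "final" in reply:
            return reply["final"]
        call = reply["tool_call"]                           # invoker's job:
        result = TOOLS[call["name"]]["fn"](**call["args"])  # parse + execute
        history.append({"role": "tool", "content": result})
    return "max steps reached"

print(agent_loop("What is 2 + 3?"))  # The sum is 5
```

The cap on `max_steps` matters: an agent loop is a cycle in the pipeline graph, and bounding it is what keeps a confused model from looping forever.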
Developers who need more control can build custom agent pipelines by connecting a ChatGenerator, ConditionalRouter, and ToolInvoker manually, which allows for arbitrary routing logic and multi-agent coordination.
Haystack's integration ecosystem includes over 100 packages connecting the framework to external services and tools. deepset maintains many of these integrations directly, while others come from partner companies and community contributors.
Haystack provides Generator and Embedder components for all major model providers:
| Provider | Generator Component | ChatGenerator Component |
|---|---|---|
| OpenAI | OpenAIGenerator | OpenAIChatGenerator |
| Anthropic | AnthropicGenerator | AnthropicChatGenerator |
| Google AI | GoogleAIGeminiGenerator | GoogleAIGeminiChatGenerator |
| Cohere | CohereGenerator | CohereChatGenerator |
| Mistral AI | MistralGenerator | MistralChatGenerator |
| Hugging Face | HuggingFaceLocalGenerator | HuggingFaceLocalChatGenerator |
| Ollama | OllamaGenerator | OllamaChatGenerator |
| Azure OpenAI | AzureOpenAIGenerator | AzureOpenAIChatGenerator |
| Amazon Bedrock | AmazonBedrockGenerator | AmazonBedrockChatGenerator |
Local model support through Hugging Face Transformers and Ollama means pipelines can run entirely on-premises without sending data to external APIs.
Beyond model providers, the integration ecosystem covers document stores, file converters and data ingestion tools, evaluation frameworks, and observability and tracing platforms.
Haystack pipelines can be serialized to YAML using the dumps() and dump() methods, and deserialized back to Python with loads() and load(). Serialization delegates to each component individually, so custom components that follow the protocol are automatically serializable. Secrets (such as API keys) are handled separately to avoid storing sensitive values in plain text. YAML-based pipeline definitions make it possible to version-control pipeline configurations, run experiments by tweaking parameters in a config file, and reproduce results without modifying source code.
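The per-component round-trip can be sketched with a toy class. Haystack serializes to YAML; `json` is used below only to keep the sketch stdlib-only, and the `to_dict`/`from_dict` shape is a simplified illustration. The key point mirrored here is that secret fields are deliberately left out of the serialized form.

```python
# Sketch of per-component serialization round-tripping, with the secret
# (api_key) excluded from the serialized config.
import json

class ToyComponent:
    def __init__(self, model, api_key=None):
        self.model = model
        self.api_key = api_key          # secret: never serialized

    def to_dict(self):
        return {"type": "ToyComponent", "init_parameters": {"model": self.model}}

    @classmethod
    def from_dict(cls, data):
        return cls(**data["init_parameters"])

original = ToyComponent(model="toy-v1", api_key="sk-secret")
config = json.dumps(original.to_dict())          # version-controllable text
restored = ToyComponent.from_dict(json.loads(config))
print(restored.model, restored.api_key)          # toy-v1 None
```

Because each component owns its own round-trip, a pipeline's serializer only has to stitch the per-component dictionaries together, which is why custom components that follow the protocol serialize for free.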
Hayhooks is a companion project that turns any Haystack pipeline into a REST API with a single command. It auto-generates OpenAPI (Swagger) documentation, supports streaming responses, and can produce OpenAI-compatible chat endpoints for integration with tools like Open WebUI. Hayhooks runs as a FastAPI application and can be containerized with Docker and deployed on Kubernetes for production workloads. Developers can add authentication, custom logging, and additional endpoints as needed.
Haystack supports structured logging and OpenTelemetry tracing out of the box. Integrations with Datadog, Langfuse, and other monitoring platforms allow teams to track pipeline performance, token usage, and error rates in production.
In April 2022, deepset launched deepset Cloud, its commercial managed platform. In 2025, the product was rebranded as the Haystack Enterprise Platform (also referred to as the deepset AI Platform). The platform provides a managed environment for building, deploying, and scaling Haystack-based applications without managing infrastructure directly.
The platform supports multiple deployment models: fully managed SaaS, Virtual Private Cloud (VPC), on-premises, and air-gapped environments. It includes pipeline templates, visual pipeline editing, built-in evaluation tools, and access to deepset's engineering team for support.
Pricing is based on platform licensing, agent or application runtime, and optional expert services. A free Studio tier is available for prototyping. Enterprise pricing is custom and available on request. The platform is also listed on the AWS Marketplace.
Haystack competes primarily with LangChain and LlamaIndex in the LLM application framework space. Each framework takes a different approach to the same general problem.
| Feature | Haystack | LangChain | LlamaIndex |
|---|---|---|---|
| Developer | deepset (Berlin) | LangChain, Inc. | LlamaIndex, Inc. |
| First release | November 2019 | October 2022 | November 2022 |
| License | Apache 2.0 | MIT | MIT |
| Primary language | Python | Python, JavaScript/TypeScript | Python, TypeScript |
| GitHub stars (approx.) | ~24,500 | ~70,000 | ~42,000 |
| Architecture | Directed multigraph pipelines with cycles | Chain/agent abstractions; LangGraph for graph-based workflows | Index-centric with query engines |
| Primary strength | Production-grade search and RAG | Rapid prototyping, large integration ecosystem | Complex data ingestion and indexing |
| Agent support | Built-in Agent component with tool calling | LangGraph for stateful agent workflows | Agent framework with tool integration |
| Pipeline serialization | YAML | JSON (LangGraph) | JSON |
| Commercial offering | Haystack Enterprise Platform (SaaS/on-prem) | LangSmith (observability), LangGraph Platform | LlamaCloud |
| Typical overhead per query | ~5.9 ms | ~10 ms | ~6 ms |
Haystack's main differentiator is its explicit, graph-based pipeline architecture and its focus on production deployments with built-in serialization, observability, and enterprise support. LangChain offers the largest ecosystem of integrations and third-party tools, making it popular for prototyping. LlamaIndex excels at connecting LLMs to diverse data sources through its extensive set of data connectors (over 150).
Several large organizations use Haystack in production, including Airbus, Siemens, and The Economist.
deepset reports that enterprise customers using the platform have achieved up to 5x ROI and 40% efficiency gains in document processing workflows.
Haystack is developed in the open on GitHub under the Apache 2.0 license. The project accepts contributions ranging from bug fixes and documentation improvements to new components and integrations. deepset maintains contributor guidelines and a public roadmap.
Community channels include a Discord server, a discussion forum on GitHub, and a newsletter. deepset also publishes tutorials (many as runnable Colab notebooks), a cookbook of ready-to-use recipes, and extensive documentation covering both getting-started guides and advanced topics.
The framework requires Python 3.10 or later (following the end-of-life of Python 3.9 in October 2025).
Haystack has maintained a rapid release cadence, with new minor versions shipping roughly every two to three weeks.
| Version | Date | Notable additions |
|---|---|---|
| 2.18.0 | September 2025 | Agent breakpoints and stepwise debugging |
| 2.19.0 | October 2025 | QueryExpander, MultiQueryTextRetriever, MultiQueryEmbeddingRetriever |
| 2.20.0 | November 2025 | Sparse embedding support via SentenceTransformersSparseTextEmbedder |
| 2.21.0 | December 2025 | Resume Agent from AgentSnapshot; new breakpoint controls |
| 2.22.0 | January 2026 | Performance and stability improvements |
| 2.23.0 | January 2026 | Additional pipeline validation features |
| 2.24.0 | February 2026 | Extended agent configuration options |
| 2.25.0 | February 2026 | SearchableToolset to reduce context usage; Jinja2 templates in Agents |
| 2.26.0 | March 2026 | LLMRanker for high-quality context; Jinja2 system prompts in Agents |
Key themes across recent releases include stronger agent capabilities (tool management, breakpoints, state snapshots), improved retrieval quality (multi-query and sparse embedding support), and more flexible prompt engineering within agent workflows.