Haystack: Building Production RAG Pipelines

What is Haystack?

Haystack is an open-source framework from deepset for building production-ready LLM applications, with a particular focus on RAG (Retrieval-Augmented Generation) and semantic search. Its modular, pipeline-based architecture is designed to make AI applications easier to build, customize, and scale.

Key features:

Pipeline-based: Composable, modular component architecture
Production-ready: Built for scale with async support
Flexible: Works with a wide range of LLMs, embedders, and vector stores through integrations
Batteries included: Pre-built components for common tasks

Haystack 2.0 Architecture

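Haystack 2.0 introduced a redesigned architecture in which independent components are wired together into explicit pipelines. A typical RAG setup first indexes documents (convert, split, embed, write) and then answers queries (retrieve, generate):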

# Haystack 2.0 Pipeline Architecture

┌─────────────────────────────────────────────────────┐
│                     Pipeline                         │
├─────────────────────────────────────────────────────┤
│                                                      │
│   ┌──────────┐    ┌──────────┐    ┌──────────┐     │
│   │Converter │ -> │ Splitter │ -> │ Embedder │     │
│   └──────────┘    └──────────┘    └──────────┘     │
│                                          │          │
│                                          ▼          │
│   ┌──────────┐    ┌──────────┐    ┌──────────┐     │
│   │Generator │ <- │ Retriever│ <- │  Writer  │     │
│   └──────────┘    └──────────┘    └──────────┘     │
│                                                      │
└─────────────────────────────────────────────────────┘
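The diagram can be read as a graph: each box is a component with named inputs and outputs, and each arrow passes one output to one input. Below is a minimal pure-Python sketch of that idea. It is not Haystack's implementation; the `TinyPipeline` class, its methods, and the two toy components are hypothetical names used only for illustration.

```python
from typing import Any, Callable, Dict, List, Tuple


class TinyPipeline:
    """Toy pipeline: each component is a function that returns a dict of outputs."""

    def __init__(self) -> None:
        self.components: Dict[str, Callable[..., Dict[str, Any]]] = {}
        # Each edge is (sender, output_name, receiver, input_name).
        self.edges: List[Tuple[str, str, str, str]] = []

    def add_component(self, name: str, fn: Callable[..., Dict[str, Any]]) -> None:
        if name in self.components:
            raise ValueError(f"duplicate component name: {name}")
        self.components[name] = fn

    def connect(self, sender: str, receiver: str) -> None:
        # "splitter.chunks" -> ("splitter", "chunks")
        s_name, s_out = sender.split(".")
        r_name, r_in = receiver.split(".")
        self.edges.append((s_name, s_out, r_name, r_in))

    def run(self, inputs: Dict[str, Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
        # Simplification: components run in insertion order, which this toy
        # version assumes is already a valid execution order.
        pending = {name: dict(inputs.get(name, {})) for name in self.components}
        results: Dict[str, Dict[str, Any]] = {}
        for name, fn in self.components.items():
            results[name] = fn(**pending[name])
            # Forward this component's outputs along its outgoing edges.
            for s_name, s_out, r_name, r_in in self.edges:
                if s_name == name:
                    pending[r_name][r_in] = results[name][s_out]
        return results


def splitter(text: str) -> Dict[str, Any]:
    return {"chunks": [s.strip() for s in text.split(".") if s.strip()]}


def counter(chunks: List[str]) -> Dict[str, Any]:
    return {"count": len(chunks)}


pipe = TinyPipeline()
pipe.add_component("splitter", splitter)
pipe.add_component("counter", counter)
pipe.connect("splitter.chunks", "counter.chunks")

out = pipe.run({"splitter": {"text": "One. Two. Three."}})
print(out["counter"]["count"])
```

The real framework adds type checking on connections, branching, and loops, but the mental model of named sockets joined by explicit connections carries over to the examples that follow.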

Getting Started

Installation

# Core installation
pip install haystack-ai

# Integrations ship as separate packages
pip install opensearch-haystack
pip install chroma-haystack

Basic RAG Pipeline

from haystack import Pipeline
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.components.generators import OpenAIGenerator
from haystack.components.builders import PromptBuilder
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack import Document

# Create document store
document_store = InMemoryDocumentStore()

# Add documents
documents = [
    Document(content="Python is a programming language created by Guido van Rossum."),
    Document(content="JavaScript is the language of the web."),
    Document(content="Rust is known for memory safety."),
]
document_store.write_documents(documents)

# Create pipeline
pipeline = Pipeline()

# Add components
pipeline.add_component("retriever", InMemoryBM25Retriever(document_store=document_store))
pipeline.add_component("prompt_builder", PromptBuilder(template="""
Given these documents:
{% for doc in documents %}
- {{ doc.content }}
{% endfor %}

Answer the question: {{ question }}
"""))
# OpenAIGenerator reads the API key from the OPENAI_API_KEY environment variable by default
pipeline.add_component("llm", OpenAIGenerator(model="gpt-4"))

# Connect components
pipeline.connect("retriever", "prompt_builder.documents")
pipeline.connect("prompt_builder", "llm")

# Run the pipeline
result = pipeline.run({
    "retriever": {"query": "Who created Python?"},
    "prompt_builder": {"question": "Who created Python?"}
})

print(result["llm"]["replies"][0])
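To make the retrieval step less of a black box, here is a rough sketch of how BM25 scores the three example documents against the query. It uses the classic Okapi BM25 formula with a simple regex tokenizer; Haystack's in-memory store may use a different BM25 variant and tokenization, so treat the numbers as illustrative only. The `tokenize` and `bm25_scores` helpers are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    # Lowercase and keep alphanumeric runs; real tokenizers are more careful.
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query: str, docs: List[str], k1: float = 1.5, b: float = 0.75) -> List[float]:
    if not docs:
        return []
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs

    # Document frequency: in how many documents does each term appear?
    df: Counter = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term frequency saturates and is normalized by document length.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "Python is a programming language created by Guido van Rossum.",
    "JavaScript is the language of the web.",
    "Rust is known for memory safety.",
]
scores = bm25_scores("Who created Python?", docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(docs[best])
```

Only the first document shares terms with the query ("created" and "python"), so it is the only one with a non-zero score. This also shows the main limitation of keyword retrieval: a query phrased with synonyms would match nothing.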

Core Components

Document Stores

from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack_integrations.document_stores.chroma import ChromaDocumentStore
from haystack_integrations.document_stores.opensearch import OpenSearchDocumentStore
from haystack_integrations.document_stores.pinecone import PineconeDocumentStore

# In-Memory (for development)
store = InMemoryDocumentStore()

# Chroma
store = ChromaDocumentStore(persist_path="./chroma_db")

# OpenSearch
store = OpenSearchDocumentStore(hosts=["http://localhost:9200"])

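# Pinecone (recent releases of the integration read the API key from the
# PINECONE_API_KEY environment variable by default; older releases also took
# an `environment` argument, so check the version you have installed)
store = PineconeDocumentStore(
    index="my-index",
    dimension=1536  # must match the output size of your embedding model
)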

Embedders

from haystack.components.embedders import (
    OpenAITextEmbedder,
    OpenAIDocumentEmbedder,
    SentenceTransformersTextEmbedder,
    SentenceTransformersDocumentEmbedder
)

# OpenAI embeddings
text_embedder = OpenAITextEmbedder(model="text-embedding-3-small")
doc_embedder = OpenAIDocumentEmbedder(model="text-embedding-3-small")

# Local embeddings with Sentence Transformers
text_embedder = SentenceTransformersTextEmbedder(
    model="sentence-transformers/all-MiniLM-L6-v2"
)
doc_embedder = SentenceTransformersDocumentEmbedder(
    model="sentence-transformers/all-MiniLM-L6-v2"
)
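Text and document embedders come in pairs because queries and documents have to land in the same vector space: retrieval works by comparing the query vector with each stored document vector. A minimal sketch of that comparison using cosine similarity follows. The three-dimensional vectors are made-up stand-ins for real embeddings, and `cosine_similarity` is a hypothetical helper, not part of Haystack.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    # Vectors from different models often differ in length, which is one
    # reason the query and document embedders must use the same model.
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up vectors standing in for embedder output.
query_vec = [1.0, 0.0, 1.0]
doc_vecs = {
    "doc_a": [2.0, 0.0, 2.0],  # same direction as the query
    "doc_b": [0.0, 1.0, 0.0],  # orthogonal to the query
}

ranked = sorted(
    doc_vecs,
    key=lambda name: cosine_similarity(query_vec, doc_vecs[name]),
    reverse=True,
)
print(ranked)
```

Cosine similarity ignores vector length, so `doc_a` ranks first even though its values are twice as large as the query's.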

Generators (LLMs)

from haystack.components.generators import OpenAIGenerator
from haystack_integrations.components.generators.anthropic import AnthropicGenerator
from haystack.components.generators import HuggingFaceLocalGenerator

# OpenAI
generator = OpenAIGenerator(model="gpt-4")

# Anthropic Claude
generator = AnthropicGenerator(model="claude-3-sonnet-20240229")

# Local model
generator = HuggingFaceLocalGenerator(
    model="mistralai/Mistral-7B-Instruct-v0.1"
)

Building an Indexing Pipeline

from haystack import Pipeline
from haystack.components.converters import PyPDFToDocument  # requires `pip install pypdf`
from haystack.components.preprocessors import DocumentCleaner, DocumentSplitter
from haystack.components.embedders import OpenAIDocumentEmbedder
from haystack.components.writers import DocumentWriter
from haystack_integrations.document_stores.chroma import ChromaDocumentStore

# Create document store
document_store = ChromaDocumentStore(persist_path="./chroma_db")

# Build indexing pipeline
indexing_pipeline = Pipeline()

# Add components
indexing_pipeline.add_component("converter", PyPDFToDocument())
indexing_pipeline.add_component("cleaner", DocumentCleaner())
indexing_pipeline.add_component("splitter", DocumentSplitter(
    split_by="sentence",
    split_length=5,
    split_overlap=1
))
indexing_pipeline.add_component("embedder", OpenAIDocumentEmbedder())
indexing_pipeline.add_component("writer", DocumentWriter(document_store=document_store))

# Connect components
indexing_pipeline.connect("converter", "cleaner")
indexing_pipeline.connect("cleaner", "splitter")
indexing_pipeline.connect("splitter", "embedder")
indexing_pipeline.connect("embedder", "writer")

# Run indexing
indexing_pipeline.run({"converter": {"sources": ["document.pdf"]}})
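The `split_length=5, split_overlap=1` setting above means each chunk holds up to five sentences and shares its first sentence with the end of the previous chunk. The sliding-window arithmetic can be sketched in plain Python. This only illustrates the windowing; `DocumentSplitter` handles sentence detection and edge cases itself, and `split_with_overlap` is a hypothetical helper.

```python
from typing import List


def split_with_overlap(units: List[str], split_length: int, split_overlap: int) -> List[List[str]]:
    if split_length <= 0:
        raise ValueError("split_length must be positive")
    if not 0 <= split_overlap < split_length:
        raise ValueError("split_overlap must be in [0, split_length)")

    # Each new window starts this many units after the previous one.
    step = split_length - split_overlap
    chunks: List[List[str]] = []
    for start in range(0, len(units), step):
        chunks.append(units[start:start + split_length])
        # Stop once a window has reached the end of the input.
        if start + split_length >= len(units):
            break
    return chunks


sentences = [f"S{i}." for i in range(1, 11)]  # ten placeholder sentences
chunks = split_with_overlap(sentences, split_length=5, split_overlap=1)
for chunk in chunks:
    print(" ".join(chunk))
```

With ten sentences the windows start at positions 0, 4, and 8, giving three chunks. A larger overlap preserves more context across chunk boundaries but increases the number of chunks to embed and store.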

Building a Query Pipeline

from haystack import Pipeline
from haystack.components.embedders import OpenAITextEmbedder
from haystack_integrations.components.retrievers.chroma import ChromaEmbeddingRetriever
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator

# Query pipeline
query_pipeline = Pipeline()

# Components
query_pipeline.add_component("text_embedder", OpenAITextEmbedder())
query_pipeline.add_component("retriever", ChromaEmbeddingRetriever(
    document_store=document_store,
    top_k=5
))
query_pipeline.add_component("prompt_builder", PromptBuilder(template="""
You are a helpful assistant. Answer the question based on the context below.

Context:
{% for doc in documents %}
{{ doc.content }}
---
{% endfor %}

Question: {{ question }}

Answer:"""))
query_pipeline.add_component("llm", OpenAIGenerator(model="gpt-4"))

# Connect
query_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")
query_pipeline.connect("retriever.documents", "prompt_builder.documents")
query_pipeline.connect("prompt_builder", "llm")

# Query
result = query_pipeline.run({
    "text_embedder": {"text": "What is the return policy?"},
    "prompt_builder": {"question": "What is the return policy?"}
})
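When debugging answers, it helps to see the text the LLM actually receives. The sketch below approximates what the template above produces once the retrieved documents are filled in. `PromptBuilder` renders the template with Jinja2, so whitespace may differ slightly; `render_prompt` is a hypothetical stand-in and the two context strings are placeholders, not real retrieval output.

```python
from typing import List


def render_prompt(contexts: List[str], question: str) -> str:
    # Mirror the template's loop: each document followed by a "---" separator.
    context_block = "".join(f"{text}\n---\n" for text in contexts)
    return (
        "You are a helpful assistant. Answer the question based on the context below.\n\n"
        f"Context:\n{context_block}\n"
        f"Question: {question}\n\n"
        "Answer:"
    )


# Placeholder chunks standing in for the retriever's top results.
retrieved = [
    "Placeholder chunk one about returns.",
    "Placeholder chunk two about shipping.",
]
prompt = render_prompt(retrieved, "What is the return policy?")
print(prompt)
```

Printing the rendered prompt makes it easy to check whether a wrong answer comes from poor retrieval (irrelevant context) or from the generation step.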

Advanced: Hybrid Search

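from haystack import Pipeline
from haystack.components.embedders import OpenAITextEmbedder
from haystack.components.joiners import DocumentJoiner
from haystack.components.rankers import TransformersSimilarityRanker
from haystack.components.retrievers.in_memory import (
    InMemoryBM25Retriever,
    InMemoryEmbeddingRetriever,
)
from haystack.document_stores.in_memory import InMemoryDocumentStore

# InMemoryBM25Retriever only accepts an InMemoryDocumentStore, so this example
# uses in-memory retrieval for both branches. Populate the store with embedded
# documents (see the indexing pipeline) before querying.
document_store = InMemoryDocumentStore()

# Hybrid search pipeline
hybrid_pipeline = Pipeline()

# Add BM25 and embedding retrievers
hybrid_pipeline.add_component("bm25_retriever", InMemoryBM25Retriever(
    document_store=document_store,
    top_k=10
))
hybrid_pipeline.add_component("embedding_retriever", InMemoryEmbeddingRetriever(
    document_store=document_store,
    top_k=10
))
hybrid_pipeline.add_component("text_embedder", OpenAITextEmbedder())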

# Join and rerank
hybrid_pipeline.add_component("joiner", DocumentJoiner())
hybrid_pipeline.add_component("ranker", TransformersSimilarityRanker(
    model="cross-encoder/ms-marco-MiniLM-L-6-v2",
    top_k=5
))

# Connections
hybrid_pipeline.connect("text_embedder.embedding", "embedding_retriever.query_embedding")
hybrid_pipeline.connect("bm25_retriever", "joiner")
hybrid_pipeline.connect("embedding_retriever", "joiner")
hybrid_pipeline.connect("joiner", "ranker")

# Run hybrid search
results = hybrid_pipeline.run({
    "bm25_retriever": {"query": "machine learning"},
    "text_embedder": {"text": "machine learning"},
    "ranker": {"query": "machine learning"}
})
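The pipeline above leaves the final ordering to the cross-encoder ranker. When a reranker is too slow or unavailable, a common alternative for merging two ranked lists is reciprocal rank fusion (RRF), which needs only the rank positions and not the raw scores. The sketch below is a generic RRF implementation, not Haystack's code; `DocumentJoiner` is documented as offering a rank-fusion join mode, but check the API reference for the exact option name. The document IDs are made up.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    # Each list contributes 1 / (k + rank) per document; k damps the
    # advantage of the very top positions.
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, breaking ties by ID for a stable result.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Made-up result lists from the two retrievers, best match first.
bm25_ranking = ["d1", "d2", "d3"]
embedding_ranking = ["d3", "d1", "d4"]

fused = reciprocal_rank_fusion([bm25_ranking, embedding_ranking])
print(fused)
```

Documents found by both retrievers (`d1` and `d3`) rise above those found by only one, which is the behavior hybrid search is after.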

Haystack vs Other Frameworks

Haystack

Best for: Production RAG, semantic search, enterprise deployments

LangChain

Best for: Rapid prototyping, many integrations, agents

LlamaIndex

Best for: Document indexing, simple RAG, quick setup

When to Choose Haystack

Haystack is a good fit when you have a production focus, want a clean architecture, work on team projects, or need a framework built on search expertise.

Best Practices

Separate pipelines: Use different pipelines for indexing and querying
Tune chunking: Experiment with split_length and split_overlap
Use hybrid search: Combine BM25 and semantic retrieval, since each catches matches the other misses
Add reranking: A cross-encoder reranker often improves relevance, at the cost of extra latency
Monitor latency: Profile each component in production
Version pipelines: Save pipeline configs for reproducibility
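For the latency advice, one lightweight approach is to wrap each stage in a timer and compare averages per component. The sketch below uses only the standard library. `LatencyRecorder` and the two fake stages are hypothetical, with `time.sleep` standing in for real retrieval and generation calls; in a real deployment you would more likely rely on your tracing or metrics stack.

```python
import time
from collections import defaultdict
from typing import Any, Callable, Dict, List


class LatencyRecorder:
    """Collects wall-clock durations per named stage."""

    def __init__(self) -> None:
        self.samples: Dict[str, List[float]] = defaultdict(list)

    def timed(self, name: str, fn: Callable[..., Any]) -> Callable[..., Any]:
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                # Record the duration even if the stage raises.
                self.samples[name].append(time.perf_counter() - start)
        return wrapper

    def summary(self) -> Dict[str, float]:
        # Mean duration in seconds for each stage.
        return {name: sum(v) / len(v) for name, v in self.samples.items()}


def fake_retrieve(query: str) -> List[str]:
    time.sleep(0.01)  # stand-in for a retriever call
    return ["doc"]


def fake_generate(docs: List[str]) -> str:
    time.sleep(0.02)  # stand-in for an LLM call
    return "answer"


recorder = LatencyRecorder()
retrieve = recorder.timed("retriever", fake_retrieve)
generate = recorder.timed("llm", fake_generate)

answer = generate(retrieve("example query"))
for name, mean_seconds in recorder.summary().items():
    print(f"{name}: {mean_seconds * 1000:.1f} ms")
```

In most RAG pipelines the generation step dominates, so per-component numbers like these show whether tuning retrieval or reranking is worth the effort.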

