What is GraphRAG?

GraphRAG enhances traditional RAG by structuring knowledge as a graph. Instead of retrieving only text chunks, it models the relationships between entities, enabling complex queries that span multiple connected concepts.

  • Entity relationships: Understand how concepts connect
  • Multi-hop reasoning: Answer questions requiring multiple facts
  • Global understanding: Summarize across entire document sets
  • Structured retrieval: Query by relationship, not just similarity
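
The multi-hop idea can be illustrated without any graph database. Below is a toy sketch: a plain adjacency list of typed edges, traversed hop by hop (the entity names are invented for illustration).

```python
# Toy knowledge graph: each node maps to a list of (relation, target) edges.
GRAPH = {
    "Alice": [("WORKS_FOR", "Acme Corp")],
    "Acme Corp": [("ACQUIRED", "Widget Inc")],
    "Widget Inc": [("BASED_IN", "Berlin")],
}

def multi_hop(start: str, hops: int) -> list[list[str]]:
    """Collect every relation path of up to `hops` edges from `start`."""
    paths = []
    frontier = [(start, [])]
    for _ in range(hops):
        next_frontier = []
        for node, path in frontier:
            for relation, target in GRAPH.get(node, []):
                new_path = path + [relation, target]
                paths.append([start] + new_path)
                next_frontier.append((target, new_path))
        frontier = next_frontier
    return paths
```

A question like "Where is the company acquired by Alice's employer based?" requires three hops; `multi_hop("Alice", 3)` surfaces the full chain, which pure chunk similarity would likely miss.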

GraphRAG vs Traditional RAG

Traditional RAG

Retrieves similar text chunks. Struggles with "What are all the relationships between X and Y?"

GraphRAG

Traverses knowledge graph. Excels at relational and summarization queries.

Building a Knowledge Graph

from langchain_openai import ChatOpenAI
from langchain_experimental.graph_transformers import LLMGraphTransformer
from langchain_community.graphs import Neo4jGraph

# Initialize graph database
graph = Neo4jGraph(
    url="bolt://localhost:7687",
    username="neo4j",
    password="password"
)

# Create graph transformer
llm = ChatOpenAI(model="gpt-4", temperature=0)
transformer = LLMGraphTransformer(llm=llm)

# Extract entities and relationships from documents
documents = [...]  # Your documents
graph_documents = transformer.convert_to_graph_documents(documents)

# Store in Neo4j
graph.add_graph_documents(graph_documents)

# Query the graph
result = graph.query("""
    MATCH (p:Person)-[:WORKS_FOR]->(c:Company)
    WHERE c.name = 'Acme Corp'
    RETURN p.name, p.role
""")

Microsoft GraphRAG

# Install Microsoft GraphRAG
pip install graphrag

# Initialize project
python -m graphrag.index --init --root ./my_project

# Index documents
python -m graphrag.index --root ./my_project

# Query
python -m graphrag.query \
    --root ./my_project \
    --method global \
    --query "What are the main themes in the documents?"

# Python usage
from graphrag.query.llm.oai.chat_openai import ChatOpenAI
from graphrag.query.structured_search.global_search.search import GlobalSearch

# Global search for high-level summarization.
# context_builder is assumed to be a context builder (e.g. a
# GlobalCommunityContext) constructed from the community reports
# produced during indexing; its setup is omitted here.
global_search = GlobalSearch(
    llm=ChatOpenAI(model="gpt-4"),
    context_builder=context_builder,
    response_type="multiple paragraphs"
)

# asearch is a coroutine: call it from an async function or event loop
result = await global_search.asearch("What are the key findings?")
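
Conceptually, global search is a map-reduce over the community summaries built at indexing time: each summary is queried independently, then the partial answers are combined. The sketch below shows that shape with a stub in place of the model call; it is not the graphrag API.

```python
# Community summaries as produced at indexing time (illustrative text).
community_summaries = [
    "Community 1 discusses supply-chain risk across vendors.",
    "Community 2 discusses data-privacy compliance.",
]

def stub_llm(prompt: str) -> str:
    # Placeholder for a chat-model call: echoes the last prompt line.
    return prompt.splitlines()[-1]

def global_search_sketch(question: str, summaries: list[str]) -> str:
    # Map: ask what each community's summary says about the question.
    partials = [stub_llm(f"Question: {question}\n{s}") for s in summaries]
    # Reduce: merge the partial answers into one response.
    return " ".join(partials)
```

Because every community is consulted, this style of search answers corpus-wide "main themes" questions that top-k chunk retrieval cannot.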

Neo4j + LangChain GraphRAG

from langchain_community.graphs import Neo4jGraph
from langchain.chains import GraphCypherQAChain
from langchain_openai import ChatOpenAI

# Connect to Neo4j
graph = Neo4jGraph(
    url="bolt://localhost:7687",
    username="neo4j",
    password="password"
)

# Create QA chain that generates Cypher queries
# (recent LangChain versions may also require allow_dangerous_requests=True,
# since the chain executes LLM-generated Cypher against your database)
chain = GraphCypherQAChain.from_llm(
    llm=ChatOpenAI(model="gpt-4"),
    graph=graph,
    verbose=True,
    return_intermediate_steps=True
)

# Natural language to graph query
result = chain.invoke({
    "query": "Who are all the people connected to Project Alpha?"
})

print(result["result"])
# The LLM generates: MATCH (p:Person)-[:WORKS_ON]->(proj:Project {name: 'Project Alpha'}) RETURN p

Hybrid Vector + Graph Retrieval

class HybridGraphRAG:
    def __init__(self):
        self.vector_store = Chroma(...)
        self.graph = Neo4jGraph(...)
        self.llm = ChatOpenAI(model="gpt-4")

    def retrieve(self, query: str) -> dict:
        # 1. Vector search for relevant chunks
        vector_results = self.vector_store.similarity_search(query, k=3)

        # 2. Extract entities from the query (e.g. with an LLM or NER
        # model; implementation not shown)
        entities = self.extract_entities(query)

        # 3. Graph traversal for relationships. Pass the entity name as a
        # Cypher parameter instead of interpolating it into the query
        # string, which would invite injection and quoting bugs.
        graph_results = []
        for entity in entities:
            neighbors = self.graph.query(
                """
                MATCH (e {name: $name})-[r]-(connected)
                RETURN e, type(r) AS relation, connected
                LIMIT 10
                """,
                params={"name": entity},
            )
            graph_results.extend(neighbors)

        return {
            "text_context": [doc.page_content for doc in vector_results],
            "graph_context": graph_results
        }

    def answer(self, query: str) -> str:
        context = self.retrieve(query)

        prompt = f"""Use both the text and graph context to answer.

Text Context:
{context['text_context']}

Graph Context (Entity Relationships):
{context['graph_context']}

Question: {query}"""

        return self.llm.invoke(prompt).content
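
To see the hybrid idea run without Neo4j or an embedding service, here is a self-contained toy: keyword overlap stands in for vector similarity, and a dict of typed edges stands in for the graph. All names are invented for illustration.

```python
# Toy corpus and toy knowledge graph.
CHUNKS = {
    "doc1": "Alice leads Project Alpha at Acme Corp",
    "doc2": "Quarterly revenue grew in the retail segment",
}
EDGES = {
    "Alice": [("WORKS_ON", "Project Alpha")],
    "Project Alpha": [("FUNDED_BY", "Acme Corp")],
}

def hybrid_retrieve(query: str) -> dict:
    terms = set(query.lower().split())
    # 1. "Vector" step: rank chunks by term overlap with the query.
    text_hits = sorted(
        CHUNKS,
        key=lambda d: len(terms & set(CHUNKS[d].lower().split())),
        reverse=True,
    )
    # 2. Entity step: query tokens that appear as graph nodes.
    entities = [t for t in query.split() if t in EDGES]
    # 3. Graph step: one-hop neighbours of each matched entity.
    graph_hits = [(e, r, t) for e in entities for r, t in EDGES[e]]
    return {
        "text_context": [CHUNKS[text_hits[0]]],
        "graph_context": graph_hits,
    }
```

The returned dict has the same two-part shape the class above feeds into its prompt: free text for grounding plus explicit entity relationships.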

Use Cases

Enterprise Knowledge

Map organizational relationships, projects, and dependencies.

Research Analysis

Connect research papers, authors, and citations.

Customer 360

Unified view of customer interactions and history.

Compliance

Track regulatory relationships and requirements.

Best Practices

  • Start with clear schema: Define entity types and relationships upfront
  • Validate extractions: LLM entity extraction isn't perfect
  • Combine approaches: Use vector + graph for best results
  • Index appropriately: Create graph indexes for common queries
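
The first two practices can be enforced mechanically: declare the schema up front, then reject any extracted triple that falls outside it. A minimal sketch (the schema and triples here are illustrative):

```python
# Declared schema: allowed node labels and allowed typed relationships.
SCHEMA = {
    "nodes": {"Person", "Company", "Project"},
    "relations": {
        ("Person", "WORKS_FOR", "Company"),
        ("Person", "WORKS_ON", "Project"),
    },
}

def validate(triples):
    """Split LLM-extracted (head_label, relation, tail_label) triples
    into those the schema accepts and those it rejects."""
    ok, bad = [], []
    for head, rel, tail in triples:
        if (
            head in SCHEMA["nodes"]
            and tail in SCHEMA["nodes"]
            and (head, rel, tail) in SCHEMA["relations"]
        ):
            ok.append((head, rel, tail))
        else:
            bad.append((head, rel, tail))
    return ok, bad
```

Rejected triples are worth logging and reviewing: they reveal both extraction noise and legitimate relationship types your schema is missing.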

Master Advanced RAG Techniques

Our Agentic AI program covers GraphRAG and advanced retrieval patterns.
