Knowledge Graphs for RAG

Last Updated : 10 Nov, 2025

A Knowledge Graph in RAG (Retrieval-Augmented Generation) is a structured representation of information where entities (nodes) and their relationships (edges) are explicitly modeled. It allows a RAG system to retrieve relevant knowledge, understand context, and perform inferential reasoning, enabling the language model to generate more accurate, coherent, and explainable responses.

RAG applications rely on retrieving relevant information to improve the quality of generated responses. Knowledge graphs bring several advantages to this process.

  1. **Structured Knowledge Representation:** Entities and relationships are explicitly modeled, making it easier to retrieve relevant information.
  2. **Contextual Understanding:** Knowledge graphs capture relationships between entities, providing deeper context for responses.
  3. **Inferential Reasoning:** Traversing a knowledge graph lets a RAG system infer information that is not explicitly stated.
  4. **Knowledge Integration:** A knowledge graph can combine information from multiple sources into a unified, structured format.
  5. **Explainability and Transparency:** The reasoning behind a response can be traced back to the graph structure.
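To make the first and third points concrete, here is a minimal sketch of a knowledge graph held as (subject, relation, object) triples in plain Python. The sample facts, the relation names, and the helpers `build_index`, `neighbors`, and `two_hop` are illustrative assumptions rather than part of any library; a real system would keep the graph in a graph database.

```python
from collections import defaultdict

# Illustrative (subject, relation, object) triples; sample data for the sketch.
TRIPLES = [
    ("Elizabeth I", "CHILD_OF", "Henry VIII"),
    ("Henry VIII", "MEMBER_OF", "House of Tudor"),
    ("Elizabeth I", "RULED", "England"),
]

def build_index(triples):
    """Map each subject to its outgoing (relation, object) edges."""
    index = defaultdict(list)
    for subj, rel, obj in triples:
        index[subj].append((rel, obj))
    return index

def neighbors(index, entity):
    """Return the direct (relation, object) edges of an entity."""
    return list(index.get(entity, []))

def two_hop(index, entity):
    """Follow two consecutive edges to surface facts not stated directly."""
    paths = []
    for rel1, mid in neighbors(index, entity):
        for rel2, end in neighbors(index, mid):
            paths.append((entity, rel1, mid, rel2, end))
    return paths

index = build_index(TRIPLES)
paths = two_hop(index, "Elizabeth I")
for path in paths:
    print(" -> ".join(path))
```

No triple links Elizabeth I to the House of Tudor directly, yet the two-hop path through Henry VIII connects them. This is the kind of inference that graph traversal makes available, and the path itself serves as the explanation.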

How Knowledge Graphs Work in RAG Applications

  1. **Data Collection and Preprocessing:** Data is collected from various sources, cleaned, and processed to identify the key entities and relationships for graph creation.
  2. **Knowledge Graph Construction:** Entities become nodes and relationships become edges, each with unique IDs and properties to enable efficient querying and reasoning.
  3. **Storage in a Graph Database:** The graph is stored in a database such as Neo4j, which supports fast searches and traversals, with indexes speeding up keyword and property-based queries.
  4. **Querying and Traversal:** Queries explore the graph by following relationships, helping the system uncover indirect connections and infer new knowledge.
  5. **Integration with RAG:** The knowledge graph gives structured context to the language model, and this context can be combined with embeddings or other data to improve answer accuracy.
  6. **Response Generation:** The language model uses the retrieved knowledge to generate clear, fact-based responses, with the graph supporting accuracy, context, and explainability.
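The querying, integration, and generation steps can be sketched without any external services. The toy graph, the substring-based entity detection in `find_entities`, and the prompt wording below are all assumptions made for illustration; the implementation later in this article uses Neo4j and an LLM for these roles.

```python
# Hypothetical toy graph: entity -> list of (relation, target) edges.
GRAPH = {
    "Elizabeth I": [("CHILD_OF", "Henry VIII"), ("RULED", "England")],
    "Henry VIII": [("MEMBER_OF", "House of Tudor")],
}

def find_entities(question, graph):
    """Naive entity detection: keep graph nodes whose name appears in the question."""
    return [name for name in graph if name.lower() in question.lower()]

def retrieve_facts(question, graph):
    """Querying and traversal: collect one-hop facts for each detected entity."""
    facts = []
    for entity in find_entities(question, graph):
        for relation, target in graph[entity]:
            facts.append(f"{entity} - {relation} -> {target}")
    return facts

def build_prompt(question, graph):
    """Integration with RAG: place the retrieved facts in the prompt as context."""
    context = "\n".join(retrieve_facts(question, graph))
    return f"Answer using only this context:\n{context}\nQuestion: {question}"

question = "Who was the father of Elizabeth I?"
prompt = build_prompt(question, GRAPH)
print(prompt)
```

The resulting prompt string is what would be handed to the language model in the response generation step.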

Step-By-Step Implementation

Here we load Wikipedia documents, split them into chunks, convert them into a knowledge graph in Neo4j, and then build a retrieval-augmented generation pipeline that queries both structured (graph) and unstructured (vector embeddings) data to answer questions.

Step 1: Environment Setup

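```python
import os
from google.colab import userdata

# Read the API key from Colab secrets; "YOUR_API_KEY" is the name of the stored secret.
GOOGLE_API_KEY = userdata.get("YOUR_API_KEY")

# Export the credentials so the LangChain clients can pick them up.
os.environ["GOOGLE_API_KEY"] = GOOGLE_API_KEY
os.environ["NEO4J_URI"] = "YOUR_NEO4J_URI"
os.environ["NEO4J_USERNAME"] = "YOUR_NEO4J_USERNAME"
os.environ["NEO4J_PASSWORD"] = "YOUR_NEO4J_PASSWORD"
```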

Step 2: Load and Split Documents

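```python
from langchain_community.document_loaders import WikipediaLoader
from langchain_text_splitters import TokenTextSplitter

# Fetch the Wikipedia pages that match the query.
raw_docs = WikipediaLoader(query="Elizabeth I").load()

# Split the first three pages into 512-token chunks with a 24-token overlap.
splitter = TokenTextSplitter(chunk_size=512, chunk_overlap=24)
documents = splitter.split_documents(raw_docs[:3])
```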
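To see what `chunk_size` and `chunk_overlap` control, the following sketch applies the same sliding-window idea to a short list of tokens. It is an approximation for illustration: whitespace-separated words stand in for real tokenizer output, and `split_with_overlap` is a hypothetical helper, not the library's implementation.

```python
def split_with_overlap(tokens, chunk_size, chunk_overlap):
    """Slide a window of chunk_size tokens, stepping by chunk_size - chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end of the text
    return chunks

# Whitespace "tokens" stand in for real tokenizer output (an assumption).
tokens = "a b c d e f g h i j".split()
chunks = split_with_overlap(tokens, chunk_size=4, chunk_overlap=1)
for chunk in chunks:
    print(chunk)
```

Each chunk repeats the last token of the previous one, so a fact that straddles a chunk boundary still appears intact in at least one chunk.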

Step 3: Initialize LLM and Graph Transformer

```python
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_experimental.graph_transformers import LLMGraphTransformer
from langchain_community.graphs import Neo4jGraph

# temperature=0 keeps the extraction output as repeatable as possible.
llm = ChatGoogleGenerativeAI(model="gemini-2.5-flash", temperature=0)

# Use the LLM to extract entities and relationships from each chunk.
llm_transformer = LLMGraphTransformer(llm=llm)
graph_docs = llm_transformer.convert_to_graph_documents(documents)

# Connect using the NEO4J_* environment variables and store the graph.
graph = Neo4jGraph()
graph.add_graph_documents(graph_docs, baseEntityLabel=True, include_source=True)
```
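Conceptually, storing a graph document means turning each extracted relationship into a write against the database. The sketch below shows one way that could look, assuming simplified `Node` and `Relationship` stand-ins and a hypothetical `to_cypher` helper. It is not how `add_graph_documents` is implemented, only an illustration of the node and edge shape that ends up in Neo4j.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    id: str     # unique identifier, e.g. the entity name
    label: str  # node label, e.g. "Person"

@dataclass(frozen=True)
class Relationship:
    source: Node
    target: Node
    type: str   # relationship type, e.g. "MEMBER_OF"

def to_cypher(rel):
    """Build a parameterized MERGE statement for one relationship.

    Labels and relationship types cannot be Cypher parameters, so they are
    interpolated into the query text; the ids are passed as parameters.
    """
    query = (
        f"MERGE (a:`{rel.source.label}` {{id: $source_id}}) "
        f"MERGE (b:`{rel.target.label}` {{id: $target_id}}) "
        f"MERGE (a)-[:`{rel.type}`]->(b)"
    )
    params = {"source_id": rel.source.id, "target_id": rel.target.id}
    return query, params

elizabeth = Node(id="Elizabeth I", label="Person")
tudor = Node(id="House of Tudor", label="Organization")
query, params = to_cypher(Relationship(elizabeth, tudor, "MEMBER_OF"))
print(query)
print(params)
```

`MERGE` creates a node or relationship only if it does not already exist, so running the same statement twice does not duplicate data. Because labels are interpolated as text, a real pipeline would need to validate them before building the query.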

Step 4: Create Vector Index

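```python
from langchain_community.vectorstores import Neo4jVector
from langchain_google_genai import GoogleGenerativeAIEmbeddings

# Embed the text of the existing Document nodes and index them for
# hybrid (vector + keyword) search.
vector_index = Neo4jVector.from_existing_graph(
    # The model name is an assumption; use any embedding model available to your account.
    GoogleGenerativeAIEmbeddings(model="models/gemini-embedding-001"),
    search_type="hybrid",
    node_label="Document",
    text_node_properties=["text"],
    embedding_node_property="embedding",
)
```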
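The idea behind `search_type="hybrid"` is to combine semantic similarity with keyword matching so that a document missed by one signal can still be found by the other. The sketch below illustrates that idea with hand-made two-dimensional vectors and a simple word-overlap score. The scoring rule (taking the higher of the two scores) and all function names are assumptions for illustration and should not be read as Neo4j's actual ranking.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def keyword_score(query, text):
    """Fraction of query words that appear in the text."""
    q_words = set(query.lower().split())
    t_words = set(text.lower().split())
    return len(q_words & t_words) / len(q_words) if q_words else 0.0

def hybrid_search(query, query_vec, docs, k=2):
    """Rank documents by the higher of their vector and keyword scores."""
    scored = []
    for text, vec in docs:
        score = max(cosine(query_vec, vec), keyword_score(query, text))
        scored.append((score, text))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]

# Hypothetical 2-dimensional "embeddings" chosen by hand for the demo.
docs = [
    ("Elizabeth I was queen of England", [1.0, 0.0]),
    ("The Tudor dynasty ruled England", [0.0, 1.0]),
    ("Neo4j is a graph database", [0.6, 0.8]),
]
top = hybrid_search("Tudor dynasty", [1.0, 0.0], docs, k=2)
print(top)
```

In this toy data, the second document has no vector similarity to the query but matches both query words, so the keyword signal brings it into the top results.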

Step 5: Entity Extraction

```python
from pydantic import BaseModel, Field
from langchain_core.prompts import ChatPromptTemplate

class Entities(BaseModel):
    """Schema for the entities the LLM should return."""
    names: list[str] = Field(..., description="Person/organization entities in text")

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are extracting organization and person entities from the text."),
    ("human", "Use the given format to extract info: {question}"),
])

# The LLM returns an Entities object instead of free text.
entity_chain = prompt | llm.with_structured_output(Entities)
```

Step 6: Structured Retrieval

```python
from langchain_community.vectorstores.neo4j_vector import remove_lucene_chars

# The retriever below queries a full-text index named 'entity', so create it first.
# Assumes entity nodes carry the `__Entity__` label added by baseEntityLabel=True.
graph.query("CREATE FULLTEXT INDEX entity IF NOT EXISTS FOR (e:__Entity__) ON EACH [e.id]")

def generate_full_text_query(text: str) -> str:
    """Build a fuzzy Lucene query: every word must match within two edits."""
    words = [el for el in remove_lucene_chars(text).split() if el]
    return " AND ".join(f"{w}~2" for w in words)

def structured_retriever(question: str) -> str:
    """Return the graph neighborhood of every entity mentioned in the question."""
    lines = []
    entities = entity_chain.invoke({"question": question}).names
    for entity in entities:
        resp = graph.query(
            """
            CALL db.index.fulltext.queryNodes('entity', $query, {limit: 2})
            YIELD node, score
            CALL {
              WITH node
              MATCH (node)-[r:!MENTIONS]->(neighbor)
              RETURN node.id + ' - ' + type(r) + ' -> ' + neighbor.id AS output
              UNION ALL
              WITH node
              MATCH (node)<-[r:!MENTIONS]-(neighbor)
              RETURN neighbor.id + ' - ' + type(r) + ' -> ' + node.id AS output
            }
            RETURN output LIMIT 50
            """,
            {"query": generate_full_text_query(entity)},
        )
        lines.extend(el["output"] for el in resp)
    return "\n".join(lines)
```
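To see what the full-text query string looks like, here is a standalone sketch of `generate_full_text_query`. The `strip_lucene_chars` function is a simplified stand-in written for this example (an assumption), so that the snippet runs without LangChain or a database.

```python
import re

def strip_lucene_chars(text):
    """Stand-in for remove_lucene_chars: replace Lucene special characters with spaces."""
    return re.sub(r'[+\-&|!(){}\[\]^"~*?:\\/]', " ", text).strip()

def generate_full_text_query(text):
    """Require every word (AND) and allow up to two character edits per word (~2)."""
    words = [w for w in strip_lucene_chars(text).split() if w]
    return " AND ".join(f"{w}~2" for w in words)

query = generate_full_text_query("Elizabeth I (queen)")
print(query)  # Elizabeth~2 AND I~2 AND queen~2
```

The `~2` suffix is Lucene's fuzzy operator, so a slightly misspelled name can still match an entity node, while `AND` requires all words of the name to be present.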

Step 7: Combined Retriever

```python
def retriever(question: str) -> str:
    """Combine graph facts and vector-search passages into one context string."""
    structured = structured_retriever(question)
    unstructured = [d.page_content for d in vector_index.similarity_search(question)]
    return (
        f"Structured data:\n{structured}\n"
        f"Unstructured data:\n{'#Document '.join(unstructured)}"
    )
```

Step 8: Condense Follow-Up Questions

```python
from langchain_core.prompts.prompt import PromptTemplate
from langchain_core.runnables import RunnableLambda, RunnableBranch, RunnablePassthrough
from langchain_core.messages import HumanMessage, AIMessage
from langchain_core.output_parsers import StrOutputParser

CONDENSE_QUESTION_PROMPT = PromptTemplate.from_template(
    """Given chat history and follow-up question, rewrite it as a standalone question.
Chat History:
{chat_history}
Follow Up Input: {question}
Standalone question:"""
)

def _format_chat_history(chat_history):
    """Convert (human, ai) tuples into a flat list of message objects."""
    buffer = []
    for human, ai in chat_history:
        buffer.append(HumanMessage(content=human))
        buffer.append(AIMessage(content=ai))
    return buffer

_search_query = RunnableBranch(
    # With chat history: rewrite the follow-up into a standalone question.
    (
        RunnableLambda(lambda x: bool(x.get("chat_history"))),
        RunnablePassthrough.assign(
            chat_history=lambda x: _format_chat_history(x["chat_history"])
        )
        | CONDENSE_QUESTION_PROMPT
        | llm
        | StrOutputParser(),
    ),
    # Without chat history: pass the question through unchanged.
    RunnableLambda(lambda x: x["question"]),
)
```
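The branching logic can be easier to follow in plain Python. In the sketch below, `condense` mirrors the two branches, and `fake_rewrite` is a hypothetical stand-in for the LLM call that only knows how to resolve the pronoun "she". Both names are assumptions introduced for this example.

```python
def condense(question, chat_history, rewrite):
    """Return a standalone question.

    `rewrite` is any callable that turns (history_text, question) into a
    standalone question; in the real chain this role is played by the LLM.
    """
    if not chat_history:
        return question  # no history: pass the question through unchanged
    history_text = "\n".join(f"Human: {h}\nAI: {a}" for h, a in chat_history)
    return rewrite(history_text, question)

def fake_rewrite(history_text, question):
    """Hypothetical stand-in for the LLM: resolve 'she' using the history."""
    subject = "Elizabeth I" if "Elizabeth I" in history_text else "she"
    return question.replace("she", subject)

first = condense("Which house did Elizabeth I belong to?", [], fake_rewrite)
follow_up = condense(
    "When was she born?",
    [("Which house did Elizabeth I belong to?", "House Of Tudor")],
    fake_rewrite,
)
print(first)
print(follow_up)
```

Rewriting the follow-up matters because the retriever sees only the question text: "When was she born?" contains no entity to look up in the graph, while the standalone version does.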

Step 9: RAG QA Chain

```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableParallel

template = ChatPromptTemplate.from_template(
    """Answer the question based only on the following context:
{context}

Question: {question}
Use natural language and be concise.
Answer:"""
)

# Build the context from the (possibly condensed) question, then answer with the LLM.
chain = (
    RunnableParallel(
        {
            "context": _search_query | retriever,
            "question": RunnablePassthrough(),
        }
    )
    | template
    | llm
    | StrOutputParser()
)

# Example queries
print(chain.invoke({"question": "Which house did Elizabeth I belong to?"}))
print(chain.invoke({
    "question": "When was she born?",
    "chat_history": [("Which house did Elizabeth I belong to?", "House Of Tudor")],
}))
```

Applications

Challenges and Limitations

Integrating Knowledge Graphs into RAG also has challenges:

Difference Between RAG and Knowledge Graph RAG

Here we compare Traditional RAG with Knowledge Graph RAG.

| Parameters | RAG (Retrieval-Augmented Generation) | Knowledge Graph Enhanced RAG |
|---|---|---|
| Core Approach | Retrieve information from unstructured text and generate responses | Combine Knowledge Graphs with RAG to enable structured |
| Knowledge Type | Works with unstructured documents or text chunks | Uses both structured and unstructured information |
| Retrieval Method | Performs vector similarity search using embeddings | Uses hybrid retrieval with graph queries and vector search |
| Data Representation | Stores text in chunks without explicit relationships | Represents data as nodes and edges |
| Context Understanding | Limited to text similarity and surface-level meaning | Captures deep semantic and relational context between entities |
| Reasoning Capability | Retrieves facts but cannot infer new relationships | Enables multi-hop reasoning and logical inference across connected nodes |