LLMs are great at language. Knowledge graphs are great at facts. Combining them gives you something better than either alone.

Why Knowledge Graphs?

LLMs hallucinate. They’re trained on text, so they’re good at sounding right — but they don’t have a reliable source of truth. A knowledge graph gives you exactly that: a structured, queryable store of facts with explicit relationships.

The pattern I’ve been using: use the LLM for language understanding and generation, use the knowledge graph for factual grounding.

The Architecture

User Query
    ↓
LLM: Extract entities & intent
    ↓
SPARQL query generation
    ↓
Knowledge Graph (Apache Jena / Fuseki)
    ↓
Structured facts
    ↓
LLM: Generate grounded response

Entity Extraction

First, extract entities from the user query:

import json

# `llm` is whatever client you're using — `llm.complete` here just means
# "send a prompt, get the completion text back".
def extract_entities(query: str) -> dict:
    prompt = f"""Extract named entities from this query.
Return JSON with keys: persons, organizations, locations, concepts.

Query: {query}"""
    response = llm.complete(prompt)
    return json.loads(response)
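One gotcha: models often wrap their JSON in markdown code fences, which makes the bare `json.loads` above throw. A tolerant parser helps — this is a sketch (the function name and the fence-stripping behavior are my assumptions; your model's formatting quirks may differ):

```python
import json
import re

def parse_entity_json(response: str) -> dict:
    """Tolerantly parse LLM JSON output (hypothetical helper).
    Strips markdown code fences if present and fills in any
    missing keys with empty lists."""
    # Remove leading ```json / trailing ``` fences the model may add
    cleaned = re.sub(r"^```(?:json)?\s*|\s*```$", "", response.strip())
    data = json.loads(cleaned)
    for key in ("persons", "organizations", "locations", "concepts"):
        data.setdefault(key, [])
    return data
```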

SPARQL Query Generation

Then generate a SPARQL query to fetch relevant facts:

PREFIX schema: <https://schema.org/>

SELECT ?subject ?predicate ?object
WHERE {
  ?subject schema:name "{entity_name}" .
  ?subject ?predicate ?object .
}
LIMIT 20
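Since `{entity_name}` gets interpolated from LLM output, escape it before it lands inside the string literal — an unescaped quote will break the query (or worse, inject into it). A minimal sketch, assuming a hypothetical `build_entity_query` helper; for production, prefer your SPARQL client's parameterized queries:

```python
def build_entity_query(entity_name: str, limit: int = 20) -> str:
    """Build the entity-lookup SPARQL query (hypothetical helper).
    Escapes backslashes and double quotes so the literal stays
    well-formed inside the query string."""
    escaped = entity_name.replace("\\", "\\\\").replace('"', '\\"')
    return f"""PREFIX schema: <https://schema.org/>

SELECT ?subject ?predicate ?object
WHERE {{
  ?subject schema:name "{escaped}" .
  ?subject ?predicate ?object .
}}
LIMIT {limit}"""
```

The resulting string can be POSTed to Fuseki's query endpoint (or passed to a client library such as SPARQLWrapper).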

Grounded Response Generation

Finally, pass the graph results to the LLM as context:

def generate_grounded_response(query: str, facts: list[dict]) -> str:
    facts_text = "\n".join([f"- {f['subject']} {f['predicate']} {f['object']}" for f in facts])
    prompt = f"""Answer the question using ONLY the provided facts.
If the facts don't contain the answer, say so.

Facts:
{facts_text}

Question: {query}"""
    return llm.complete(prompt)
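Chained together, the three steps look something like this. A sketch only — `llm_complete` and `run_sparql` are hypothetical callables standing in for your LLM client and your Fuseki query function:

```python
import json

def answer(query: str, llm_complete, run_sparql) -> str:
    """End-to-end pipeline sketch: extract entities, fetch triples
    for each, then generate a grounded answer. `llm_complete` and
    `run_sparql` are placeholders for your actual clients."""
    entities = json.loads(llm_complete(
        f"Extract named entities from this query.\n"
        f"Return JSON with keys: persons, organizations, locations, concepts.\n\n"
        f"Query: {query}"))
    # Query the graph once per extracted entity, pooling the triples
    facts = []
    for names in entities.values():
        for name in names:
            facts.extend(run_sparql(name))
    facts_text = "\n".join(
        f"- {f['subject']} {f['predicate']} {f['object']}" for f in facts)
    return llm_complete(
        f"Answer the question using ONLY the provided facts.\n"
        f"If the facts don't contain the answer, say so.\n\n"
        f"Facts:\n{facts_text}\n\nQuestion: {query}")
```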

What This Buys You

  • Accuracy: Answers are grounded in verified facts
  • Explainability: You can trace exactly which graph triples informed the answer
  • Updatability: Update the graph without retraining the model

The tradeoff is complexity — you need to maintain the graph, keep it current, and handle the cases where the graph doesn’t have what you need (fall back to RAG or pure LLM).
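The empty-graph fallback can be a small branch in front of the grounded path. A sketch under my own naming (`answer_with_fallback`, `grounded_answer`, and the `[ungrounded]` tag are all assumptions, not a fixed API):

```python
def answer_with_fallback(query: str, facts: list, grounded_answer, llm_complete) -> str:
    """Hypothetical glue: `grounded_answer` is the facts-based path,
    `llm_complete` a plain LLM call used when the graph comes up empty."""
    if not facts:
        # Graph had nothing relevant — fall back to an ungrounded answer,
        # labeled so downstream consumers know it wasn't fact-checked
        return "[ungrounded] " + llm_complete(query)
    return grounded_answer(query, facts)
```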

For domains with well-structured knowledge (medical, legal, enterprise data), this hybrid approach is worth the investment.