Knowledge Graphs for AI Agents: From Entity Extraction to Traversal

How to build and query knowledge graphs that give your AI agents structured understanding of entities, relationships, and multi-hop reasoning.

Why Agents Need Knowledge Graphs

Vector search answers the question: "what memories are similar to this query?" Knowledge graphs answer a fundamentally different question: "how do these things relate to each other?"

Consider a personal assistant agent that knows:

  • "Alice is the CTO of Acme Corp"
  • "Acme Corp is evaluating our Enterprise plan"
  • "Bob reports to Alice"
  • "Bob mentioned concerns about data residency"

With only vector search, the query "who at Acme Corp has concerns about our product?" might surface the Bob memory if the embeddings align. But it can't reliably traverse from Acme Corp to its employees to their concerns. A knowledge graph makes this traversal explicit and deterministic.
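As a toy illustration, here is a minimal in-memory sketch of that multi-hop lookup. The edge list encodes the four facts above; the `neighbors` helper and the edge labels are illustrative, not Dakera's API:

```python
# Toy edge list encoding the four facts above.
edges = [
    ("Alice", "works_at", "Acme Corp"),
    ("Acme Corp", "evaluating", "Enterprise plan"),
    ("Bob", "reports_to", "Alice"),
    ("Bob", "mentioned", "data residency concerns"),
]

def neighbors(entity):
    """All entities one hop away, in either direction."""
    outgoing = [(t, e) for s, e, t in edges if s == entity]
    incoming = [(s, e) for s, e, t in edges if t == entity]
    return outgoing + incoming

# "Who at Acme Corp has concerns?" becomes a deterministic walk:
# Acme Corp -> (incoming works_at) Alice -> (incoming reports_to) Bob
people_at_acme = [n for n, e in neighbors("Acme Corp") if e == "works_at"]
team = people_at_acme + [
    n for p in people_at_acme for n, e in neighbors(p) if e == "reports_to"
]
concerns = {
    person: [n for n, e in neighbors(person) if e == "mentioned"]
    for person in team
}
print(concerns)  # Bob surfaces via the Acme Corp -> Alice -> Bob path
```

No embedding similarity is involved: the answer falls out of following edges, which is exactly what the graph makes deterministic.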

Dakera's Knowledge Graph Model

Dakera's knowledge graph is designed specifically for agent use cases — lightweight, fast to write, and optimized for the kinds of traversals agents need. It uses a simple triple model: source entity, edge type, target entity.

Entity Model

Entities are identified by a type and a name. The type gives the graph structure; the name gives it specificity:

{
  "type": "person",    // person, company, project, concept, tool, location...
  "name": "Alice Chen" // unique within type+namespace
}

Edge Types

Dakera supports four built-in edge types that cover the vast majority of agent relationships:

Edge Type     Meaning                    Example
relates_to    General association        Alice relates_to Project Alpha
works_at      Employment/membership      Alice works_at Acme Corp
part_of       Containment/hierarchy      Auth Service part_of Platform
depends_on    Dependency relationship    Frontend depends_on Auth API

Building the Graph: Entity Extraction

There are two approaches to populating a knowledge graph: explicit insertion and automatic extraction.

Explicit Insertion

When your agent discovers a relationship through conversation or tool use, it can explicitly add it to the graph:

from dakera import Dakera

client = Dakera(base_url="http://localhost:3300")

# Agent learns about a relationship during conversation
client.knowledge_graph.add_edge(
    namespace="crm",
    source={"type": "person", "name": "Alice Chen"},
    target={"type": "company", "name": "Acme Corp"},
    edge_type="works_at",
    metadata={
        "role": "CTO",
        "discovered": "2026-05-16",
        "confidence": 0.95,
        "source": "user_conversation"
    }
)

# Add another relationship
client.knowledge_graph.add_edge(
    namespace="crm",
    source={"type": "person", "name": "Bob Martinez"},
    target={"type": "person", "name": "Alice Chen"},
    edge_type="relates_to",
    metadata={
        "relationship": "reports_to",
        "discovered": "2026-05-16"
    }
)

Extraction from Memories

For agents that process large amounts of unstructured text, you can extract entities and relationships as memories are ingested. A common pattern is to use an LLM to identify entities in each memory, then write them to the graph:

import json
from dakera import Dakera

client = Dakera(base_url="http://localhost:3300")

def ingest_with_extraction(namespace: str, content: str, llm):
    """Store memory and extract entities into knowledge graph."""

    # Store the raw memory
    memory = client.memory.add(namespace=namespace, content=content)

    # Use LLM to extract entities and relationships
    extraction_prompt = f"""Extract entities and relationships from this text.
    Return JSON: {{"entities": [{{"type": "...", "name": "..."}}],
                   "relationships": [{{"source": "...", "target": "...",
                                       "edge_type": "relates_to|works_at|part_of|depends_on"}}]}}

    Text: {content}"""

    result = json.loads(llm.complete(extraction_prompt))

    # Write extracted relationships to graph, skipping any whose
    # endpoints the LLM failed to list as entities
    for rel in result["relationships"]:
        source_entity = next((e for e in result["entities"] if e["name"] == rel["source"]), None)
        target_entity = next((e for e in result["entities"] if e["name"] == rel["target"]), None)
        if source_entity is None or target_entity is None:
            continue

        client.knowledge_graph.add_edge(
            namespace=namespace,
            source=source_entity,
            target=target_entity,
            edge_type=rel["edge_type"],
            metadata={"source_memory_id": memory.id}
        )

# Usage
ingest_with_extraction(
    "project-notes",
    "Alice from engineering confirmed that the Auth Service depends on Redis for session storage",
    llm=my_llm_client
)
# Creates: Alice -works_at-> Engineering
#          Auth Service -depends_on-> Redis

Graph Traversal

The primary operation on a knowledge graph is traversal — starting from one entity and following edges to discover related information.

Single-Hop Traversal

# Find all entities directly connected to Alice
connections = client.knowledge_graph.traverse(
    namespace="crm",
    start={"type": "person", "name": "Alice Chen"},
    max_depth=1
)

# Returns:
# [
#   {"entity": {"type": "company", "name": "Acme Corp"},
#    "edge_type": "works_at", "direction": "outgoing"},
#   {"entity": {"type": "person", "name": "Bob Martinez"},
#    "edge_type": "relates_to", "direction": "incoming"}
# ]

Multi-Hop Traversal

# Find everything within 2 hops of Acme Corp
network = client.knowledge_graph.traverse(
    namespace="crm",
    start={"type": "company", "name": "Acme Corp"},
    max_depth=2
)

# Returns: Acme Corp -> Alice Chen -> Bob Martinez
#          Acme Corp -> Enterprise Plan (deal)
#          Alice Chen -> Project Alpha

Filtered Traversal

# Only follow "depends_on" edges to map service dependencies
dependencies = client.knowledge_graph.traverse(
    namespace="infrastructure",
    start={"type": "service", "name": "API Gateway"},
    edge_type="depends_on",
    max_depth=3
)

# Returns the full dependency tree:
# API Gateway -> Auth Service -> Redis
# API Gateway -> Auth Service -> PostgreSQL
# API Gateway -> Rate Limiter -> Redis

Combining Graph + Vector Search

The real power emerges when you combine knowledge graph traversal with vector search. The graph tells you what entities are relevant, and vector search finds detailed memories about those entities:

def graph_enhanced_search(namespace: str, query: str, start_entity: dict):
    """Use graph traversal to expand context, then search memories."""

    # Step 1: Find related entities via graph
    related = client.knowledge_graph.traverse(
        namespace=namespace,
        start=start_entity,
        max_depth=2
    )

    # Step 2: Build an expanded query using entity names
    entity_names = [r["entity"]["name"] for r in related]
    expanded_context = f"{query} (related: {', '.join(entity_names)})"

    # Step 3: Search memories with graph-informed context
    results = client.memory.search(
        namespace=namespace,
        query=expanded_context,
        limit=10
    )

    return {
        "graph_context": related,
        "memories": results
    }

# Example: "What issues does Acme Corp have?"
# Graph expansion finds Alice, Bob, Enterprise Plan
# Vector search uses these as additional context
answer = graph_enhanced_search(
    "crm",
    "what issues does Acme Corp have?",
    {"type": "company", "name": "Acme Corp"}
)

Graph Maintenance

Deduplication

Agents may discover the same relationship multiple times. Dakera handles this by treating edges as upserts — if the same source, target, and edge_type already exist, the metadata is updated rather than creating a duplicate edge.
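A minimal in-memory sketch of those upsert semantics, assuming the (source, target, edge_type) triple is the edge's identity and later metadata writes win field by field (the `add_edge` helper here is illustrative, not Dakera's implementation):

```python
# Edges keyed by identity triple: re-adding merges metadata
# instead of creating a duplicate edge.
graph = {}

def add_edge(source, target, edge_type, metadata=None):
    key = (source, target, edge_type)
    existing = graph.get(key, {})
    existing.update(metadata or {})  # later writes win per metadata field
    graph[key] = existing

# The agent discovers the same relationship twice:
add_edge("Alice Chen", "Acme Corp", "works_at", {"confidence": 0.7})
add_edge("Alice Chen", "Acme Corp", "works_at",
         {"confidence": 0.95, "role": "CTO"})

print(len(graph))  # 1 — one edge, not two
```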

Confidence Scoring

Not all extracted relationships are equally reliable. Store confidence scores in metadata and filter on them during traversal:

# Store with confidence
client.knowledge_graph.add_edge(
    namespace="research",
    source={"type": "company", "name": "StartupXYZ"},
    target={"type": "person", "name": "Jane Doe"},
    edge_type="works_at",
    metadata={"confidence": 0.6, "source": "inferred_from_linkedin"}
)

# Only traverse high-confidence edges
reliable_network = client.knowledge_graph.traverse(
    namespace="research",
    start={"type": "company", "name": "StartupXYZ"},
    max_depth=2,
    metadata_filter={"confidence": {"$gt": 0.8}}
)

Temporal Edges

Relationships change over time. Alice might leave Acme Corp. Store temporal metadata to track this:

# Mark the old relationship as ended
client.knowledge_graph.update_edge(
    namespace="crm",
    source={"type": "person", "name": "Alice Chen"},
    target={"type": "company", "name": "Acme Corp"},
    edge_type="works_at",
    metadata={"ended": "2026-05-01", "active": False}
)

# Add the new relationship
client.knowledge_graph.add_edge(
    namespace="crm",
    source={"type": "person", "name": "Alice Chen"},
    target={"type": "company", "name": "NewCo"},
    edge_type="works_at",
    metadata={"started": "2026-05-15", "active": True, "role": "VP Engineering"}
)

Real-World Patterns

Customer Intelligence Graph

Map relationships between contacts, companies, deals, and products. Agents can answer "who at Company X has authority over purchasing decisions?" by traversing org-chart edges.

Infrastructure Dependency Graph

Map service dependencies so agents can answer "what will break if Redis goes down?" by traversing depends_on edges from Redis outward.
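The blast-radius question amounts to a reverse walk over depends_on edges. A toy in-memory version, using the dependency tree from the filtered-traversal example earlier in the post (the `blast_radius` helper is illustrative):

```python
# depends_on edges from the API Gateway example: (service, dependency)
depends_on = [
    ("API Gateway", "Auth Service"),
    ("Auth Service", "Redis"),
    ("Auth Service", "PostgreSQL"),
    ("API Gateway", "Rate Limiter"),
    ("Rate Limiter", "Redis"),
]

def blast_radius(failed):
    """Everything that transitively depends on a failed component."""
    affected, frontier = set(), {failed}
    while frontier:
        dependents = {s for s, d in depends_on if d in frontier}
        frontier = dependents - affected
        affected |= dependents
    return affected

print(sorted(blast_radius("Redis")))
# ['API Gateway', 'Auth Service', 'Rate Limiter']
```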

Project Knowledge Graph

Connect people to projects to decisions to code components. Agents can trace "who decided to use PostgreSQL for the auth service?" through the relationship chain.

Performance Considerations

Dakera's knowledge graph is stored alongside the memory index — no separate database needed. For typical agent workloads (tens of thousands of entities, hundreds of thousands of edges), traversal is sub-millisecond. The graph is held in memory with persistence to the same encrypted storage as memories.

For very large graphs (millions of edges), limit traversal depth to 2-3 hops and use edge type filters to prune the search space. Deep traversals on unfiltered graphs scale with the branching factor, which can explode at depth 4+.
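The branching-factor math is easy to check: an unfiltered traversal with average branching factor b touches on the order of b + b² + … + b^d nodes at depth d. Quick arithmetic with an assumed b of 10:

```python
# Nodes touched by an unfiltered traversal with average branching
# factor b at depth d: b + b^2 + ... + b^d.
def nodes_touched(b, d):
    return sum(b ** k for k in range(1, d + 1))

for depth in (2, 3, 4, 5):
    print(f"depth {depth}: {nodes_touched(10, depth):,} nodes")
# At b=10, depth 4 already touches over 11,000 nodes, and each
# extra hop multiplies the work by roughly another factor of b.
```

An edge-type filter effectively shrinks b, which is why filtered traversals stay cheap at depths where unfiltered ones explode.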

Try Dakera Today

Single binary, zero dependencies, 87.6% on the LoCoMo benchmark.

Get Started