
Building Multi-Agent Memory Systems: Architecture Patterns

How to design memory architectures where multiple AI agents share context, maintain isolation, and collaborate through a unified memory layer.

The Multi-Agent Memory Problem

Modern AI applications rarely consist of a single agent. A customer support system might have a triage agent, a technical agent, and a billing agent. A coding assistant might have a planning agent, an implementation agent, and a review agent. Each agent needs its own working memory, but they also need to share relevant context.

The naive approach — giving every agent access to every memory — creates noise. A billing agent doesn't need to see debugging sessions. A planning agent doesn't need to know about code formatting preferences. The challenge is designing a memory architecture that enables collaboration without creating an undifferentiated soup of context.

Pattern 1: Namespace Isolation with Shared Layers

The most common pattern uses separate namespaces for each agent's private memory, plus a shared namespace for cross-agent context:

from dakera import Dakera

client = Dakera(base_url="http://localhost:3300")

# Each agent has a private namespace
TRIAGE_NS = "agent-triage"
TECH_NS = "agent-technical"
BILLING_NS = "agent-billing"

# Plus a shared namespace for cross-agent context
SHARED_NS = "shared-customer-context"

# Triage agent stores its observations
client.memory.add(
    namespace=TRIAGE_NS,
    content="Customer expressed frustration about repeated billing errors",
    metadata={"customer_id": "cust_42", "sentiment": "negative"}
)

# Triage also writes to shared for other agents
client.memory.add(
    namespace=SHARED_NS,
    content="Customer cust_42 has ongoing billing issue, handle with care",
    metadata={"customer_id": "cust_42", "source_agent": "triage"}
)

When the billing agent picks up the conversation, it searches both its own namespace and the shared one:

# Billing agent retrieves context from both namespaces
private_context = client.memory.search(
    namespace=BILLING_NS,
    query="customer cust_42 billing history",
    limit=5
)

shared_context = client.memory.search(
    namespace=SHARED_NS,
    query="customer cust_42",
    limit=3,
    metadata_filter={"customer_id": "cust_42"}
)
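The two result sets then need to be merged into one ranked context window. A minimal sketch of that merge, assuming each result exposes a content string, a score, and its namespace (the result shape and the shared_boost parameter are illustration-only assumptions, not the actual Dakera result objects):

```python
# Sketch: deduplicate across namespaces and nudge shared memories up
# the ranking so cross-agent notes surface. Result shape is assumed.

def merge_context(private_results, shared_results, shared_ns, shared_boost=0.1):
    """Deduplicate by content and lightly boost shared-namespace memories."""
    seen = set()
    merged = []
    for result in private_results + shared_results:
        if result["content"] in seen:
            continue  # private copy wins when both namespaces hold it
        seen.add(result["content"])
        score = result["score"]
        if result.get("namespace") == shared_ns:
            score += shared_boost
        merged.append({**result, "score": score})
    return sorted(merged, key=lambda r: r["score"], reverse=True)
```

A small boost (rather than a separate section in the prompt) keeps the context window ordered by a single relevance signal.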

When to Use This Pattern

Namespace isolation fits systems where agents own distinct domains (triage, technical, billing) and only a thin slice of context needs to cross agent boundaries. It is the simplest pattern to operate and a sensible default when you are unsure.

Pattern 2: Session-Based Memory with Agent Scoping

When multiple agents collaborate on a single task in sequence (like a pipeline), sessions provide a natural boundary. Each session represents one unit of work, and agents contribute memories within that session:

# Create a session for this customer interaction
session = client.session.create(
    namespace="support-pipeline",
    metadata={"customer_id": "cust_42", "ticket_id": "TKT-1234"}
)

# Triage agent adds to the session
client.memory.add(
    namespace="support-pipeline",
    session_id=session.id,
    content="Initial diagnosis: billing discrepancy on invoice INV-5678",
    metadata={"agent": "triage", "step": 1}
)

# Technical agent picks up the session and adds its findings
client.memory.add(
    namespace="support-pipeline",
    session_id=session.id,
    content="Root cause: proration calculation error during plan upgrade on March 3",
    metadata={"agent": "technical", "step": 2}
)

# Billing agent resolves using full session context
session_memories = client.memory.search(
    namespace="support-pipeline",
    session_id=session.id,
    query="what happened with this billing issue",
    limit=10
)
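From there, the billing agent can reconstruct the pipeline's history in order. A small helper, assuming the {"agent", "step"} metadata convention shown above (session_timeline is a hypothetical name, not a Dakera API):

```python
# Render session memories in pipeline order, one line per contribution.
# Assumes each memory dict carries {"agent": ..., "step": ...} metadata.

def session_timeline(memories):
    """Sort session memories by pipeline step and format one line each."""
    ordered = sorted(memories, key=lambda m: m["metadata"].get("step", 0))
    return [f'[{m["metadata"]["agent"]}] {m["content"]}' for m in ordered]
```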

When to Use This Pattern

Session scoping fits pipeline-style workflows where agents hand off a single task in sequence and later agents need the full trail of earlier contributions, scoped to that one unit of work.

Pattern 3: Knowledge Graph as Shared Context

For complex multi-agent systems where relationships between entities matter more than individual memories, the knowledge graph becomes the shared layer. Each agent contributes entities and relationships, building a collective understanding:

# Research agent discovers a relationship
client.knowledge_graph.add_edge(
    namespace="company-intelligence",
    source={"type": "person", "name": "Alice Chen"},
    target={"type": "company", "name": "Acme Corp"},
    edge_type="works_at",
    metadata={"discovered_by": "research-agent", "confidence": 0.95}
)

# Sales agent adds deal context
client.knowledge_graph.add_edge(
    namespace="company-intelligence",
    source={"type": "company", "name": "Acme Corp"},
    target={"type": "deal", "name": "Enterprise Plan Q2"},
    edge_type="considering",
    metadata={"discovered_by": "sales-agent", "stage": "evaluation"}
)

# Any agent can traverse the graph
connections = client.knowledge_graph.traverse(
    namespace="company-intelligence",
    start={"type": "person", "name": "Alice Chen"},
    max_depth=2
)
# Returns: Alice -> works_at -> Acme Corp -> considering -> Enterprise Plan Q2
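Under the hood, a traversal like this is a bounded walk outward from the start node. As a toy model (not Dakera's implementation), assuming edges are (source, edge_type, target) triples, a breadth-first search with a hop limit looks like:

```python
from collections import deque

# Toy bounded graph traversal: BFS over (source, edge_type, target)
# triples, recording every path reachable within max_depth hops.

def traverse(edges, start, max_depth=2):
    adjacency = {}
    for source, edge_type, target in edges:
        adjacency.setdefault(source, []).append((edge_type, target))

    paths = []
    queue = deque([(start, (start,))])
    while queue:
        node, path = queue.popleft()
        if (len(path) - 1) // 2 >= max_depth:
            continue  # path alternates node/edge/node, so hops = (len-1)//2
        for edge_type, target in adjacency.get(node, []):
            if target in path[::2]:
                continue  # nodes sit at even positions; skip cycles
            new_path = path + (edge_type, target)
            paths.append(new_path)
            queue.append((target, new_path))
    return paths
```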

Dakera's knowledge graph ships with four built-in edge types that cover most agent collaboration scenarios (relates_to, works_at, part_of, and depends_on), and custom edge types can be defined for domain-specific relationships.

When to Use This Pattern

The knowledge graph fits systems where relationships between entities (people, companies, deals) matter more than individual memories, and where that collective understanding must outlive any single task or session.

Pattern 4: Event-Sourced Memory

In systems where agents need to react to each other's discoveries in real-time, an event-sourced pattern works well. Each agent publishes memories as events, and other agents subscribe to relevant namespaces:

# Monitoring agent detects an anomaly
client.memory.add(
    namespace="system-events",
    content="CPU usage on prod-server-3 exceeded 95% for 5 minutes",
    metadata={
        "event_type": "anomaly",
        "severity": "high",
        "source_agent": "monitor",
        "timestamp": "2026-05-16T14:32:00Z"
    }
)

# Diagnosis agent polls for new high-severity events
recent_events = client.memory.search(
    namespace="system-events",
    query="high severity anomaly",
    metadata_filter={"severity": "high"},
    limit=5,
    recency_weight=0.8  # Heavily weight recent events
)
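One way to think about recency_weight is as a blend between semantic relevance and an exponential recency term. The exact formula Dakera uses is not shown here; this is an illustrative sketch of the idea:

```python
# Illustrative recency-weighted ranking: blend relevance with an
# exponential recency term. Formula and half_life_seconds are
# assumptions for illustration, not Dakera's internal scoring.

def blended_score(relevance, age_seconds, recency_weight=0.8,
                  half_life_seconds=3600):
    """Blend semantic relevance with recency; higher weight favors newer events."""
    recency = 0.5 ** (age_seconds / half_life_seconds)
    return (1 - recency_weight) * relevance + recency_weight * recency
```

At recency_weight=0.8, a fresh but mediocre match can outrank a perfect match from several hours ago, which is the behavior a diagnosis agent polling for live anomalies wants.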

Isolation Guarantees

Regardless of which pattern you choose, multi-agent memory requires strong isolation guarantees. A bug in one agent shouldn't corrupt another's context. Dakera provides isolation at multiple levels:

Level       Scope                       Isolation
Company     Entire tenant               Separate data directory + encryption key
Namespace   Agent or domain             Separate HNSW index + BM25 index
Session     Single task/conversation    Filtered within namespace
Metadata    Custom scoping              Query-time filtering

Scaling Multi-Agent Memory

Concurrent Access

Dakera handles concurrent reads and writes from multiple agents without locking at the namespace level. Each namespace maintains its own write-ahead log, so agents writing to different namespaces never contend. Agents writing to the same namespace experience serialized writes but parallel reads — which matches the typical access pattern where many agents read shared context but fewer write to it.
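That access pattern can be modeled with a toy in-memory store: one lock per namespace, so same-namespace writes serialize while cross-namespace writes never contend (an illustration of the access pattern only, not Dakera's WAL implementation):

```python
import threading

# Toy model of per-namespace write serialization. One lock per
# namespace: writers to the same namespace queue behind each other,
# writers to different namespaces proceed in parallel.

class NamespacedStore:
    def __init__(self):
        self._meta = threading.Lock()  # guards lock/list creation only
        self._locks = {}
        self._memories = {}

    def _lock_for(self, namespace):
        with self._meta:
            if namespace not in self._locks:
                self._locks[namespace] = threading.Lock()
                self._memories[namespace] = []
            return self._locks[namespace]

    def add(self, namespace, content):
        with self._lock_for(namespace):  # serializes same-namespace writes
            self._memories[namespace].append(content)

    def read(self, namespace):
        # Reads return a snapshot without taking the namespace write lock
        return list(self._memories.get(namespace, []))
```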

Memory Pruning

In multi-agent systems, memory accumulates fast. If 5 agents each store 100 memories per hour, you have 12,000 new memories per day. Dakera's decay strategies automatically reduce the relevance of old memories, and you can configure per-namespace retention:

# Configure aggressive decay for ephemeral agent working memory
client.namespace.configure(
    namespace="agent-triage-scratch",
    decay_strategy="exponential",
    decay_half_life_hours=24,
    max_memories=10000
)

# Configure slow decay for long-term shared knowledge
client.namespace.configure(
    namespace="shared-customer-context",
    decay_strategy="logarithmic",
    decay_half_life_hours=720,  # 30 days
    max_memories=500000
)
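To build intuition for what a half-life setting implies: under exponential decay, a memory's relative weight is 0.5 ** (age / half_life). A quick sketch (illustrative formula, not Dakera's internal scoring):

```python
# Relative weight of a memory under exponential half-life decay
# (illustrative, not Dakera's internal scoring function).

def decay_weight(age_hours, half_life_hours):
    """Weight of a memory after age_hours, halving every half_life_hours."""
    return 0.5 ** (age_hours / half_life_hours)
```

With the settings above, a day-old scratch memory is already down to weight 0.5, while shared customer knowledge on a 720-hour half-life retains roughly 0.98 of its weight after the same day.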

Real-World Example: Multi-Agent Code Review

Here's a complete example of three agents collaborating on code review using Dakera's memory:

from dakera import Dakera

client = Dakera(base_url="http://localhost:3300")

class SecurityAgent:
    NS = "review-security"

    def review(self, pr_diff: str, session_id: str):
        # Check memory for known vulnerability patterns
        past_vulns = client.memory.search(
            namespace=self.NS,
            query=f"vulnerability patterns in {pr_diff[:200]}",
            limit=5
        )

        # Store findings for other agents
        client.memory.add(
            namespace="review-shared",
            session_id=session_id,
            content=f"Security review: no SQL injection risks, "
                    f"but found hardcoded timeout of 30s in retry logic",
            metadata={"agent": "security", "risk_level": "low"}
        )

class PerformanceAgent:
    NS = "review-performance"

    def review(self, pr_diff: str, session_id: str):
        # Reference shared findings from security agent
        security_notes = client.memory.search(
            namespace="review-shared",
            session_id=session_id,
            query="security findings",
            limit=3
        )

        # Add performance perspective
        client.memory.add(
            namespace="review-shared",
            session_id=session_id,
            content=f"Performance review: the hardcoded 30s timeout "
                    f"(flagged by security) will cause connection pool exhaustion "
                    f"under load. Recommend configurable timeout with 5s default.",
            metadata={"agent": "performance", "risk_level": "medium"}
        )

class SummaryAgent:
    def summarize(self, session_id: str):
        # Pull all findings from the review session
        all_findings = client.memory.search(
            namespace="review-shared",
            session_id=session_id,
            query="review findings and recommendations",
            limit=20
        )
        return all_findings

Anti-Patterns to Avoid

  1. One global namespace for everything: giving every agent access to every memory recreates the undifferentiated soup of context described at the start.
  2. Copying shared context into each agent's private namespace: the copies drift out of sync; write cross-agent context once to a shared layer instead.
  3. Unbounded retention: without decay strategies or max_memories limits, multi-agent write volume buries relevant context under stale memories.

Choosing the Right Pattern

Most production multi-agent systems combine patterns. A typical architecture uses:

  1. Namespace isolation for each agent's private working memory
  2. Sessions for pipeline-style collaboration within a single task
  3. Knowledge graph for long-lived entity relationships that span tasks
  4. Shared namespace with metadata filtering for cross-agent coordination

Start with the simplest pattern that meets your needs (usually Pattern 1), and add complexity only when you observe specific limitations. The memory architecture should reflect the communication topology of your agents — if two agents never need to share context, don't create infrastructure for it.

Try Dakera Today

Single binary, zero dependencies, 87.6% LoCoMo benchmark.

Get Started
