The problem
Most customer support bots treat every conversation as a fresh start. They can't remember a customer's previous issues, preferences, or resolution history. This leads to repetitive questions, inconsistent responses, and frustrated customers who have to re-explain their situation every time they reach out.
A customer who reported a shipping damage issue last week opens a new chat to follow up. The bot has no idea who they are, what happened, or that a refund was already approved. It asks for the order number again. It suggests troubleshooting steps that were already tried. The customer escalates to a human agent — which is exactly what the bot was supposed to prevent.
The root cause is architectural: most support bots are stateless. They process each message in isolation, with no mechanism to store, retrieve, or reason over past interactions. Adding a conversation log isn't enough — you need semantic recall that understands what's relevant, not just what's recent.
How Dakera solves it
Dakera gives your support bot a persistent memory layer that works across sessions, customers, and time.
- Store every conversation with session context and a customer-specific agent_id. Each customer interaction is stored as a memory with metadata (tags, importance scores, and session boundaries), so the bot can distinguish between different customers and different support threads.
- Recall past interactions semantically — not just keyword matching. Dakera's hybrid retrieval engine combines HNSW vector search with BM25 full-text matching and cross-encoder reranking. When the bot needs to recall "what happened with Jane's return request," it finds the right memory even if the exact words don't match.
- Importance decay ensures recent interactions rank higher while critical context persists. A routine greeting from three months ago fades naturally. An account escalation or a billing dispute keeps ranking well because it was stored with a high importance score. The memory stays sharp without manual curation.
- Knowledge graph links related support tickets and customer entities. Dakera's entity extraction connects memories through shared entities — a customer name, an order number, a product SKU. When the bot recalls context for a customer, it traverses the knowledge graph to surface related tickets, even if the semantic similarity between those tickets is low.
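Dakera's actual scoring formula is not described here. As a rough illustration of how importance and recency can interact, the sketch below ranks memories by relevance multiplied by a recency factor whose floor is the stored importance. Every name in it (ScoredMemory, decayed_score, rank, half_life_days) and the formula itself are assumptions made for illustration, not Dakera internals.

```python
from dataclasses import dataclass


@dataclass
class ScoredMemory:
    content: str
    similarity: float  # hypothetical relevance in [0, 1] from retrieval
    importance: float  # importance assigned at store time, in [0, 1]
    age_days: float    # days since the memory was stored


def decayed_score(memory: ScoredMemory, half_life_days: float = 30.0) -> float:
    """Blend relevance with a recency factor that never drops below importance."""
    # Exponential decay: the recency factor halves every half_life_days.
    recency = 0.5 ** (memory.age_days / half_life_days)
    # Important memories keep most of their weight as they age;
    # unimportant ones fade toward their (low) importance floor.
    weight = memory.importance + (1.0 - memory.importance) * recency
    return memory.similarity * weight


def rank(memories, half_life_days: float = 30.0):
    """Return memories sorted from highest to lowest decayed score."""
    return sorted(
        memories,
        key=lambda m: decayed_score(m, half_life_days),
        reverse=True,
    )


old_greeting = ScoredMemory("Customer said hello", 0.8, 0.1, 90.0)
dispute = ScoredMemory("Billing dispute escalated", 0.7, 0.9, 90.0)
ranked = rank([old_greeting, dispute])
```

With these made-up numbers, the three-month-old billing dispute outranks the equally old greeting even though the greeting has the higher raw similarity, which is the behavior the decay bullet describes.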
Implementation
Here's how a support bot stores and recalls customer interactions using the Dakera Python SDK:
from dakera import DakeraClient

client = DakeraClient(base_url="http://localhost:3300", api_key="dk-...")

# Store a support interaction
client.store(
    agent_id="support-bot",
    content="Customer asked about return policy for order #4521. Approved full refund due to shipping damage. Customer was satisfied with resolution.",
    importance=0.8,
    tags=["support", "returns", "customer:jane-doe"]
)

# Later: recall relevant context for the same customer
memories = client.recall(
    agent_id="support-bot",
    query="Jane Doe previous interactions and preferences",
    top_k=5
)
The recall call returns the most relevant memories, ranked by a combination of semantic similarity, keyword match, importance score, and recency. Your bot injects these into the system prompt before generating a response, which gives it the customer's relevant history without any manual lookup.
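One way the injection step might look is sketched below. The shape of the objects returned by client.recall is not shown in this document, so the helper takes plain strings and leaves extracting the text from each memory to the caller. The function name build_system_prompt, the "Known customer history" heading, and the max_chars budget are all hypothetical.

```python
def build_system_prompt(base_prompt: str, memory_texts: list[str],
                        max_chars: int = 2000) -> str:
    """Append recalled memories to the base prompt within a character budget.

    memory_texts is assumed to be ordered most relevant first, so the
    budget drops the least relevant memories when space runs out.
    """
    lines = []
    used = 0
    for text in memory_texts:
        entry = f"- {text.strip()}"
        if used + len(entry) > max_chars:
            break  # stop before exceeding the budget
        lines.append(entry)
        used += len(entry)

    if not lines:
        # Nothing recalled (or nothing fit): leave the prompt unchanged.
        return base_prompt
    return base_prompt + "\n\nKnown customer history:\n" + "\n".join(lines)


prompt = build_system_prompt(
    "You are a support assistant.",
    ["Refund approved for order #4521.", "  Customer prefers email.  "],
)
```

Capping the injected text keeps a long customer history from crowding out the rest of the prompt; the right budget depends on the model's context window.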
Session management
For multi-turn conversations, use sessions to group related messages:
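# Start a new support session and keep its ID
# (assumes session_start returns the session ID; check the SDK for the actual return value)
session_id = client.session_start(agent_id="support-bot")

# ... store memories within this session ...

# End the session when the conversation closes
client.session_end(session_id=session_id)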
Sessions let you recall memories from a specific conversation ("what did the customer say earlier in this chat?") or across all sessions ("what's the full history for this customer?").
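The difference between the two scopes can be pictured with a small in-memory model. This is only a conceptual sketch: it does not call the Dakera SDK, and MemoryRecord and filter_memories are hypothetical names. How the real SDK exposes session filtering is not shown in this document.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MemoryRecord:
    content: str
    session_id: str
    customer: str


def filter_memories(records, customer: str, session_id: Optional[str] = None):
    """Return a customer's memories, optionally restricted to one session."""
    return [
        r for r in records
        if r.customer == customer
        and (session_id is None or r.session_id == session_id)
    ]


records = [
    MemoryRecord("Reported shipping damage", "s1", "jane-doe"),
    MemoryRecord("Refund approved", "s1", "jane-doe"),
    MemoryRecord("Asked about refund status", "s2", "jane-doe"),
    MemoryRecord("Password reset", "s3", "john-roe"),
]

this_chat = filter_memories(records, "jane-doe", session_id="s2")
full_history = filter_memories(records, "jane-doe")
```

Here this_chat holds only the current conversation, while full_history spans both of the customer's sessions and excludes other customers.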
Built-in embeddings: Dakera ships with bge-large (1024-dim) embeddings. No external API calls to OpenAI or Cohere are needed, so your customer data never leaves your infrastructure.
Deploy persistent memory for your agents
Self-hosted, no external API dependencies, production-ready. Add memory to your support bot in under 10 minutes.