Dakera vs Letta (formerly MemGPT)
Letta (MemGPT) and Dakera approach agent memory from fundamentally different angles. Letta uses LLMs to actively manage memory — the model decides what to remember and forget. Dakera is a dedicated retrieval engine with deterministic memory operations. One uses AI to manage memory; the other is infrastructure that AI agents call into.
Feature Comparison
| Feature | Dakera | Letta (MemGPT) |
|---|---|---|
| Category | Memory Retrieval Engine | Agent Framework with Memory |
| Memory Management | Deterministic (algorithmic decay, scoring) | LLM-powered (model decides what to store/forget) |
| Language | Rust (single binary) | Python |
| Retrieval | Hybrid HNSW + BM25, RRF, cross-encoder | LLM-directed search over archival memory |
| Memory Tiers | Flat store with decay + importance scoring | Core memory (system prompt) + archival (vector) + recall (conversation) |
| Context Management | Not applicable (stores/retrieves memories) | Virtual context management (OS-like paging) |
| Agent Framework | No (memory infrastructure only) | Yes (full agent with tools, personas, memory) |
| LLM Dependency | None for core ops (embeddings are local ONNX) | Requires LLM for all memory operations |
| Knowledge Graph | Entity extraction (GLiNER) with BFS traversal | Not built-in |
| Memory Decay | 6 configurable strategies | LLM-decided (non-deterministic) |
| MCP Tools | 83 tools | Not available (own tool system) |
| SDKs | Python, TypeScript, Go, Rust | Python |
| Cost per Query | ~0 (local inference only) | LLM API cost per memory operation |
| License | MIT SDKs, proprietary server | Apache 2.0 |
Architecture Differences
Dakera
A memory storage and retrieval engine. Your agent stores memories via API and retrieves them via hybrid search with reranking. Dakera does not make decisions about what to remember — your agent does. Memory decay is algorithmic and deterministic: configurable strategies (time-based, access-count, importance scoring) manage the memory lifecycle predictably. The engine runs entirely on local inference (ONNX) with no LLM calls.
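The deterministic decay idea can be sketched as a pure scoring function. This is illustrative only — the formula, parameter names, and weights below are assumptions, not Dakera's actual implementation:

```python
import math
from dataclasses import dataclass

@dataclass
class Memory:
    importance: float   # 0.0-1.0, set by the storing agent
    access_count: int   # how often the memory was retrieved
    age_seconds: float  # time since last access

def retention_score(m: Memory, half_life: float = 86_400.0) -> float:
    """Deterministic score: exponential time decay, boosted by
    importance and (log-damped) access frequency. Same inputs
    always yield the same score -- no LLM in the loop."""
    time_decay = 0.5 ** (m.age_seconds / half_life)
    access_boost = 1.0 + math.log1p(m.access_count)
    return m.importance * time_decay * access_boost

# A fresh, important memory outranks a stale, unimportant one.
fresh = Memory(importance=0.9, access_count=3, age_seconds=3_600)
stale = Memory(importance=0.2, access_count=0, age_seconds=604_800)
assert retention_score(fresh) > retention_score(stale)
```

Because the score is a pure function of the memory's metadata, two runs over the same store always rank and expire memories identically — the property the Determinism row below refers to.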
Letta (MemGPT)
An agent framework that treats memory like an operating system. Inspired by virtual memory in OS design, Letta uses an LLM to actively manage what goes into "core memory" (the system prompt), "archival memory" (long-term vector storage), and "recall memory" (recent conversation history). The LLM decides when to page information in and out of context. This creates a self-managing memory system, but every memory operation costs an LLM API call and is inherently non-deterministic.
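The three-tier layout can be sketched in a few lines. This is a conceptual model only, not Letta's code: in real Letta the LLM itself decides when to page facts between tiers, whereas here the paging trigger is a mechanical budget so the structure is visible:

```python
from collections import deque

class TieredMemory:
    """Conceptual sketch of MemGPT-style memory tiers (not Letta's
    actual implementation). Core memory lives in the system prompt,
    archival memory is the long-term store, recall memory is the
    recent conversation window."""

    def __init__(self, core_budget: int = 2, recall_window: int = 4):
        self.core_budget = core_budget
        self.core: list[str] = []                  # in the system prompt
        self.archival: list[str] = []              # long-term (vector) store
        self.recall = deque(maxlen=recall_window)  # recent conversation

    def remember(self, fact: str) -> None:
        # Page the oldest core fact out to archival when over budget.
        self.core.append(fact)
        while len(self.core) > self.core_budget:
            self.archival.append(self.core.pop(0))

    def observe(self, message: str) -> None:
        self.recall.append(message)  # old turns fall off automatically

    def search_archival(self, term: str) -> list[str]:
        # Letta would issue a vector search; substring match stands in here.
        return [f for f in self.archival if term in f]
```

In Letta, the "page out" and "search archival" steps are each an LLM-issued tool call — which is exactly why every memory operation carries an API cost and a round-trip.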
Deployment Model
| Aspect | Dakera | Letta |
|---|---|---|
| Setup | Docker pull + run (single binary) | pip install + LLM API key |
| Runtime Dependencies | None (self-contained ONNX) | LLM API (OpenAI, Anthropic, etc.) |
| Latency | ~5-50ms per query (local inference) | ~500-2000ms per memory op (LLM round-trip) |
| Cost Model | Fixed (your infra only) | Variable (LLM tokens per operation) |
| Determinism | Deterministic (same query = same results) | Non-deterministic (LLM may vary) |
| Scale | Handles millions of memories per namespace | Limited by LLM context and API throughput |
Pricing Comparison
| Aspect | Dakera | Letta |
|---|---|---|
| Software | Free (self-hosted) | Free (Apache 2.0) |
| Per Memory Operation | ~$0 (local ONNX inference) | ~$0.001-0.01 (LLM API call per operation) |
| 1M Memory Ops/month | ~$10-30 (VPS cost only) | ~$1,000-10,000 (LLM API costs) |
| Cloud/Enterprise | Coming soon | Letta Cloud (managed platform) |
The cost difference is significant at scale. Every memory operation in Letta requires an LLM inference call, while Dakera's operations use only local ONNX models (embedding + reranking). For high-volume agent memory workloads, the gap spans orders of magnitude.
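The table's monthly figures follow from straightforward arithmetic on the per-operation estimates above:

```python
ops_per_month = 1_000_000

# Letta: every memory operation is an LLM call (~$0.001-0.01 each).
letta_low = ops_per_month * 0.001
letta_high = ops_per_month * 0.01

# Dakera: per-op cost is ~0; you pay only for the host (~$10-30 VPS).
dakera_low, dakera_high = 10, 30

print(f"Letta:  ${letta_low:,.0f} - ${letta_high:,.0f}")
print(f"Dakera: ${dakera_low} - ${dakera_high} (infra only)")

# Even comparing Letta's cheapest case to Dakera's priciest,
# the gap is more than 30x.
assert letta_low / dakera_high > 30
```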
When to Choose
Choose Letta if:
- You want a complete agent framework (not just memory infrastructure)
- LLM-powered memory management (the model decides what matters) fits your design
- You like the "virtual memory / OS" metaphor for context management
- Your agent has low memory operation volume (cost stays manageable)
- You want an open-source (Apache 2.0) agent framework with memory built-in
- Non-deterministic memory behavior is acceptable for your use case
- You are building a Python-only stack
Choose Dakera if:
- You need deterministic, predictable memory behavior (same query = same results)
- Cost matters at scale — you cannot afford LLM calls for every memory operation
- Low latency is critical (5-50ms vs 500-2000ms per operation)
- You already have an agent framework and need memory infrastructure to plug into it
- You need hybrid retrieval with BM25 + vector + cross-encoder reranking
- Knowledge graphs, memory decay strategies, and session management are requirements
- You need SDKs beyond Python (TypeScript, Go, Rust)
- MCP integration for IDE-based workflows is important
Verdict
Letta and Dakera are more complementary than competing. Letta is an agent framework where memory is managed by the LLM itself — creative and powerful, but expensive and non-deterministic at scale. Dakera is memory infrastructure that any agent framework can call into — deterministic, fast, and cost-effective. If you want the LLM to manage its own memory (and can afford the API costs), Letta's approach is innovative. If you want reliable, fast memory retrieval as infrastructure for your existing agent stack, Dakera is the better foundation. Many teams use Dakera as the archival memory backend behind agent frameworks.
Try Dakera Free
Deterministic memory retrieval at 5-50ms latency. No LLM API costs for memory operations.
Get Started