Architecture
Dakera compiles to a single binary with no runtime dependencies. It exposes a REST API (port 3300) and gRPC API (port 50051), both backed by the same engine. The system includes 4-tier caching, on-device inference for embeddings and reranking, background AutoPilot for memory lifecycle management, and a full distributed mode with gossip-based membership, leader election, sharding, and automatic rebalancing.
Memory lifecycle
Every memory in Dakera follows a defined lifecycle from creation through potential archival or deletion. Understanding this flow is key to tuning retention, decay, and storage costs.
| Stage | Location | What happens |
|---|---|---|
| Store | L1 + L2 | Memory created with initial importance. Embedded, indexed, entities extracted. |
| Active | L1 | Frequently recalled. Each access boosts importance and resets decay timer. |
| Decaying | L2 | No recent access. Importance decreases per decay strategy (exponential, linear, or step). |
| Archived | L3 (S3) | Below warm threshold. Moved to cold storage. Still retrievable but with higher latency. |
| Forgotten | — | Importance below minimum threshold, TTL expired, or explicitly deleted via API. |
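The Decaying stage applies one of three strategies. A minimal sketch of the exponential one, assuming a half-life parameter — the formula and the 168-hour default are illustrative, not Dakera's exact implementation:

```python
import math

def decayed_importance(importance: float, hours_since_access: float,
                       half_life_hours: float = 168.0) -> float:
    """Exponential decay: importance halves every half_life_hours
    without an access. Illustrative formula, not Dakera's exact one."""
    return importance * 0.5 ** (hours_since_access / half_life_hours)

# A memory stored at importance 0.8, untouched for one half-life:
print(decayed_importance(0.8, 168.0))  # 0.4
```

Each recall resets `hours_since_access` to zero (the "resets decay timer" behavior in the Active stage), so frequently used memories never reach the warm threshold.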
4-tier caching
Dakera implements a four-level cache hierarchy that balances latency and durability. Each tier handles a different access pattern:
| Tier | Backend | Latency | Purpose |
|---|---|---|---|
| L1 | In-memory LRU | <1ms | Hot cache for frequently accessed memories. Bounded by configurable max entries. |
| L1.5 | Redis | ~1ms | Distributed cache in multi-node deployments. Shared across the cluster. |
| L2 | RocksDB | ~5ms | Persistent disk storage with compression. Primary durable store. |
| L3 | S3 / MinIO | ~50ms | Cold/archival tier. Stores decayed or infrequently accessed memories. |
Reads check L1 → L1.5 → L2 → L3, promoting on hit. Writes go to L1 + L2 synchronously, with async replication to L1.5 and L3.
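The read path above can be modeled as a lookup cascade with promote-on-hit. A toy sketch — real tiers are an LRU map, Redis, RocksDB, and S3; plain dicts stand in here, and async replication to L1.5/L3 is omitted:

```python
class TieredCache:
    """Toy model of the L1 -> L1.5 -> L2 -> L3 read path with
    promote-on-hit, and the synchronous L1 + L2 write path."""

    def __init__(self):
        self.order = ["L1", "L1.5", "L2", "L3"]
        self.tiers = {name: {} for name in self.order}

    def get(self, key):
        for i, name in enumerate(self.order):
            if key in self.tiers[name]:
                value = self.tiers[name][key]
                for upper in self.order[:i]:  # promote into faster tiers
                    self.tiers[upper][key] = value
                return value, name
        return None, None

    def put(self, key, value):
        # Synchronous write path: L1 + L2 only
        self.tiers["L1"][key] = value
        self.tiers["L2"][key] = value

cache = TieredCache()
cache.tiers["L3"]["m1"] = "archived memory"   # simulate a cold entry
print(cache.get("m1"))     # ('archived memory', 'L3') — promoted on hit
print(cache.get("m1")[1])  # 'L1' — second read is hot
```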
Vector indexes
Dakera supports four vector index types, selectable per namespace:
| Index | Strengths | Best for |
|---|---|---|
| HNSW | Sub-10ms at millions of vectors, tunable recall/speed | General-purpose (default) |
| IVF | Lower memory, good for high-dimensional data | Large datasets with limited RAM |
| SPFresh | Write-optimized, maintains recall under heavy inserts | High-throughput streaming workloads |
| Flat | Exact nearest-neighbor, no approximation | Small namespaces (<10K vectors), ground-truth evaluation |
All indexes use SIMD-accelerated distance functions (cosine, L2, inner product). Index configuration is set per namespace via the REST API or CLI.
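For reference, the three distance functions named above, written as plain Python (the engine computes these with SIMD; this is just the math, not Dakera's code):

```python
import math

def cosine_distance(a, b):
    """1 - cos(a, b): 0 for identical directions, 1 for orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (math.hypot(*a) * math.hypot(*b))

def l2_distance(a, b):
    """Euclidean (L2) distance."""
    return math.dist(a, b)

def inner_product(a, b):
    """Inner product similarity (higher = closer for normalized vectors)."""
    return sum(x * y for x, y in zip(a, b))

print(cosine_distance([1.0, 0.0], [1.0, 0.0]))  # 0.0
print(l2_distance([0.0, 0.0], [3.0, 4.0]))      # 5.0
```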
Knowledge graph
Dakera maintains a persistent entity graph alongside vector storage. Entities are extracted automatically (or via API) and linked to memories. The graph supports four edge types:
| Edge type | Meaning |
|---|---|
| RelatedTo | Semantic similarity between memories |
| SharesEntity | Two memories mention the same named entity |
| Precedes | Temporal ordering — memory A happened before B |
| LinkedBy | Explicit user-created link via API |
Use the knowledge graph for multi-hop reasoning, entity-centric retrieval, and cross-agent network visualization. Query via dk knowledge CLI or the KG API endpoints.
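Multi-hop reasoning over the graph amounts to a bounded traversal, optionally restricted to certain edge types. A sketch with a hypothetical in-memory adjacency list (memory IDs invented; the real graph lives server-side behind the KG API):

```python
from collections import deque

# Hypothetical adjacency list: node -> [(edge_type, neighbor), ...]
graph = {
    "m1": [("SharesEntity", "m2"), ("Precedes", "m3")],
    "m2": [("RelatedTo", "m4")],
    "m3": [],
    "m4": [],
}

def multi_hop(start, max_hops=2, edge_types=None):
    """BFS up to max_hops, optionally following only some edge types."""
    seen, frontier, out = {start}, deque([(start, 0)]), []
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue
        for etype, nbr in graph.get(node, []):
            if nbr in seen or (edge_types and etype not in edge_types):
                continue
            seen.add(nbr)
            out.append(nbr)
            frontier.append((nbr, depth + 1))
    return out

print(multi_hop("m1"))                               # ['m2', 'm3', 'm4']
print(multi_hop("m1", edge_types={"SharesEntity"}))  # ['m2']
```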
Event bus & SSE
Dakera includes a built-in event bus that streams real-time notifications over Server-Sent Events (SSE). Subscribe to memory lifecycle events — store, recall, forget, decay — filtered by namespace and agent. Useful for building dashboards, audit UIs, and agent coordination pipelines.
```bash
# Subscribe to events for a specific agent
curl -N -H "Authorization: Bearer $DAKERA_ROOT_API_KEY" \
  "http://localhost:3300/v1/events/stream?agent_id=my-agent"
```
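On the client side, the stream follows the standard SSE wire format: events separated by blank lines, payloads on `data:` lines. A minimal parser sketch — the JSON payload shape (`type`, `agent_id`) is an assumption, not the documented schema:

```python
import json

def parse_sse(stream_text):
    """Split an SSE stream into JSON payloads. Per the SSE format,
    events are separated by blank lines; data lines start with 'data:'."""
    events = []
    for block in stream_text.split("\n\n"):
        for line in block.splitlines():
            if line.startswith("data:"):
                events.append(json.loads(line[len("data:"):].strip()))
    return events

raw = (
    'data: {"type":"store","agent_id":"my-agent"}\n\n'
    'data: {"type":"recall","agent_id":"my-agent"}\n\n'
)
for evt in parse_sse(raw):
    print(evt["type"])  # store, then recall
```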
AutoPilot
AutoPilot runs as a background task that automatically manages memory lifecycle:
- Deduplication — detects near-duplicate memories (cosine similarity ≥0.93) and merges them, preserving the highest-importance version.
- Consolidation — clusters related low-importance memories using DBSCAN and produces summary memories, reducing noise while preserving knowledge.
- Decay enforcement — applies the configured decay strategy to age out stale memories according to their half-life and access patterns.
Configure via DAKERA_AUTOPILOT_DEDUP_INTERVAL_HOURS (default: 1h) and DAKERA_AUTOPILOT_DEDUP_THRESHOLD (default: 0.93). Trigger manually via the /admin/autopilot/trigger endpoint.
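The dedup pass can be sketched as a greedy merge: walk memories in descending importance and drop any whose embedding is within the similarity threshold of one already kept. The memory dicts and field names here are hypothetical; only the ≥0.93 cosine threshold and keep-highest-importance rule come from the docs:

```python
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb)

def dedup(memories, threshold=0.93):
    """Greedy near-duplicate merge: keep the higher-importance memory
    of any pair whose cosine similarity meets the threshold."""
    kept = []
    for mem in sorted(memories, key=lambda m: -m["importance"]):
        if all(cosine(mem["vec"], k["vec"]) < threshold for k in kept):
            kept.append(mem)
    return kept

mems = [
    {"id": "a", "importance": 0.9, "vec": [1.0, 0.0]},
    {"id": "b", "importance": 0.4, "vec": [0.999, 0.01]},  # near-dup of a
    {"id": "c", "importance": 0.5, "vec": [0.0, 1.0]},
]
print([m["id"] for m in dedup(mems)])  # ['a', 'c']
```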
Retrieval pipeline
Every recall request flows through an 8-step pipeline, ending with truncation to top_k and returning the results.
gRPC API
Dakera exposes a gRPC API on port 50051 alongside the REST API. Both APIs access the same underlying engine — choose based on your use case:
| Feature | REST API | gRPC API |
|---|---|---|
| Best for | Web clients, quick integration, debugging | High-throughput services, microservices, streaming |
| Latency | ~2-5ms overhead (HTTP/JSON) | ~0.5-1ms overhead (HTTP/2, protobuf) |
| Streaming | SSE (server-sent events) | Bidirectional streaming |
| Type safety | OpenAPI spec available | Strongly typed via proto definitions |
| Browser support | Native | Requires grpc-web proxy |
```bash
# Enable gRPC (enabled by default)
DAKERA_GRPC_ENABLED=true
DAKERA_GRPC_PORT=50051
# For mTLS on the gRPC port, configure at the reverse proxy layer
```
Backup & WAL
Dakera uses RocksDB's write-ahead log (WAL) for crash recovery. Every write is durably logged before acknowledgment, ensuring no data loss on unexpected shutdown. On restart, the WAL is replayed automatically to restore the last consistent state.
For point-in-time backups, use the admin API (see Deployment → Backup & Restore). Backups include all memories, indexes, namespaces, and configuration — everything needed for a full restore.
Import & export
Bulk data can be moved in and out of Dakera via the memory import/export endpoints. Both use streaming JSON for efficient handling of large datasets.
```bash
# Export all memories from a namespace (quote the URL so the shell
# doesn't mangle the query string)
curl "http://localhost:3300/admin/memories/export?namespace=my-ns" \
  -H "Authorization: Bearer $DAKERA_ROOT_API_KEY" \
  -o memories.jsonl

# Import memories from JSONL file
curl -X POST http://localhost:3300/admin/memories/import \
  -H "Authorization: Bearer $DAKERA_ROOT_API_KEY" \
  -H "Content-Type: application/x-ndjson" \
  --data-binary @memories.jsonl
```
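The streaming format is plain NDJSON: one JSON object per line, which lets both sides process records without buffering the whole dataset. A round-trip sketch — the record fields are hypothetical, not the documented export schema:

```python
import io
import json

# Hypothetical records in the export's one-object-per-line shape
records = [
    {"id": "m1", "content": "ship v2 on Friday", "namespace": "my-ns"},
    {"id": "m2", "content": "blocker: auth bug", "namespace": "my-ns"},
]

# Write as NDJSON — the Content-Type the import endpoint expects
buf = io.StringIO()
for rec in records:
    buf.write(json.dumps(rec) + "\n")

# Stream it back one line at a time, never loading the whole payload
buf.seek(0)
restored = [json.loads(line) for line in buf if line.strip()]
print(restored == records)  # True
```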
Compaction
RocksDB performs automatic compaction in the background to merge sorted runs, reclaim deleted space, and maintain read performance. Dakera also exposes a manual compaction endpoint for maintenance windows:
```bash
# Trigger manual compaction on a namespace
curl -X POST http://localhost:3300/admin/namespaces/my-ns/optimize \
  -H "Authorization: Bearer $DAKERA_ROOT_API_KEY"
# {"status":"completed","duration_ms":1234,"freed_bytes":52428800}
```
Distributed architecture
In cluster mode, Dakera distributes data across nodes using consistent hashing. Each memory is assigned to a shard based on its namespace and ID, and shards are mapped to nodes on a virtual ring.
| Component | Mechanism | Purpose |
|---|---|---|
| Membership | SWIM gossip protocol | Nodes discover and monitor each other's health. Failure detection via probe → suspect → dead lifecycle. |
| Leader election | Lease-based with fencing tokens | One leader coordinates shard assignments and rebalancing. Monotonic tokens prevent stale leaders from acting. |
| Sharding | Consistent hashing (virtual nodes) | Data distributed evenly. Adding/removing nodes only migrates ~1/N of data. |
| Replication | Configurable replication factor | Shard replicas on multiple nodes for durability. Eventual consistency with gossip-driven convergence. |
| Rebalancing | Automatic on membership change | Leader detects node join/leave and redistributes shards. Zero-downtime migration. |
See High Availability for cluster setup, failure modes, and operational procedures.
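The "~1/N of data migrates" property comes directly from the consistent-hash ring: a new node's virtual nodes claim only the keys that now hash to them. A self-contained sketch (vnode count and key format are illustrative, not Dakera's):

```python
import bisect
import hashlib

def h(s: str) -> int:
    return int(hashlib.sha256(s.encode()).hexdigest(), 16)

class Ring:
    """Consistent-hash ring with virtual nodes: each physical node owns
    many points on the ring; a key belongs to the next point clockwise."""

    def __init__(self, nodes, vnodes=64):
        self.points = sorted((h(f"{n}#{i}"), n)
                             for n in nodes for i in range(vnodes))

    def owner(self, key: str) -> str:
        hashes = [p for p, _ in self.points]
        i = bisect.bisect(hashes, h(key)) % len(self.points)
        return self.points[i][1]

before = Ring(["node-a", "node-b", "node-c"])
after = Ring(["node-a", "node-b", "node-c", "node-d"])
keys = [f"my-ns:mem-{i}" for i in range(1000)]
moved = sum(before.owner(k) != after.owner(k) for k in keys)
print(f"{moved / 1000:.0%} of keys moved")  # expect roughly 1/4
```

Every key that moved is now owned by node-d; the other three nodes keep their remaining data untouched, which is what makes rebalancing cheap.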
Encryption at rest
All memory content can be encrypted with AES-256-GCM authenticated encryption. Set DAKERA_ENCRYPTION_KEY to enable. Passphrases are derived via PBKDF2-HMAC-SHA256 with 100,000 iterations. Key rotation re-encrypts all memories atomically via /admin/encryption/rotate.
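The key-derivation step can be reproduced with the standard library. The iteration count and PRF match the docs; the salt handling and key length (32 bytes for AES-256) are illustrative:

```python
import hashlib
import os

def derive_key(passphrase: str, salt: bytes) -> bytes:
    """PBKDF2-HMAC-SHA256, 100,000 iterations, 32-byte key (AES-256).
    Sketch of the derivation step only — salt storage is not shown."""
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode(), salt,
                               100_000, dklen=32)

salt = os.urandom(16)
key = derive_key("correct horse battery staple", salt)
print(len(key))  # 32
```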
Filter expressions
Metadata filters can be applied to any recall, vector query, or batch operation.
```json
{
  "filter": {
    "$and": [
      { "importance": { "$gt": 0.7 } },
      { "tags": { "$in": ["decision", "blocker"] } }
    ]
  }
}
```
Supported operators: $eq, $ne, $gt, $lt, $gte, $lte, $in, $nin, $and, $or, $not, $exists, $regex, $contains, $icontains, $startsWith, $endsWith, $arrayContains, $arrayContainsAll, $arrayContainsAny.
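To make the grammar concrete, here is a toy evaluator for a small subset of the operators ($and, $or, $eq, $gt, $lt, $in) against a metadata dict. This is an illustration of the semantics, not the server's filter engine, and it omits most of the operator list:

```python
def matches(doc, flt):
    """Evaluate a filter expression against a metadata dict (subset only)."""
    for key, cond in flt.items():
        if key == "$and":
            if not all(matches(doc, c) for c in cond):
                return False
        elif key == "$or":
            if not any(matches(doc, c) for c in cond):
                return False
        elif isinstance(cond, dict):
            val = doc.get(key)
            for op, arg in cond.items():
                ok = ((op == "$eq" and val == arg)
                      or (op == "$gt" and val is not None and val > arg)
                      or (op == "$lt" and val is not None and val < arg)
                      or (op == "$in" and val in arg))
                if not ok:
                    return False
        elif doc.get(key) != cond:  # bare value means equality
            return False
    return True

doc = {"importance": 0.8, "tags": "decision"}
flt = {"$and": [{"importance": {"$gt": 0.7}},
                {"tags": {"$in": ["decision", "blocker"]}}]}
print(matches(doc, flt))  # True
```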