Self-Hosted AI Memory Server
A private AI memory engine that runs entirely on your infrastructure. Single Rust binary, on-device embeddings, AES-256-GCM encryption. Your data never leaves your machines.
Why Self-Host Your AI Memory?
- Data sovereignty — memories containing sensitive user data, business logic, and proprietary context stay on your servers
- No external API dependencies — ONNX inference runs locally; no calls to OpenAI, Cohere, or any embedding service
- Regulatory compliance — GDPR, HIPAA, SOC 2 compatible by design when you control the infrastructure
- Predictable costs — no per-query fees, no token charges, no surprise bills
How It Works
Deploy the Binary
Choose Docker, Kubernetes (Helm), or systemd. A single binary runs the full memory engine with REST (port 3300) and gRPC (port 50051).
```bash
# Docker
docker run -d -p 3300:3300 -p 50051:50051 \
  -v dakera_data:/data \
  ghcr.io/dakera-ai/dakera:latest

# Or download the binary
curl -fsSL https://get.dakera.ai | sh
dakera serve
```
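Once the container or binary is running, you can probe the REST port before wiring up any agents. A minimal sketch using Python's standard library; the /health route is an assumption for illustration, so substitute the actual endpoint from the API reference:

```python
import urllib.request

# Probe the REST port (3300). The /health path is assumed, not documented here.
with urllib.request.urlopen("http://localhost:3300/health", timeout=5) as resp:
    print(resp.status)  # 200 means the memory engine is reachable
```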
Configure Encryption & Access
All data is encrypted with AES-256-GCM at rest. Create scoped API keys for each agent or service.
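As an illustration of per-agent scoping, each service can hold its own key limited to the namespace it needs. A sketch using the Python SDK introduced in the next step; the key values are placeholders, and the exact scoping semantics applied at key issuance are assumptions:

```python
from dakera import Dakera

# Hypothetical keys, each issued with access to a single namespace.
billing = Dakera(base_url="http://localhost:3300", api_key="dk-billing-...")
support = Dakera(base_url="http://localhost:3300", api_key="dk-support-...")

# Each client operates only within the namespace its key authorizes.
billing.memory.store(content="Customer is on net-30 invoicing", namespace="billing")
support.memory.recall(query="refund policy", namespace="support", top_k=3)
```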
Connect Your Agents
Use the Python, TypeScript, Go, or Rust SDK — or connect via 83 MCP tools for Claude Desktop, Cursor, and Windsurf.
```python
from dakera import Dakera

client = Dakera(base_url="http://localhost:3300", api_key="dk-...")

# Store a memory
client.memory.store(
    content="User requires HIPAA-compliant data handling",
    namespace="compliance",
    metadata={"importance": 1.0},
)

# Recall with hybrid retrieval
results = client.memory.recall(
    query="What compliance requirements exist?",
    namespace="compliance",
    top_k=5,
)
```
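The shape of the recall results isn't pinned down above; a minimal sketch assuming each hit exposes the stored content and a relevance score (hypothetical field names, so the SDK's actual schema may differ):

```python
# Hypothetical result fields; check the SDK reference for the real schema.
for hit in results:
    print(hit.score, hit.content)
```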
Deployment Options
| Method | Command | Best For |
|---|---|---|
| Docker | docker run ghcr.io/dakera-ai/dakera:latest | Development, single-node production |
| Kubernetes | helm install dakera dakera/dakera | Scalable production clusters |
| systemd | dakera serve | Bare-metal, VPS, edge deployment |
Security Architecture
| Layer | Implementation |
|---|---|
| Encryption at Rest | AES-256-GCM with user-controlled keys |
| Access Control | Scoped API keys per namespace/agent |
| Rate Limiting | Configurable per-key request limits |
| Network | Bind to localhost by default; TLS termination at your reverse proxy |
| Isolation | Namespace-level data isolation, multi-agent separation |
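Namespace isolation means a recall scoped to one namespace never sees another namespace's memories. A sketch using only the SDK calls shown earlier, assuming recall returns an empty sequence when nothing matches:

```python
# Memories written under one namespace are invisible to recalls in another.
client.memory.store(content="Agent A's private planning notes", namespace="agent-a")

hits = client.memory.recall(query="planning notes", namespace="agent-b", top_k=5)
assert len(hits) == 0  # cross-namespace reads return nothing (assumed behavior)
```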
Frequently Asked Questions
What is self-hosted AI memory?
Self-hosted AI memory means your agent memory engine runs entirely on your own infrastructure. No data is sent to third-party clouds. Dakera runs as a single Rust binary with on-device embedding inference via ONNX, so even vector generation happens locally.
How is memory data secured?
Dakera encrypts all memory data at rest using AES-256-GCM. API access is controlled via scoped API keys with rate limiting. You maintain full control of your encryption keys.
How can Dakera be deployed?
Dakera can be deployed via Docker (ghcr.io/dakera-ai/dakera:latest), Kubernetes with Helm charts, or as a systemd service using the standalone binary. It exposes REST on port 3300 and gRPC on port 50051.
Does Dakera depend on external embedding APIs?
No. Dakera uses on-device ONNX inference with models like MiniLM, BGE, and E5. All embedding generation happens locally without external API calls.
Deploy Private AI Memory in Minutes
Single binary, zero cloud dependencies, full data sovereignty.
Get Started