Dakera is a self-hosted AI agent memory engine — a single Rust binary with hybrid vector + full-text retrieval, on-device embeddings, knowledge graphs, memory decay, and 14 core MCP tools (86+ available via profiles). It scores 88.2% on the LoCoMo benchmark and requires zero external dependencies.

What languages and SDKs does Dakera support?

Native SDKs for Python, TypeScript, Go, and Rust. Plus REST API (JSON) and gRPC (Protobuf) for any language. MCP protocol for AI tool integration.

v0.11.102 · 88.2% LoCoMo · 1,540 questions · standard eval, no LLM post-processing

ذاكرة · Dhākira · Arabic for memory

Self-hosted memory
for AI agents

Q: Is Dakera a vector database?

No. Dakera is an AI agent memory platform with persistent, session-aware, cross-agent memory and intelligent importance decay. The retrieval engine (HNSW, IVF, BM25, hybrid search) is how memories are recalled — the product is agents that remember, not a database you query.

Q: Do I need an OpenAI API key for embeddings?

No. Text is embedded automatically using built-in models (MiniLM, BGE, E5) powered by ONNX Runtime. No external calls, no additional cost.

Q: Is Dakera self-hostable?

Yes. Single binary or Docker image with zero external runtime dependencies. Your data never leaves your infrastructure. Pull the image, set DAKERA_API_KEY, and you're live in under a minute.

Q: How does Dakera handle scaling?

A single instance handles millions of vectors. For horizontal scaling, Dakera supports distributed clustering with Raft consensus, consistent-hash sharding, and automatic rebalancing.

Q: What is Dakera's pricing model?

The self-hosted binary has no usage fees. Dakera Cloud (managed hosting, SLA, team monitoring) is coming — join the waitlist to lock in founder pricing.

Q: Is Dakera open source or open core?

Open core. The SDKs (Python, TypeScript, Go, Rust), the CLI, and the MCP server are MIT-licensed on GitHub. The memory engine (Rust server binary) is proprietary. Self-host with full data control, no phone-home, no usage fees.

Persistent, searchable, cross-agent memory that runs entirely on your infrastructure. Vector + BM25 hybrid retrieval, knowledge graphs, built-in embeddings — no external API calls, no cloud dependency. Pull the binary, set one env var, done.

Get Started Free → Try Playground

88.2% LoCoMo accuracy · Sub-10ms queries · 14 core MCP tools · One binary, zero deps

Self-hosted is free & live — no email needed. This is for Dakera Cloud: managed hosting + SLA. Lock in founder pricing before launch. No spam.


REST API

# Store agent memory
POST /v1/memory/store
{
  "agent_id": "assistant-1",
  "content": "User prefers TypeScript and dark mode",
  "importance": 0.9
}

# Recall by meaning — semantic search
POST /v1/memory/recall
{
  "agent_id": "assistant-1",
  "query": "user preferences",
  "top_k": 5
}

// → { "score": 0.97, "content": "User prefers TypeScript..." }

from dakera import DakeraClient

client = DakeraClient("dk-your-key")

# Store agent memory
client.store_memory(
    agent_id="assistant-1",
    content="User prefers TypeScript and dark mode",
    importance=0.9
)

# Recall by semantic meaning
results = client.recall(
    agent_id="assistant-1",
    query="user preferences",
    top_k=5
)
# → Memory(score=0.97, content="User prefers TypeScript...")

import { DakeraClient } from '@dakera-ai/dakera';

const client = new DakeraClient('dk-your-key');

// Store agent memory
await client.storeMemory({
  agentId: 'assistant-1',
  content: 'User prefers TypeScript and dark mode',
  importance: 0.9,
});

// Recall by semantic meaning
const results = await client.recall({
  agentId: 'assistant-1',
  query: 'user preferences',
  topK: 5,
});
// → [{ score: 0.97, content: 'User prefers TypeScript...' }]

use dakera_rs::{Client, StoreRequest, RecallRequest};

let client = Client::new("dk-your-key");

// Store agent memory
client.store_memory(StoreRequest {
    agent_id: "assistant-1".into(),
    content: "User prefers TypeScript and dark mode".into(),
    importance: 0.9,
}).await?;

// Recall by semantic meaning
let results = client.recall(RecallRequest {
    agent_id: "assistant-1".into(),
    query: "user preferences".into(),
    top_k: 5,
}).await?;
// → Vec<Memory> { score: 0.97, ... }

import dakera "github.com/dakera-ai/dakera-go"

client := dakera.New("dk-your-key")

// Store agent memory
client.StoreMemory(ctx, dakera.StoreRequest{
    AgentID:    "assistant-1",
    Content:    "User prefers TypeScript and dark mode",
    Importance: 0.9,
})

// Recall by semantic meaning
results, _ := client.Recall(ctx, dakera.RecallRequest{
    AgentID: "assistant-1",
    Query:   "user preferences",
    TopK:    5,
})
// → []Memory{ {Score: 0.97, Content: "..."} }

88.2% LoCoMo benchmark · 14 core MCP tools (86+ via profiles) · 4 native SDKs

Releases · v0.11.102

Works with

Windsurf

The problem

Your agents forget
everything they learn

Every session starts from zero. Thousands of interactions, zero retained knowledge. You're paying to re-teach your agents the same things over and over.

Sessions are isolated silos

Each conversation starts blank. Your agent can't recall what it learned yesterday, last week, or across 10,000 prior interactions.

Knowledge evaporates at scale

Insights from thousands of users vanish after each session. Your agent never compounds intelligence — it stays perpetually naive.

Context stuffing is a dead end

Cramming history into prompts burns tokens, inflates costs, and hits a hard ceiling. It's duct tape, not architecture.

agent session

agent.recall("user preferences")

Error: No memory found.

Context window empty.

agent.sessions

1,847 sessions completed

0 memories persisted

agent.monthly_cost

$4,200/mo on context stuffing

0 knowledge retained

$50k wasted / year

Capabilities

Everything agents need
to remember

Six core capabilities that turn stateless AI into agents with genuine, compounding memory.

Vector + Hybrid Search

Find memories by meaning, not just keywords. HNSW, BM25, and hybrid search with temporal re-ranking and tunable weights.

API reference →

Persistent Agent Memory

Store, recall, consolidate, and forget. Four memory types — episodic, semantic, procedural, working — with automatic importance decay.

Core concepts →

Built-in Embeddings

Text is auto-embedded on store and query. No OpenAI calls, no external APIs. HuggingFace models ship inside the binary.

How embeddings work →

MCP Native (14 core tools)

Drop into Claude, Cursor, or Windsurf instantly. 14 core memory tools loaded by default — 86+ available via profiles for power users. Set DAKERA_MCP_PROFILE=power|admin|all to unlock more.

MCP setup guide →

Knowledge Graph

Automatically connects related memories into a queryable graph. Entity extraction, similarity edges, cluster summaries, and semantic deduplication.

How it works →

Dashboard + CLI

Visual admin dashboard for exploring memories, running queries, and monitoring agents. Plus a full dk CLI for automation.

CLI reference →

Framework & Tool Integrations

Plug into the frameworks
you already use

Dakera ships native integrations for LangChain, LlamaIndex, CrewAI, and AutoGen. Five lines of code add persistent memory to your existing pipeline.

LangChain

Python · langchain-dakera

Drop-in DakeraMemory and DakeraVectorStore classes. Your chain gets persistent cross-session memory with semantic recall in three lines.

pip install langchain-dakera

LlamaIndex

Python · llama-index-dakera

Dakera-backed VectorStore for LlamaIndex pipelines. Server-side embeddings mean zero OpenAI dependency for your RAG index.

pip install llama-index-dakera

CrewAI

Python · crewai-dakera

Give your CrewAI agents a shared long-term memory store. Agents recall each other's findings across tasks — your crew compounds knowledge instead of starting fresh every run.

pip install crewai-dakera

AutoGen

Python · autogen-dakera

Persistent memory across multi-agent AutoGen conversations. Each agent has its own memory namespace — shared recall, isolated writes. No conversation resets between runs.

pip install autogen-dakera

MCP Protocol

Claude · Cursor · Windsurf

14 core MCP tools available natively (86+ via profiles). Add one config block to your IDE and Claude, Cursor, or Windsurf gets persistent memory across every session, with zero code changes required.

config-only · no code

REST API & gRPC for any language.
Native SDKs for Python, TypeScript, Go, Rust.

All integrations →

# pip install langchain-dakera dakera
from langchain_dakera import DakeraMemory
from langchain.chains import ConversationChain
from langchain_openai import ChatOpenAI

# Persistent memory backed by Dakera: survives process restarts
memory = DakeraMemory(
    api_url="http://localhost:3300",
    agent_id="my-assistant",
    recall_k=5,
)

chain = ConversationChain(llm=ChatOpenAI(), memory=memory)
# → Memory persists across restarts. Agent remembers every prior conversation.

// .mcp.json: add to Claude Desktop, Cursor, or Windsurf
{
  "mcpServers": {
    "dakera": {
      "command": "dakera-mcp",
      "env": {
        "DAKERA_URL": "http://localhost:3300",
        "DAKERA_API_KEY": "dk-your-key",
        "DAKERA_MCP_PROFILE": "core"
      }
    }
  }
}
// → 14 core tools default. Set DAKERA_MCP_PROFILE=power|admin|all for more.

# pip install crewai-dakera
from crewai_dakera import DakeraStorage
from crewai import Crew, Agent, Task

# Shared memory across your entire crew: every agent reads what others stored
storage = DakeraStorage(api_url="http://localhost:3300")

crew = Crew(
    agents=[researcher, writer],
    tasks=[research_task, writing_task],
    memory=storage,
)
# → writer agent recalls researcher's findings, no repeated tool calls

Already running a pipeline? Add Dakera memory in under 5 minutes.

Deploy in 5 min →

SDKs

Integrate in minutes

Native SDKs for Python, TypeScript, Go, and Rust. Plus REST and gRPC for everything else. Five lines to first memory.

Store & Recall

Semantic memory with automatic embedding and importance scoring

Session Lifecycle

Context persists across every conversation automatically

Multi-Agent

Isolated namespaces for hundreds of agents at once

MCP Ready

14 core tools (86+ available via profiles) for Claude, Cursor, and Windsurf

from dakera import DakeraClient

client = DakeraClient(
    base_url="http://localhost:3300",
    api_key="your-key"
)

# Store agent memory
client.memories.store(
    agent_id="assistant-1",
    content="User prefers TypeScript",
    importance=0.9
)

# Recall by meaning
memories = client.memories.recall(
    agent_id="assistant-1",
    query="language preferences",
    top_k=5
)

import { DakeraClient } from "@dakera-ai/dakera"

const client = new DakeraClient({
  baseUrl: "http://localhost:3300",
  apiKey: "your-key"
})

await client.memories.store({
  agentId: "assistant-1",
  content: "User prefers TypeScript",
  importance: 0.9
})

const memories = await client.memories.recall({
  agentId: "assistant-1",
  query: "language preferences",
  topK: 5
})

import dakera "github.com/dakera-ai/dakera-go"

client := dakera.NewClient(dakera.Config{
    BaseURL: "http://localhost:3300",
    APIKey:  "your-key",
})

client.Memories.Store(ctx, dakera.StoreMemoryRequest{
    AgentID:    "assistant-1",
    Content:    "User prefers TypeScript",
    Importance: 0.9,
})

memories, _ := client.Memories.Recall(ctx, dakera.RecallRequest{
    AgentID: "assistant-1", Query: "language preferences", TopK: 5,
})

use dakera_client::{DakeraClient, Config, StoreMemoryRequest};

let client = DakeraClient::new(Config {
    base_url: "http://localhost:3300".into(),
    api_key:  "your-key".into(),
    ..Default::default()
});

// Store agent memory
client.memories().store(StoreMemoryRequest {
    agent_id:   "assistant-1".into(),
    content:    "User prefers TypeScript".into(),
    importance: 0.9,
    ..Default::default()
}).await?;

// Recall by meaning
let memories = client
    .memories().recall("assistant-1", "language preferences", 5)
    .await?;

# Store memory
curl -X POST localhost:3300/v1/memory/store \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-key" \
  -d '{"agent_id":"assistant-1","content":"User prefers TS","importance":0.9}'

# Recall
curl -X POST localhost:3300/v1/memory/recall \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-key" \
  -d '{"agent_id":"assistant-1","query":"language preferences","top_k":5}'

# Text search with auto-embedding
curl -X POST localhost:3300/v1/namespaces/docs/query-text \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-key" \
  -d '{"text":"semantic search systems","top_k":5}'

Architecture

Five Rust crates. One binary.

6 index algorithms, 3 storage tiers, built-in ML inference, and a production-grade API layer — compiled into a single deployable artifact. Designed for low-latency, high-throughput retrieval.

5 Rust crates · 6 index algorithms · 0 GC pauses (Rust) · 1 single binary · 3 storage tiers

dakera-api

6 components

Production-grade REST & gRPC API layer with authentication, observability, and rate control

Axum · Tonic · Auth · Prometheus · OpenTelemetry

REST API

Full CRUD with batch upsert, multi-namespace support, and streaming responses. Axum-based with tower middleware.

JSON + CBOR

gRPC

High-performance binary protocol via Tonic. Bi-directional streaming for real-time indexing and search operations.

Protobuf v3

Auth & API Keys

Multi-tenant token authentication with per-key permissions, namespace isolation, and configurable RBAC policies.

Rate Limiting

Per-key sliding window rate limiting with burst allowance. Configurable per endpoint, per namespace, or globally.
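As a rough illustration of per-key sliding-window limiting with a burst allowance, the sketch below keeps a queue of accepted request timestamps for each key. The class name `SlidingWindowLimiter`, its parameters, and the way burst is modeled (extra requests allowed on top of the base limit) are assumptions made for this example, not Dakera's actual implementation.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Hypothetical per-key sliding-window limiter (illustration only)."""

    def __init__(self, limit: int, window_s: float, burst: int = 0):
        self.capacity = limit + burst   # base limit plus burst allowance
        self.window_s = window_s
        self.hits = defaultdict(deque)  # api_key -> timestamps of accepted requests

    def allow(self, api_key: str, now: float) -> bool:
        q = self.hits[api_key]
        # Drop timestamps that have slid out of the window.
        while q and now - q[0] >= self.window_s:
            q.popleft()
        if len(q) < self.capacity:
            q.append(now)
            return True
        return False


limiter = SlidingWindowLimiter(limit=2, window_s=1.0, burst=1)
decisions = [limiter.allow("dk-your-key", t) for t in (0.0, 0.1, 0.2, 0.3, 1.05)]
print(decisions)  # [True, True, True, False, True]
```

The fourth request is rejected because three requests already sit inside the one-second window; the fifth passes once the oldest timestamp has expired.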

Audit Logging

Structured JSON operation logs with request tracing, latency breakdown, and compliance-ready event stream.

Prometheus + OTel

Built-in /metrics endpoint with histogram latencies, request counts, and distributed tracing via OpenTelemetry SDK.

Pull + Push

dakera-engine

12 components

Six index algorithms, hybrid search, auto-index selection, and distributed clustering with Raft consensus

HNSW · IVF · SPFresh · BM25 · Hybrid · Raft

HNSW

Hierarchical navigable small-world graph for sub-millisecond approximate nearest neighbor queries at scale.

ANN search

IVF

Inverted file index with configurable nprobe for high-throughput batch indexing with tunable recall trade-offs.

Batch optimized
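To make the nprobe trade-off concrete, here is a minimal inverted-file sketch in plain Python. The centroids are fixed by hand (a real index would learn them, typically with k-means), and the helper names `build_ivf` and `ivf_search` are hypothetical rather than part of any Dakera API.

```python
import math


def build_ivf(vectors, centroids):
    """Assign every vector id to the inverted list of its nearest centroid."""
    lists = [[] for _ in centroids]
    for vec_id, vec in vectors.items():
        cell = min(range(len(centroids)), key=lambda i: math.dist(vec, centroids[i]))
        lists[cell].append(vec_id)
    return lists


def ivf_search(query, vectors, centroids, lists, nprobe, top_k):
    """Scan only the `nprobe` cells whose centroids are closest to the query."""
    cells = sorted(range(len(centroids)), key=lambda i: math.dist(query, centroids[i]))
    candidates = [vec_id for cell in cells[:nprobe] for vec_id in lists[cell]]
    return sorted(candidates, key=lambda v: math.dist(query, vectors[v]))[:top_k]


centroids = [(0.0, 0.0), (10.0, 0.0)]  # fixed here; normally learned from the data
vectors = {"a": (4.9, 0.0), "b": (5.2, 0.0), "c": (0.0, 0.0), "d": (10.0, 0.0)}
lists = build_ivf(vectors, centroids)

query = (5.1, 0.0)
fast = ivf_search(query, vectors, centroids, lists, nprobe=1, top_k=2)
exact = ivf_search(query, vectors, centroids, lists, nprobe=2, top_k=2)
print(fast, exact)  # ['b', 'd'] ['b', 'a']
```

With `nprobe=1` the search misses "a", which sits just across the cell boundary; probing both cells recovers it at the cost of scanning more candidates.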

SPFresh

Real-time streaming index optimized for continuous ingestion. LSMT-inspired design with background compaction.

Streaming ingest

PQ + SQ

Product quantization (4-16 sub-vectors) and scalar quantization for 8-32x memory compression with minimal recall loss.

8-32x compression

BM25

Full-text keyword search with configurable k1/b parameters, stemming, stop words, and multi-language tokenization.
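For readers unfamiliar with the k1 and b parameters, the sketch below implements the standard Okapi BM25 formula over whitespace-tokenized text. It skips stemming, stop words, and multi-language tokenization, and the function name `bm25_scores` and the default values are assumptions for illustration only.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized doc against the query with Okapi BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: in how many docs each term appears.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # k1 caps term-frequency saturation; b controls length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "user prefers typescript and dark mode".split(),
    "user asked about billing last week".split(),
    "team adopted rust for the backend".split(),
]
scores = bm25_scores("typescript preferences".split(), docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(best)  # 0
```

Because there is no stemming here, "preferences" does not match "prefers"; only the exact token "typescript" contributes to the score.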

Hybrid Search

Reciprocal Rank Fusion (RRF) combining vector similarity and keyword relevance into a single ranked result set.

RRF fusion
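Reciprocal Rank Fusion is simple enough to show in a few lines. The sketch below fuses one vector ranking and one keyword ranking; the constant `k=60` is a conventional default assumed here, and the memory ids are made up for the example. How Dakera weights or tunes the fusion internally is not specified in this page.

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


vector_hits = ["m3", "m1", "m7"]   # ordered by embedding similarity
keyword_hits = ["m1", "m9", "m3"]  # ordered by BM25 score
fused = rrf_fuse([vector_hits, keyword_hits])
print(fused)  # ['m1', 'm3', 'm9', 'm7']
```

RRF only looks at ranks, never raw scores, so cosine similarities and BM25 scores can be combined without normalizing them to a common scale.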

Auto-Index

Analyzes dataset characteristics (cardinality, dimensionality, distribution) and selects the optimal index strategy.

Agent Memory

Importance-weighted memory with consolidation, decay scoring, and semantic deduplication for AI agent workflows.
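The page does not state which decay curve the engine uses, so the sketch below assumes a simple exponential half-life purely to show how decay scoring can reorder memories. The function name `decayed_importance` and the 30-day half-life are hypothetical.

```python
def decayed_importance(importance: float, age_days: float, half_life_days: float = 30.0) -> float:
    """Halve the stored importance every `half_life_days` (assumed decay curve)."""
    return importance * 0.5 ** (age_days / half_life_days)


memories = [
    {"content": "User prefers TypeScript", "importance": 0.9, "age_days": 60},
    {"content": "User asked about billing", "importance": 0.5, "age_days": 0},
]
ranked = sorted(
    memories,
    key=lambda m: decayed_importance(m["importance"], m["age_days"]),
    reverse=True,
)
print([m["content"] for m in ranked])
# ['User asked about billing', 'User prefers TypeScript']
```

After two half-lives the older memory's 0.9 has decayed to 0.225, so the fresh memory with a lower stored importance now ranks first.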

Knowledge Graph

Entity-relationship graph with typed edges, traversal queries, and automatic relationship extraction from text.

Gossip Protocol

SWIM-based protocol for cluster membership, failure detection, and metadata propagation across nodes.

Protocol: SWIM

Leader Election

Raft-based consensus for partition leader assignment, log replication, and automatic failover with quorum writes.

Raft consensus

Sharding

Consistent hashing with virtual nodes for automatic data distribution and rebalancing across cluster members.
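A small sketch can show why consistent hashing with virtual nodes keeps rebalancing cheap: when a node joins, only the keys it takes over change owner. The `HashRing` class, the SHA-256 based hash, and the 64 virtual nodes per member are assumptions for illustration, not details of Dakera's sharding layer.

```python
import bisect
import hashlib


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=64):
        self.vnodes = vnodes
        self.ring = []  # sorted list of (hash, node) pairs
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(key: str) -> int:
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def add(self, node: str) -> None:
        # Each physical node appears at many points on the ring.
        for i in range(self.vnodes):
            bisect.insort(self.ring, (self._hash(f"{node}#{i}"), node))

    def owner(self, key: str) -> str:
        # First virtual node clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_left(self.ring, (self._hash(key), ""))
        return self.ring[idx % len(self.ring)][1]


keys = [f"memory-{i}" for i in range(1000)]
ring = HashRing(["node-a", "node-b", "node-c"])
before = {k: ring.owner(k) for k in keys}
ring.add("node-d")
after = {k: ring.owner(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(len(moved), all(after[k] == "node-d" for k in moved))
```

Every key that changes owner lands on the new node; keys never shuffle between the existing members, which is what makes adding capacity incremental.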

dakera-inference

6 components

Rust-native ML embedding pipeline with ONNX Runtime — no Python, no external dependencies

ONNX · MiniLM · BGE · E5 · Metal · CUDA

ONNX Runtime

Cross-platform ML inference runtime. ONNX model loading, hardware acceleration (CPU/CUDA/Metal), no Python runtime needed.

No Python runtime

MiniLM-L6

384-dim embeddings optimized for speed. Ideal for real-time agent memory with low-latency requirements.

384 dims · 22M params

BGE-Small

BAAI General Embedding (BGE) for high-accuracy semantic search. At 33M params it delivers competitive retrieval accuracy on the BEIR and MTEB benchmarks.

384 dims · 33M params

E5-Small

Microsoft's E5 model, which uses separate "query:" and "passage:" input prefixes. Well suited to asymmetric query-document search patterns.

384 dims · 33M params

Batch Processing

Dynamic batching with configurable batch size and timeout. Amortizes model overhead for bulk ingestion workloads.

Up to 64 per batch

CPU / CUDA / Metal

Automatic hardware detection with Metal on macOS, CUDA on Linux/Windows, and optimized AVX2/NEON CPU fallback.

Auto-detect

dakera-storage

9 components

Three-tier persistence engine — hot memory, warm filesystem, cold S3 — with WAL durability and background compaction

Memory · Filesystem · S3 · WAL · Snapshots · Compaction

Memory Tier

Lock-free concurrent hashmap with arena allocation. Sub-microsecond reads for hot data and active agent sessions.

Sub-µs reads

Filesystem Tier

Memory-mapped file storage with LSM-tree compaction. Handles datasets larger than RAM with predictable tail latency.

mmap + LSM

S3 / MinIO

Cloud object storage backend for cold data archival. Automatic tiering moves data down based on access frequency.

Auto-tier

Write-Ahead Log

Append-only WAL with fsync durability guarantees. Crash recovery replays log to reconstruct consistent state.

fsync durable
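The write-ahead pattern described above can be sketched in a few lines: append each operation, fsync it, and rebuild state on startup by replaying the log. The JSON-lines record format, the `WriteAheadLog` class, and the `replay` helper are assumptions for illustration; Dakera's on-disk format is not documented here.

```python
import json
import os
import tempfile


class WriteAheadLog:
    """Hypothetical append-only log: one JSON record per line, fsync per append."""

    def __init__(self, path: str):
        self.path = path
        self.file = open(path, "a", encoding="utf-8")

    def append(self, op: str, key: str, value=None) -> None:
        self.file.write(json.dumps({"op": op, "key": key, "value": value}) + "\n")
        self.file.flush()
        os.fsync(self.file.fileno())  # force the record to stable storage

    def close(self) -> None:
        self.file.close()


def replay(path: str) -> dict:
    """Rebuild in-memory state by applying every logged operation in order."""
    state = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            rec = json.loads(line)
            if rec["op"] == "put":
                state[rec["key"]] = rec["value"]
            elif rec["op"] == "delete":
                state.pop(rec["key"], None)
    return state


wal_path = os.path.join(tempfile.mkdtemp(), "example.wal")
wal = WriteAheadLog(wal_path)
wal.append("put", "m1", "User prefers TypeScript")
wal.append("put", "m2", "temporary note")
wal.append("delete", "m2")
wal.close()

recovered = replay(wal_path)
print(recovered)  # {'m1': 'User prefers TypeScript'}
```

Calling fsync on every append trades write throughput for durability; batching several records per fsync is a common middle ground.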

Snapshots

Point-in-time consistent snapshots with copy-on-write semantics. Export to local disk or stream directly to S3.

Compaction

Background merge of sorted runs with configurable size ratios. Reclaims space from tombstones and overwrites.

Delta Encoding

Stores only vector deltas for versioned data. Reduces storage by 40-70% for frequently updated embeddings.

40-70% savings

TTL

Per-record and per-namespace time-to-live with lazy expiration. Background sweeper reclaims expired entries.
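Lazy expiration plus a background sweeper can be sketched as follows. Time is passed in explicitly to keep the example deterministic, and the `TTLStore` class and its method names are hypothetical rather than Dakera's API.

```python
class TTLStore:
    """Hypothetical key-value store with lazy expiration plus a sweeper."""

    def __init__(self):
        self.data = {}  # key -> (value, expires_at or None)

    def put(self, key, value, now, ttl_s=None):
        self.data[key] = (value, None if ttl_s is None else now + ttl_s)

    def get(self, key, now):
        entry = self.data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and now >= expires_at:
            del self.data[key]  # lazy expiration: evict on read
            return None
        return value

    def sweep(self, now):
        """Reclaim expired entries that were never read again."""
        expired = [k for k, (_, exp) in self.data.items() if exp is not None and now >= exp]
        for k in expired:
            del self.data[k]
        return len(expired)


store = TTLStore()
store.put("session:1", "working memory", now=0, ttl_s=10)
store.put("session:2", "scratch note", now=0, ttl_s=5)
store.put("fact:1", "User prefers TypeScript", now=0)  # no TTL, never expires

fresh = store.get("session:1", now=9)
stale = store.get("session:1", now=10)
swept = store.sweep(now=10)
print(fresh, stale, swept, list(store.data))
# working memory None 1 ['fact:1']
```

Reads never return expired data, while the sweeper bounds how long unread expired entries can keep occupying space.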

Encryption at Rest

AES-256-GCM encryption for filesystem and S3 tiers. Key rotation support with zero-downtime re-encryption.

AES-256-GCM

dakera-common

6 components

Shared type system, error taxonomy, configuration, and cross-crate utilities used by all other crates

Types · Errors · Config · Serde · Validation

Shared Types

Strongly-typed domain models for vectors, memories, namespaces, and search results. Zero-cost serde serialization.

Error Taxonomy

Hierarchical error types with context propagation, HTTP status mapping, and structured error responses for clients.

Configuration

Layered config from defaults → TOML → env vars → CLI flags. Hot reload for runtime-tunable parameters.

Hot reload
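The precedence order (defaults, then TOML, then env vars, then CLI flags) can be illustrated with a small merge function. The key names, the `DAKERA_` prefix stripping, and the `resolve_config` helper are assumptions for this sketch; only `DAKERA_MCP_PROFILE` and port 3300 appear elsewhere on this page, and hot reload is not covered.

```python
from collections import ChainMap


def resolve_config(defaults, toml_values, env, cli_flags, prefix="DAKERA_"):
    """Merge layers so that CLI flags > env vars > TOML file > defaults."""
    env_values = {
        name[len(prefix):].lower(): value
        for name, value in env.items()
        if name.startswith(prefix)
    }
    # ChainMap looks keys up left to right, so the highest-priority layer goes first.
    return dict(ChainMap(cli_flags, env_values, toml_values, defaults))


config = resolve_config(
    defaults={"port": "3300", "log_level": "info", "mcp_profile": "core"},
    toml_values={"log_level": "debug"},           # as if parsed from a TOML file
    env={"DAKERA_MCP_PROFILE": "power", "HOME": "/root"},
    cli_flags={"port": "8080"},
)
print(config["port"], config["log_level"], config["mcp_profile"])
# 8080 debug power
```

Each layer overrides only the keys it sets, so a value can be traced back to the highest-priority source that defined it.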

Validation

Input validation with dimension checks, UTF-8 enforcement, payload size limits, and custom constraint rules.

Serialization

Zero-copy deserialization with serde. Supports JSON, CBOR, MessagePack, and custom binary format for vectors.

Zero-copy

Telemetry

Shared tracing subscriber with span propagation, structured logging (JSON + pretty), and metric type definitions.

Ecosystem

MCP Server (14 core tools) · CLI (dk) · Dashboard (Leptos) · Python SDK · TypeScript SDK · Go SDK · Rust SDK

How it works

Three steps to persistent intelligence

From raw conversation to compounding knowledge — your agent's memory grows with every interaction.

Store

Your agent stores conversations, decisions, and preferences as embedded memories — each with an importance score and type label. Embeddings happen automatically inside the binary.

memory.store("User prefers TypeScript", importance=0.9)

Auto-embedding · Importance scoring · 4 memory types

Recall

Before each response, the agent retrieves the most relevant memories — combining vector similarity, keyword matching, and graph traversal into a single ranked result.

memory.recall("language preferences", top_k=5)

Hybrid search · Low-latency · Graph traversal

Learn

Over time, overlapping memories merge automatically. Importance decays, facts deduplicate, and related concepts connect. Your agent builds compounding intelligence — not a growing pile of text.

memory.consolidate("agent-1", strategy="merge")

Auto-consolidation · Importance decay · Deduplication
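One plausible way to deduplicate overlapping memories is to compare embeddings and keep a single representative per near-duplicate group. The sketch below does this with cosine similarity over hand-made three-dimensional vectors standing in for real embeddings; the 0.95 threshold, the keep-highest-importance rule, and the `deduplicate` helper are assumptions, not Dakera's documented consolidation strategy.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def deduplicate(memories, threshold=0.95):
    """Keep one memory per near-duplicate group, preferring higher importance."""
    kept = []
    for mem in sorted(memories, key=lambda m: m["importance"], reverse=True):
        # Keep the memory only if it is not too similar to anything already kept.
        if all(cosine(mem["vec"], k["vec"]) < threshold for k in kept):
            kept.append(mem)
    return kept


memories = [
    {"content": "User prefers TypeScript", "importance": 0.9, "vec": [0.9, 0.1, 0.0]},
    {"content": "User likes TypeScript", "importance": 0.6, "vec": [0.88, 0.12, 0.01]},
    {"content": "User asked about billing", "importance": 0.5, "vec": [0.0, 0.2, 0.95]},
]
unique = deduplicate(memories)
print([m["content"] for m in unique])
# ['User prefers TypeScript', 'User asked about billing']
```

The two TypeScript memories point in almost the same direction, so only the more important one survives, while the unrelated billing memory is untouched.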

Use Cases

What developers build
with persistent memory

From solo agent projects to production multi-agent pipelines — here's exactly what becomes possible when your agents remember.

Agents with persistent memory

A customer support agent learns user preferences on session 1 and applies them on session 100 — without any re-prompting. Dakera stores, recalls, and consolidates knowledge across every conversation automatically.

Session memory · Auto-consolidation

Multi-agent knowledge pipelines

A researcher agent stores findings to Dakera; a writer agent recalls them by meaning in the next task — no context passing, no redundant tool calls. Your crew shares a living knowledge base, not just message history.

Shared namespace · CrewAI · AutoGen

RAG with decay-weighted recall

A research assistant surfaces fresh sources and deprioritizes stale ones — automatically. Dakera's decay engine reduces the importance of old memories over time, so your retrieval stays relevant without manual curation.

Importance decay · Hybrid search

Chatbots that remember preferences

A product chatbot recalls that this user prefers detailed explanations, dislikes upsells, and last asked about billing — across sessions, weeks apart. Personalization that compounds without any prompt engineering.

User profiles · Cross-session recall

Copilots that learn your workflows

A developer copilot that knows your codebase naming conventions, your team's architecture decisions, and which library patterns you actually use — accumulated silently from every session via MCP. No onboarding docs, no prompt files.

MCP tools · IDE native

Team memory for LLM dev tools

Your internal LLM tooling accumulates institutional knowledge — decisions made, patterns adopted, incidents resolved. New engineers query the same memory store that experienced ones have been building. Onboarding as a side-effect of usage.

Knowledge graph · Multi-tenant

Who builds with Dakera

Built for the engineers who
ship production agents

Not a tool for demos. Dakera is built for developers who are deploying intelligent agents into production and need real infrastructure underneath.

AI / ML Engineers

Building production agent pipelines

You ship LangChain or AutoGen agents that need to remember state across thousands of sessions — without duct-taping Redis, Pinecone, and a custom decay script together.

One binary replaces your entire memory stack — vector store, embeddings, session store, knowledge graph

88.2% LoCoMo recall accuracy with hybrid retrieval you can tune per query type

Native integrations: langchain-dakera, crewai-dakera, autogen-dakera — drop-in memory classes

Backend Engineers

Adding memory to LLM features

You're adding an AI feature to an existing product and need a reliable memory layer — not a research project. You care about latency, auth, multi-tenancy, and zero new infra to maintain.

REST API + gRPC: integrate from any language in under an hour, no Python runtime required

Namespace isolation per user, key-based auth, and rate limiting built-in — production-ready on day one

Designed for low-latency retrieval — Rust with zero GC pauses won't bottleneck your LLM call chain

Framework Builders

Integrating LangChain, CrewAI, or LlamaIndex

You build tools on top of agent frameworks and need a memory backend that works across all of them — consistent API, framework-agnostic, and fast enough for tool-calling loops.

Identical REST/gRPC API across Python, TypeScript, Go, and Rust SDKs

MCP protocol for LLM tool integration — same backend serves IDE, API, and framework use cases

Open core: integrate the public API surface without worrying about vendor lock-in on internals

Platform Teams

Deploying agent infrastructure at scale

You run the platform that dozens of internal teams build agents on. You need multi-tenancy, observability, horizontal scaling, and security posture — not a managed service with opaque pricing.

One instance serves hundreds of agents — namespaced, rate-limited, and AES-256-GCM encrypted

Prometheus metrics + OpenTelemetry tracing out of the box — plug into your existing stack

Raft consensus clustering: add nodes, data rebalances automatically — no manual sharding

Your team. Your infrastructure. No managed service required.

Read the docs → Deploy in 5 min ↗

Architecture

One binary.
Everything included.

Most memory setups require assembling multiple services. Dakera ships embeddings, vector indexing, knowledge graph, and session storage in a single Rust binary — zero external dependencies required.

Typical setup

Vector store

Embedding service

Knowledge graph

Session store

~1–2 GB · 3–5 services

Dakera

dakera

~44 MB · 1 binary

LoCoMo Benchmark

88.2%

Long-context memory accuracy — standard industry evaluation across 1,540 questions

Dakera scores 88.2% on the full LoCoMo dataset (50 sessions, 1,540 questions) without LLM post-processing — the standard benchmark for long-context agent recall across temporal, multi-hop, entity, and implicit reasoning.

Full benchmark results → Methodology

Capability	Built in
Runtime	Rust, single binary
Embedding models	ONNX Runtime — on-device, no API calls
Index algorithms	HNSW, IVF, SPFresh, BM25, Hybrid
MCP server	14 core tools (86+ via profiles), native
Knowledge graph	Built-in, auto-extraction
Tiered storage	Memory → Filesystem → S3/MinIO
External dependencies	Zero

Open core

Open at the edges.
Closed at the core.

We open everything you need to integrate. We keep what makes us fast.

Open — MIT Licensed

Python SDK dakera-py

pip install dakera

TypeScript SDK dakera-js

npm install @dakera-ai/dakera

Go SDK dakera-go

go get github.com/dakera-ai/dakera-go

Rust SDK dakera-rs

Add dakera to Cargo.toml

CLI dakera-cli

Shell-scriptable admin and query interface

MCP Server dakera-mcp

14 core tools (86+ available via profiles) for Claude, Cursor, Windsurf

All repos on GitHub →

Closed — Proprietary

Memory Engine dakera

The Rust server: HNSW+BM25 hybrid retrieval, importance decay, knowledge graphs, AES-256 encryption, Raft clustering. Provided as a binary and Docker image. Source is not public.

Dashboard dakera-dashboard

Web UI for monitoring agents, sessions, memory health, and real-time analytics. Proprietary.

You can self-host the engine. The binary is yours to run on your own infrastructure — no phone-home, no external dependencies. What's closed is the source code, not your right to deploy it.

Full breakdown →

Limited early access · Founding pricing

The self-hosted binary is free today.
Cloud is coming — reserve your spot.

Dakera runs on your infra now: free to self-host, no waitlist, zero cloud dependency. Dakera Cloud brings managed hosting, uptime SLA, team dashboards, and priority support. Founding members lock in pricing that never changes.

No credit card · Self-host free today · Unsubscribe anytime

FAQ

Common
questions

Everything you need to know about Dakera. Can't find what you're looking for? Open an issue on GitHub.


Is Dakera a vector database?

No. Dakera is an AI agent memory platform. It gives your agents persistent, session-aware, cross-agent memory with intelligent importance decay. The underlying retrieval engine (HNSW, IVF, BM25, hybrid search) is just how memories are recalled fast — the product is agents that remember, not a database you query.

Do I need an OpenAI API key for embeddings?

No. Text is embedded automatically on store and query using built-in models (MiniLM, BGE, E5) powered by ONNX Runtime. No external calls, no additional cost.

Is Dakera production-ready?

Yes. WAL durability, snapshots, AES-256-GCM encryption, multi-tenant auth, rate limiting, Prometheus, and OpenTelemetry are all included. Designed for production from day one.

Can I use it with Claude, Cursor, or Windsurf?

Yes. Dakera ships as a native MCP server with 14 core tools (86+ available via profiles). Add it to your Claude Desktop config, Cursor settings, or Windsurf configuration. Your AI assistant gets persistent memory across all sessions — zero code changes required.

What does Dakera replace in my stack?

Dakera replaces the entire memory infrastructure layer: vector store, embedding service, knowledge graph, and session store — all compiled into one binary. No Docker Compose, no external API keys, no separate services to operate.

How does Dakera handle scaling?

A single Dakera instance handles millions of vectors comfortably. For horizontal scaling, Dakera supports distributed clustering with Raft consensus, consistent-hash sharding, and automatic rebalancing. Add nodes — the data redistributes automatically.

What languages and SDKs are supported?

Native SDKs for Python, TypeScript, Go, and Rust. Plus a REST API (JSON) and gRPC (Protobuf) for any other language. MCP protocol for AI tool integration. Five lines of code to store your first memory.

Is Dakera open source or open core?

Open core, not fully open source. The integration layer — SDKs (Python, TypeScript, Go, Rust), CLI, and MCP server — is MIT-licensed and on GitHub. The memory engine is proprietary: you self-host the binary on your own infrastructure with full data ownership, but the engine source is not public. Commercial-friendly: no usage fees on self-hosted deployments. Full breakdown →

Is Dakera self-hostable?

Yes. Dakera ships as a single binary or Docker image with zero external runtime dependencies. Your data never leaves your infrastructure. Pull the image, set DAKERA_API_KEY, and you're live in under a minute on any Linux server or Kubernetes cluster.

When will Dakera be generally available?

Dakera is live in public alpha. You can deploy the self-hosted binary today — pull the Docker image, set an API key, and you're running in under a minute. Dakera Cloud (managed hosting, SLA, team monitoring) is coming next — join the waitlist below to lock in founder pricing.

What's the pricing model?

The self-hosted binary is free to run on your own infrastructure — no usage fees, no call-home. Dakera Cloud (managed hosting, SLA, team monitoring) is priced separately — join the waitlist to lock in founder pricing before public launch.

Is Dakera an MCP memory server?

Yes. Dakera is a self-hosted MCP memory server with 14 core tools loaded by default (86+ available via profiles): store, recall, hybrid search, knowledge graph, sessions, decay control, namespaces, and consolidation. Connect it to Claude Desktop, Cursor, or Windsurf with one config block. Set DAKERA_MCP_PROFILE=power, admin, or all to unlock additional tools. No code changes required. All memory stays on your infrastructure.
