High Availability
Dakera supports production-grade multi-node clustering with gossip-based membership (SWIM protocol), lease-based leader election, consistent-hash sharding, configurable replication, and automatic rebalancing. Nodes discover each other via seed addresses and coordinate through fencing-token-protected leases. Cold-tier data is stored on shared S3-compatible object storage.
Cluster environment variables
| Variable | Default | Description |
|---|---|---|
| DAKERA_CLUSTER_MODE | false | Enable multi-node cluster |
| DAKERA_NODE_ID | — | Unique node identifier (e.g. node-1) |
| DAKERA_CLUSTER_SEEDS | — | Comma-separated gossip bootstrap addresses: dakera-2:7946,dakera-3:7946 |
| DAKERA_GOSSIP_PORT | 7946 | SWIM gossip port — must be open between nodes |
| DAKERA_GOSSIP_BIND | 0.0.0.0:7946 | Gossip bind address (useful for multi-NIC hosts) |
| DAKERA_API_ADVERTISE | auto | Advertised API address for client routing |
| DAKERA_REDIS_URL | — | Redis URL for distributed L1.5 cache (redis://redis:6379) |
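For reference, a single node started with these variables might look like the sketch below; the hostnames, ports, and Redis URL are placeholders for your environment.

```bash
# Illustrative node start-up using the variables above (values are placeholders)
docker run -d \
  --name dakera-1 \
  -e DAKERA_CLUSTER_MODE=true \
  -e DAKERA_NODE_ID=node-1 \
  -e DAKERA_CLUSTER_SEEDS=dakera-2:7946,dakera-3:7946 \
  -e DAKERA_REDIS_URL=redis://redis:6379 \
  -e DAKERA_ROOT_API_KEY=$DAKERA_ROOT_API_KEY \
  -p 3300:3300 -p 7946:7946 \
  ghcr.io/dakera-ai/dakera:latest
```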
Failure modes & recovery
Understanding how the cluster handles failures is critical for operations. Here's what happens in each scenario:
| Failure | Impact | Recovery | Downtime |
|---|---|---|---|
| Leader failure | Shard reassignments paused, writes still accepted by owning nodes | Automatic re-election via lease expiry | ~5 seconds |
| Follower failure | Shards on that node unavailable until recovery | Other nodes continue serving. Data re-replicated from replicas | Zero for unaffected shards |
| Network partition | Split-brain risk | Fencing tokens prevent stale leaders from writing. Quorum required for election | Partition duration |
| Full cluster restart | All nodes down | Bootstrap from S3/MinIO cold storage. WAL replay restores last state | Startup time (~10-30s) |
Split-brain prevention
Dakera prevents split-brain using fencing tokens — monotonically increasing integers assigned to each leader lease. When a leader's lease expires and a new leader is elected, the new leader receives a higher fencing token. Any operations from a stale leader with a lower token are rejected by other nodes.
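As a purely conceptual illustration (this is not a Dakera interface), the check each node applies boils down to comparing tokens:

```bash
# Conceptual sketch only: nodes track the highest fencing token they have seen
# and reject writes carrying an older token, e.g. from a leader whose lease expired.
highest_seen_token=7   # token attached to the current leader's lease
incoming_token=6       # token on a write from a stale leader

if [ "$incoming_token" -lt "$highest_seen_token" ]; then
  echo "reject write: stale leader (token $incoming_token < $highest_seen_token)"
else
  highest_seen_token=$incoming_token
  echo "accept write: token $incoming_token is current"
fi
```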
Replication
Data is replicated across nodes based on the replication factor. Each shard's data exists on multiple nodes for durability. The consistency model is eventual consistency with gossip-driven convergence — writes are acknowledged after the primary shard confirms, and replicas converge within milliseconds under normal network conditions.
| Replication factor | Nodes required | Fault tolerance |
|---|---|---|
| 1 (no replication) | 1+ | None — node loss means data loss for that shard |
| 2 | 3+ | 1 node failure per shard |
| 3 (recommended) | 5+ | 2 node failures per shard |
Scaling procedures
Adding a node
```bash
# 1. Start a new node with seed addresses pointing to existing cluster
docker run -d \
--name dakera-4 \
-e DAKERA_CLUSTER_MODE=true \
-e DAKERA_NODE_ID=node-4 \
-e DAKERA_CLUSTER_SEEDS=dakera-1:7946,dakera-2:7946 \
-e DAKERA_ROOT_API_KEY=$DAKERA_ROOT_API_KEY \
-p 3303:3300 -p 7949:7946 \
ghcr.io/dakera-ai/dakera:latest
# 2. Verify node joined the cluster
curl -H "Authorization: Bearer $DAKERA_ROOT_API_KEY" \
http://localhost:3300/admin/cluster/nodes
# Node 4 appears with state: "Alive"
# 3. Automatic shard rebalancing begins — monitor progress
curl -H "Authorization: Bearer $DAKERA_ROOT_API_KEY" \
http://localhost:3300/admin/cluster/status
```
Removing a node
```bash
# 1. Enable maintenance mode (drains traffic and migrates shards)
curl -X POST http://dakera-4:3300/admin/maintenance/enable \
-H "Authorization: Bearer $DAKERA_ROOT_API_KEY"
# 2. Wait for drain to complete
curl http://dakera-4:3300/admin/maintenance/status \
-H "Authorization: Bearer $DAKERA_ROOT_API_KEY"
# {"status":"drained","shards_remaining":0}
# 3. Stop the node
docker stop dakera-4 && docker rm dakera-4
# 4. Remove from seed list in remaining nodes (optional — gossip handles it)
```
Rolling upgrade procedure
Upgrade cluster nodes one at a time with zero downtime:
| Step | Command | What happens |
|---|---|---|
| 1. Drain | POST /admin/maintenance/enable | Traffic redirected, shards migrated to other nodes |
| 2. Verify | GET /admin/maintenance/status | Confirm shards_remaining: 0 |
| 3. Upgrade | docker compose pull && docker compose up -d | Pull the new image and recreate the container (restart alone would keep the old image) |
| 4. Health | GET /health | Confirm new version healthy |
| 5. Rejoin | POST /admin/maintenance/disable | Node rejoins cluster, shards rebalanced back |
| 6. Repeat | — | Move to next node |
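The whole procedure can be scripted against the endpoints above. The loop below is a sketch that assumes the node hostnames and Compose service names from the dakera-deploy example in the next section (dakera-1, dakera-2, dakera-3) and the default API port; adapt it to your deployment.

```bash
# Rolling-upgrade sketch; hostnames, service names, and port are assumptions.
for node in dakera-1 dakera-2 dakera-3; do
  # 1-2. Drain the node and wait until no shards remain on it
  curl -X POST -H "Authorization: Bearer $DAKERA_ROOT_API_KEY" \
    http://$node:3300/admin/maintenance/enable
  until curl -s -H "Authorization: Bearer $DAKERA_ROOT_API_KEY" \
      http://$node:3300/admin/maintenance/status | grep -q '"shards_remaining":0'; do
    sleep 5
  done

  # 3. Pull the new image and recreate only this container
  docker compose pull "$node" && docker compose up -d "$node"

  # 4-5. Wait for the upgraded node to report healthy, then rejoin the cluster
  until curl -sf http://$node:3300/health > /dev/null; do sleep 2; done
  curl -X POST -H "Authorization: Bearer $DAKERA_ROOT_API_KEY" \
    http://$node:3300/admin/maintenance/disable
done
```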
Docker Compose cluster
The dakera-deploy repo includes a multi-node Docker Compose configuration that runs a 3-node cluster behind a Traefik load balancer, with a Redis cache and shared MinIO storage:
```bash
# Clone and start a 3-node cluster
git clone https://github.com/dakera-ai/dakera-deploy
cd dakera-deploy/ha
# Generate secrets and start
cp .env.example .env # edit with your keys
docker compose up -d
# Verify cluster health
curl -H "Authorization: Bearer $DAKERA_ROOT_API_KEY" \
http://localhost:3100/admin/cluster/status
```
Kubernetes HA
Use the Helm chart with dakera.replicaCount set to 3 or more. The chart automatically configures gossip seeds, Redis, and shared storage:
```bash
helm install dakera oci://ghcr.io/dakera-ai/dakera-helm/dakera \
--namespace dakera --create-namespace \
--set dakera.replicaCount=3 \
--set dakera.rootApiKey="$(openssl rand -hex 32)" \
--set minio.rootPassword="$(openssl rand -hex 16)"
```
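To verify the rollout, standard kubectl checks plus the cluster status endpoint are enough; the pod name and port below are assumptions about the chart's defaults, so adjust them to what the chart actually creates.

```bash
# Check that all replicas are running
kubectl get pods -n dakera

# Port-forward one pod (name and port are assumptions) and query cluster status
kubectl -n dakera port-forward dakera-0 3300:3300 &
curl -H "Authorization: Bearer $DAKERA_ROOT_API_KEY" \
  http://localhost:3300/admin/cluster/status
```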
Monitoring HA health
Key metrics to watch in a clustered deployment:
| Metric | Alert threshold | Meaning |
|---|---|---|
| dakera_gossip_members | < expected node count | A node has left the cluster — investigate immediately |
| dakera_replication_lag_ms | > 1000ms | Replicas falling behind — check network or disk I/O |
| dakera_shard_balance | > 0.3 skew | Uneven shard distribution — trigger manual rebalance |
| dakera_leader_lease_remaining_ms | < 5000ms | Leader lease about to expire — potential re-election |
| dakera_forwarded_requests | high ratio | Many requests hitting wrong node — check load balancer config |
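Each node exposes these series on its /metrics endpoint (see the production checklist below), so a quick spot check from the command line can be enough between Prometheus scrapes. This assumes /metrics is served on the API port and, as is typical for Prometheus endpoints, without auth.

```bash
# Spot-check gossip membership and replication lag on one node
curl -s http://localhost:3300/metrics \
  | grep -E 'dakera_gossip_members|dakera_replication_lag_ms'
```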
Performance characteristics
| Operation | Single node | Clustered (3 nodes) | Notes |
|---|---|---|---|
| Local read | <5ms | <5ms | Read hits owning node directly |
| Forwarded read | — | ~8-12ms | Request routed to correct shard owner |
| Write | <10ms | ~15-20ms | Primary ack + async replication overhead |
| Recall (hybrid) | ~30ms | ~35-45ms | Minimal overhead — search is shard-local |
Gossip tuning
For most deployments, the default gossip settings work well. Adjust them when the defaults conflict with your network layout:
| Variable | Default | When to adjust |
|---|---|---|
| DAKERA_GOSSIP_PORT | 7946 | Port conflict with another service |
| DAKERA_GOSSIP_BIND | 0.0.0.0:7946 | Multi-NIC hosts — bind to specific interface |
Disaster recovery
For cross-region disaster recovery:
- Backup to remote S3 — schedule daily backups to an S3 bucket in a different region via the admin API (see Deployment → Backup & Restore).
- Full cluster restore — provision new nodes, start with DAKERA_STORAGE=s3 pointing to the backup bucket, and upload the backup bundle via /admin/backups/restore (a sketch follows this list).
- RTO expectation — cluster bootstrap + restore: 5-15 minutes depending on data size.
- RPO expectation — equal to backup frequency (daily = up to 24h data loss). For lower RPO, increase backup frequency or use cross-region S3 replication on the primary bucket.
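Put together, a restore might look roughly like the sketch below. The S3 settings and the upload payload for /admin/backups/restore are placeholders; see Deployment → Backup & Restore for the exact backup bundle format.

```bash
# Disaster-recovery sketch; bucket settings, bundle name, and upload field are placeholders.
# 1. Provision a fresh node pointed at cold storage in the surviving region
docker run -d \
  --name dakera-restore \
  -e DAKERA_CLUSTER_MODE=true \
  -e DAKERA_NODE_ID=node-1 \
  -e DAKERA_STORAGE=s3 \
  -e DAKERA_ROOT_API_KEY=$DAKERA_ROOT_API_KEY \
  -p 3300:3300 \
  ghcr.io/dakera-ai/dakera:latest
  # ...plus your S3 endpoint/bucket settings from the Deployment docs

# 2. Upload the backup bundle (field name is illustrative; check the restore API docs)
curl -X POST -H "Authorization: Bearer $DAKERA_ROOT_API_KEY" \
  -F "bundle=@dakera-backup.tar.gz" \
  http://localhost:3300/admin/backups/restore
```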
Admin endpoints
| Endpoint | Description |
|---|---|
| GET /admin/cluster/status | Cluster health, leader node, shard distribution |
| GET /admin/cluster/nodes | All nodes with state (Alive/Suspect/Dead), roles, shard assignments |
| POST /admin/cluster/rebalance | Trigger manual shard rebalancing |
| POST /admin/maintenance/enable | Drain node for rolling upgrades |
| POST /admin/maintenance/disable | Rejoin node after maintenance |
| GET /admin/maintenance/status | Check drain progress |
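For example, if dakera_shard_balance reports skew, a rebalance can be triggered and then watched via the status endpoint:

```bash
# Trigger a manual shard rebalance, then watch the distribution settle
curl -X POST -H "Authorization: Bearer $DAKERA_ROOT_API_KEY" \
  http://localhost:3300/admin/cluster/rebalance
curl -H "Authorization: Bearer $DAKERA_ROOT_API_KEY" \
  http://localhost:3300/admin/cluster/status
```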
Production checklist
- Odd node counts — run 3 or 5 nodes for clean quorum math
- Dedicated S3/MinIO — do not co-locate cold storage on Dakera nodes
- External load balancer — place outside the Dakera node pool
- Monitor all nodes — scrape /metrics on every node; alert on gossip membership count and replication lag
- Firewall gossip port 7946 — restrict to cluster nodes only (TCP + UDP)
- Rolling upgrades — use maintenance mode to drain nodes one at a time
- Cross-region backups — replicate to a separate S3 bucket for disaster recovery
- Test failover — periodically stop a node and verify automatic recovery (see the drill below)
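A basic failover drill against the Compose cluster might look like the following; the container name matches the dakera-deploy example, and the port should be whichever surviving node (or the load balancer) exposes the API.

```bash
# Stop one node and confirm the cluster notices and keeps serving
docker stop dakera-2
curl -H "Authorization: Bearer $DAKERA_ROOT_API_KEY" \
  http://localhost:3100/admin/cluster/nodes   # node-2 should move to Suspect, then Dead

# Bring it back and confirm it rejoins and shards rebalance
docker start dakera-2
curl -H "Authorization: Bearer $DAKERA_ROOT_API_KEY" \
  http://localhost:3100/admin/cluster/status
```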