All configuration via environment variables. No config file required.
Server
Variable
Type
Default
Description
DAKERA_ROOT_API_KEY
string
—
Bootstrap root API key (super_admin scope). If unset, auth is disabled.
DAKERA_HOST
string
0.0.0.0
Bind address
DAKERA_PORT
u16
3300
REST API port
DAKERA_GRPC_PORT
u16
50051
gRPC port
RUST_LOG
string
info
Log level: trace/debug/info/warn/error
Storage
Variable
Default
Description
DAKERA_STORAGE
memory
Backend: memory, filesystem, s3, minio
DAKERA_STORAGE_PATH
./data
Data directory for filesystem storage
DAKERA_S3_BUCKET
—
S3 bucket name (3–63 chars)
DAKERA_S3_REGION
—
S3 region (e.g. us-east-1)
DAKERA_S3_ENDPOINT
—
S3-compatible endpoint URL (MinIO)
AWS_ACCESS_KEY_ID
—
AWS / MinIO access key
AWS_SECRET_ACCESS_KEY
—
AWS / MinIO secret key
Embeddings
Variable
Default
Description
DAKERA_MODEL
bge-large
ONNX embedding model loaded at server startup. Values: bge-large (1024d, server default), gte-modernbert-base (768d, 8192 tokens, MTEB 64.38 — v0.11.94+), modernbert-embed-base (768d, 8192 tokens, MRL-capable — v0.11.94+), bge-small (384d), e5-small (384d), minilm (384d). Changing this mid-deployment requires re-ingesting all stored vectors.
DAKERA_MRL_DIMENSION
—
Matryoshka dimension truncation (v0.11.94+). Set to 64, 128, 256, 512, or 768 to post-pool truncate and re-normalize ModernBERT embeddings. Unset = native dimension. Non-MRL models ignore this setting.
DAKERA_INFERENCE_DEVICE
cpu
cpu or cuda
Reranker
v0.11.95+ — Default cross-encoder upgraded from Xenova/bge-reranker-base (278M, English-centric) to onnx-community/bge-reranker-v2-m3-ONNX INT8 (568M, multilingual, 51.8 nDCG@10 BEIR). The model is downloaded from HuggingFace at startup and cached in $HF_HOME.
Variable
Default
Description
DAKERA_RERANKER_MODEL
bge-reranker-base
Cross-encoder reranker model (v0.11.95+). Default: bge-reranker-base (278M params, ~49 BEIR nDCG@10). Set to bge-reranker-v2-m3 (568M params, 51.8 BEIR nDCG@10, +17.7% quality) to upgrade. Production is byte-identical when unset — default remains bge-reranker-base. Targets Cat3/temporal recall improvements.
DAKERA_RERANKER_ONNX_FILE
—
Path to a custom reranker ONNX file (v0.11.95+). Use to A/B between INT8 and FP16 quantizations of the reranker model. Unset = bundled model file.
RERANKER_POOL_SIZE
4
Number of pre-allocated ONNX cross-encoder sessions in the reranker pool. Increase for higher concurrency at the cost of memory.
RERANKER_ONNX_BATCH_SIZE
16
Mini-batch size for cross-encoder ONNX inference. Each mini-batch is padded to its own max seq_len, reducing memory waste vs full-chunk padding.
RERANKER_MAX_CONCURRENT
6
Maximum concurrent reranker calls. Requests beyond this limit receive an immediate 503 Overloaded response rather than queuing indefinitely.
Enable entity-filtered second HNSW pass merged via RRF
DAKERA_ENTITY_LINK_BOOST
—
Entity-linked memory scoring boost (v0.11.95+). Set to 1 to enable. Adds a popularity-decayed additive boost to recall candidates sharing named entities with the query — improves Cat2 (multi-hop) recall. Production is byte-identical when unset.
DAKERA_HNSW_CACHE_MAX
50
Max HNSW indexes in LRU cache
DAKERA_FUSION_STRATEGY
—
Hybrid search fusion override (v0.11.94+). rrf or minmax. Unset = MinMax (default, byte-identical). Enables live A/B comparisons without code changes.
DAKERA_QUERY_DECOMP
false
Query decomposition (v0.11.93+). When enabled, complex queries are decomposed into sub-queries and fused via RRF, improving multi-step recall.
Recall scoring weights (CE-36, v0.11.91+)
The compound recall score formula is w_vec·relevance + w_imp·importance + w_rec·exp(−Δt/τ) + w_freq·frequency. Defaults are byte-identical to pre-CE-36 behaviour.
Variable
Default
Description
DAKERA_SCORE_W_VEC
adaptive
Vector relevance weight (0–1). Unset = adaptive per query type via smart_scoring_weights().
DAKERA_SCORE_W_IMP
adaptive
Importance weight (0–1). Unset = adaptive per query type.
DAKERA_SCORE_W_REC
0.13
Recency (temporal proximity) weight (0–1).
DAKERA_SCORE_RECENCY_TAU_HOURS
168
Decay constant τ in hours for recency term. Range: 1–87600 (1 h … 10 yr). Default: 7 days.
Auth, rate limiting & observability
Variable
Default
Description
RATE_LIMIT_RPS
100
Requests per second
RATE_LIMIT_BURST
50
Burst capacity
DAKERA_AUTH_ENABLED
true
Enable API key authentication
DAKERA_CORS_ORIGINS
*
Allowed CORS origins (comma-separated)
DAKERA_TRACING_ENABLED
false
Enable OpenTelemetry tracing
OTEL_EXPORTER_OTLP_ENDPOINT
http://localhost:4317
OTLP collector endpoint
DAKERA_AUDIT_LOG
false
Enable audit logging
Observability & monitoring
Dakera exports OpenTelemetry traces and Prometheus metrics. Configure OTEL to send traces to Jaeger, Grafana Tempo, or any OTLP-compatible collector: