System Design: In-Memory Key–Value Database for Ultra–Low Latency
Context
You are designing an in-memory, per-node key–value database optimized for ultra–low-latency reads and writes. It must support point lookups, prefix scans, TTLs with automatic expiration, optional transactions with snapshot isolation, and high availability with specified durability targets.
Functional Requirements
-
Commands:
-
SET(key, value[, ttl])
-
GET(key)
-
DELETE(key)
-
MGET(keys)
-
SCAN(prefix, limit, cursor)
-
Optional:
-
Transactions with snapshot isolation (MULTI/EXEC)
-
Atomic increments (e.g., INCRBY)
-
Pub/Sub notifications on key changes
Non-Functional Requirements
-
Per-node throughput: 50k ops/s
-
Latency: p99 GET < 2 ms; p99 SET < 5 ms
-
Availability: 99.99%
-
Durability targets:
-
RPO ≤ 1 s
-
RTO ≤ 60 s on crash/restart
Sub-Questions
(a) Choose core data structures (e.g., hash table for point lookups, radix/ART or skip list for prefix scans). Explain complexity, memory overhead, and cache behavior.
(b) Provide durability: write-ahead log (WAL) and periodic snapshots; fsync policy, log compaction, and exact crash-recovery steps.
(c) Scale out: sharding via consistent hashing, leader–follower replication, read replicas, client routing, and failover. State the consistency model and how to achieve it.
(d) Manage memory: allocator strategy, fragmentation control, TTL wheel/timer, and eviction (LRU/LFU) when a memory cap is reached.
(e) Concurrency model: single-threaded event loop vs. multi-threaded; locking, batching, and pipelining trade-offs.
(f) Operations: metrics/slowlog, backups, online config changes, and capacity planning for 100M keys (avg value 200 B) with 64 GB RAM per node.