Design a distributed key-value store
Company: LinkedIn
Role: Machine Learning Engineer
Category: System Design
Difficulty: Hard
Interview Round: Technical Screen
Design a distributed key–value storage service. Requirements: high availability across availability zones, horizontal scalability to billions of keys, low-latency reads and writes, per-key read-after-write consistency, durability, TTL support, conditional updates (compare-and-swap, CAS), and optional range scans. Define the API and data model. Choose a sharding strategy (consistent hashing vs. range partitioning) and a replication model (leader–follower vs. leaderless), then lay out the read and write paths, quorum choices, and conflict resolution. Select on-disk structures (e.g., LSM-tree vs. B+ tree) and describe compaction, indexing, and hot-key mitigation. Explain rebalancing, failure detection, recovery, backups, and disaster recovery. Include observability (metrics, tracing, alerts), and discuss CAP trade-offs and testing methodology.
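One way to make the "define the API and data model" part concrete is a single-node, in-memory sketch of the operations the prompt names: get, put with TTL, and CAS. Everything here is an assumption for illustration, not a prescribed answer: the names `KVNode`, `Record`, `put`, `get`, and `cas` are hypothetical, versions come from a per-node counter, and expiry is lazy (checked on access). Replication, durability, and range scans are out of scope.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Record:
    value: bytes
    version: int                 # unique per write on this node
    expires_at: Optional[float]  # absolute deadline in seconds; None means no TTL


class KVNode:
    """Hypothetical in-memory stand-in for one replica of one shard."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._data: Dict[str, Record] = {}
        self._clock = clock          # injectable so expiry can be tested
        self._next_version = 1       # never reused, which avoids ABA after expiry

    def _live(self, key: str) -> Optional[Record]:
        """Return the record if present and unexpired; drop it lazily otherwise."""
        rec = self._data.get(key)
        if rec is not None and rec.expires_at is not None:
            if rec.expires_at <= self._clock():
                del self._data[key]
                return None
        return rec

    def get(self, key: str) -> Optional[Tuple[bytes, int]]:
        """Return (value, version), or None if the key is absent or expired."""
        rec = self._live(key)
        return None if rec is None else (rec.value, rec.version)

    def put(self, key: str, value: bytes, ttl: Optional[float] = None) -> int:
        """Unconditional write; returns the new version."""
        version = self._next_version
        self._next_version += 1
        expires_at = None if ttl is None else self._clock() + ttl
        self._data[key] = Record(value, version, expires_at)
        return version

    def cas(self, key: str, expected_version: int, value: bytes,
            ttl: Optional[float] = None) -> bool:
        """Write only if the current version matches; 0 means "key must be absent"."""
        rec = self._live(key)
        current = 0 if rec is None else rec.version
        if current != expected_version:
            return False
        self.put(key, value, ttl)
        return True
```

In a real deployment the version would more likely be a replicated log index or a hybrid logical clock value, so that every replica of a shard agrees on it; the node-local counter above only shows the shape of the contract.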
Quick Answer: This question evaluates system-design competency in distributed storage and networking, covering sharding, replication, consistency models, storage engine choices, latency and availability trade-offs, and operational concerns like rebalancing and observability.
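Of the topics listed, sharding and quorum sizing are the easiest to illustrate in a few lines. The sketch below assumes the consistent-hashing option with virtual nodes and a leaderless-style replica list of N distinct nodes per key; `HashRing`, `replicas`, `quorums_overlap`, and the default of 64 virtual nodes are hypothetical names and values chosen for the example. MD5 is used only as a stable, well-distributed hash, not for security.

```python
import bisect
import hashlib
from typing import List


def _hash(s: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.md5(s.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes: List[str], vnodes: int = 64) -> None:
        # Each physical node owns `vnodes` points to smooth out load.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]
        self._node_count = len(set(nodes))

    def replicas(self, key: str, n: int = 3) -> List[str]:
        """Walk clockwise from the key's position, collecting n distinct nodes."""
        n = min(n, self._node_count)
        i = bisect.bisect_right(self._points, _hash(key))
        owners: List[str] = []
        while len(owners) < n:
            node = self._ring[i % len(self._ring)][1]
            if node not in owners:
                owners.append(node)
            i += 1
        return owners


def quorums_overlap(n: int, r: int, w: int) -> bool:
    """True when every read quorum intersects every write quorum (R + W > N)."""
    return r + w > n
```

With N = 3, choosing R = 2 and W = 2 satisfies the overlap condition, while R = 1 and W = 1 does not. Overlap alone is not the full story for read-after-write consistency: the reader still needs versions to pick the newest of the replies, and sloppy quorums or hinted handoff can weaken the guarantee.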