
Design a distributed key-value store

Last updated: Mar 29, 2026

Quick Overview

This question tests how you work through requirements, scale assumptions, API and data design, architecture, trade-offs, failure modes, and rollout in a realistic interview setting. A strong answer states its assumptions, handles edge cases, explains trade-offs, and shows how the design would be validated.

  • hard
  • LinkedIn
  • System Design
  • Machine Learning Engineer


Company: LinkedIn

Role: Machine Learning Engineer

Category: System Design

Difficulty: hard

Interview Round: Technical Screen

Design a distributed key–value storage service. Requirements: high availability across availability zones, horizontal scalability to billions of keys, low-latency reads/writes, per-key read-after-write consistency, durability, TTL support, conditional updates (CAS), and optional range scans. Define the API and data model. Choose a sharding strategy (consistent hashing vs range), replication model (leader–follower vs leaderless), and lay out read/write paths, quorum choices, and conflict resolution. Select on-disk structures (e.g., LSM vs B+Tree), compaction, indexing, and hot-key mitigation. Explain rebalancing, failure detection, recovery, backups, and disaster recovery. Include observability (metrics, tracing, alerts) and discuss CAP trade-offs and testing methodology.
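The "quorum choices" part of the prompt usually comes down to whether every read quorum is guaranteed to intersect every write quorum. The sketch below is only an illustration, with hypothetical function names: it checks that property by brute force for N replicas and compares it with the familiar rule R + W > N.

```python
from itertools import combinations


def quorums_overlap(n: int, r: int, w: int) -> bool:
    """True if every read quorum of size r shares a replica with every write quorum of size w."""
    replicas = range(n)
    return all(
        set(read_q) & set(write_q)
        for read_q in combinations(replicas, r)
        for write_q in combinations(replicas, w)
    )


def is_strict_quorum(n: int, r: int, w: int) -> bool:
    """The closed-form rule: read and write quorums must overlap when R + W > N."""
    return r + w > n


if __name__ == "__main__":
    for r, w in [(1, 1), (2, 2), (1, 3), (3, 1)]:
        print(f"N=3 R={r} W={w} overlap={quorums_overlap(3, r, w)}")
```

Overlapping quorums are necessary for a read to see the latest acknowledged write, but they are not sufficient on their own: the design still needs a way to pick the newest version among the replies (for example, per-key versions) and a story for concurrent writers.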



Jul 16, 2025, 12:00 AM

Design a Distributed Key–Value Store (Technical Screen)

Context

You're designing a cloud-native, multi-tenant key–value (KV) storage service for internal ML/analytics platforms. The service must support billions of keys with low-latency reads/writes and be highly available across availability zones (AZs). Some workloads require conditional updates (CAS) and per-key read-after-write consistency; others occasionally need range scans (e.g., prefix scans for model features).

Functional Requirements

  • API supports: Get, Put/Upsert, Delete, Batch, Compare-And-Set (CAS), TTL per key, and optional range scans.
  • Data model: binary values; metadata includes TTL, version, and optional attributes.
  • Per-key read-after-write consistency.
  • Optional range scans (prefix or ordered key ranges).
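One way to make the functional requirements concrete is a single-node, in-memory sketch of the API surface. Everything here is an assumption made for illustration: the names `KVStore`, `Record`, and `VersionMismatch`, the convention that `expected_version=0` means "key must be absent", and the choice of lazy TTL expiry on read. Batch operations, range scans, replication, and persistence are left out.

```python
import itertools
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class Record:
    value: bytes
    version: int                 # store-assigned and never reused
    expires_at: Optional[float]  # absolute clock reading; None means no TTL


class VersionMismatch(Exception):
    """CAS precondition failed: the live version differs from the expected one."""


class KVStore:
    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[bytes, Record] = {}
        self._versions = itertools.count(1)

    def get(self, key: bytes) -> Optional[Record]:
        rec = self._data.get(key)
        if rec is not None and rec.expires_at is not None and rec.expires_at <= self._clock():
            del self._data[key]  # lazy expiry: drop the record when a read finds it stale
            return None
        return rec

    def put(self, key: bytes, value: bytes, ttl_s: Optional[float] = None) -> int:
        expires_at = None if ttl_s is None else self._clock() + ttl_s
        rec = Record(value, next(self._versions), expires_at)
        self._data[key] = rec
        return rec.version

    def cas(self, key: bytes, expected_version: int, value: bytes,
            ttl_s: Optional[float] = None) -> int:
        """Write only if the live version equals expected_version (0 = key must be absent)."""
        current = self.get(key)
        current_version = current.version if current else 0
        if current_version != expected_version:
            raise VersionMismatch(f"expected {expected_version}, found {current_version}")
        return self.put(key, value, ttl_s)

    def delete(self, key: bytes) -> bool:
        existed = self.get(key) is not None
        self._data.pop(key, None)
        return existed
```

Drawing versions from a store-wide counter, rather than restarting at 1 when a key is recreated, keeps a stale CAS from succeeding after a delete or TTL expiry followed by a fresh write. In a distributed design the version would have to come from whichever replica orders writes for that key.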

Non-Functional Requirements

  • High availability across AZs.
  • Horizontal scalability to billions of keys.
  • Low latency: p99 reads and writes within tight SLOs (assume single-digit ms within an AZ on a cache hit; low tens of ms when the request reaches stable storage).
  • Durability.
  • Hot-key mitigation.
  • Observability (metrics, tracing, alerts).

Design Tasks

  1. Define the API and data model, including error semantics and consistency options.
  2. Choose a sharding strategy: consistent hashing vs. range partitioning; justify how to support range scans.
  3. Choose a replication model: leader–follower vs. leaderless. Define read/write paths, quorum choices, and conflict resolution.
  4. Select on-disk structures (e.g., LSM vs. B+Tree), compaction strategy, indexing, TTL handling, and caching.
  5. Explain hot-key mitigation strategies.
  6. Explain rebalancing, failure detection, recovery, backups, and disaster recovery.
  7. Provide an observability plan (metrics, tracing, alerting).
  8. Discuss CAP trade-offs and tunable consistency.
  9. Outline testing methodology for correctness, performance, and resilience.
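For the sharding task, a candidate who picks consistent hashing may want to show how keys map to replicas. The following is a minimal sketch under stated assumptions: the `HashRing` class, the 64 virtual nodes per physical node, and the SHA-256-based hash are illustrative choices, not part of the prompt.

```python
import bisect
import hashlib
from typing import Iterable, List


def _hash(s: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(s.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes: Iterable[str], vnodes: int = 64) -> None:
        # Each physical node owns `vnodes` points, which smooths the load split.
        self._points = sorted((_hash(f"{n}#{i}"), n) for n in nodes for i in range(vnodes))
        self._hashes = [h for h, _ in self._points]

    def replicas(self, key: str, n: int = 3) -> List[str]:
        """Return the first n distinct nodes clockwise from the key's position."""
        start = bisect.bisect_right(self._hashes, _hash(key))
        chosen: List[str] = []
        for i in range(len(self._points)):
            node = self._points[(start + i) % len(self._points)][1]
            if node not in chosen:
                chosen.append(node)
                if len(chosen) == n:
                    break
        return chosen


if __name__ == "__main__":
    ring = HashRing(["node-a", "node-b", "node-c", "node-d"])
    print(ring.replicas("user:42"))
```

Adding a node only moves the keys that fall just before its new ring points, which is what keeps rebalancing incremental. The cost is that hashing scatters adjacent keys, so ordered range scans need either range partitioning or a scatter-gather across all shards; a real placement policy would also have to spread the chosen replicas across AZs.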

Constraints & Assumptions

  • Preserve the scope, facts, inputs, and requested outputs from the prompt above.
  • If the prompt leaves a detail unspecified, state a reasonable assumption before relying on it.
  • Keep the answer interview-ready: concise enough to present, but concrete enough to implement or evaluate.

Clarifying Questions to Ask

  • Clarify users, core use cases, read/write patterns, scale, latency, availability, and data retention.
  • State explicit assumptions before making sizing or architecture decisions.
  • Prioritize the functional path first, then address reliability, security, observability, and rollout.

What a Strong Answer Covers

  • A scoped requirements summary with concrete non-goals and success metrics.
  • API, data model, architecture, consistency, capacity, and operations.
  • Reasoned trade-offs between simple and scalable designs, including bottlenecks and failure modes.
  • A validation, monitoring, migration, and launch plan appropriate for the risk level.

Follow-up Questions

  • What breaks first at 10x traffic or data volume?
  • How would you degrade gracefully during dependency failures?
  • What metrics and alerts would prove the design is healthy after launch?
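Since the latency requirement is stated as a p99 target, an answer to the metrics question can show how such a check would be evaluated. This is a rough sketch using the nearest-rank percentile definition; the function names and the sample values are hypothetical, and a production system would more likely aggregate histograms than sort raw samples.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def breaches_slo(latencies_ms: Sequence[float], slo_ms: float, p: float = 99.0) -> bool:
    """True when the p-th percentile latency exceeds the SLO threshold."""
    return percentile(latencies_ms, p) > slo_ms


if __name__ == "__main__":
    window = [2.0] * 98 + [40.0, 45.0]  # made-up latencies for one alerting window
    print(percentile(window, 99), breaches_slo(window, slo_ms=10.0))
```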
