Distributed Systems Fundamentals
Asked of: Software Engineer
What's being tested
Distributed-systems coding interviews usually reduce to deterministic data-structure design under failures: partitioning, replication, ordering, retries, and load limits. The interviewer wants clean APIs, correct edge-case handling, and complexity reasoning, not broad architecture hand-waving.
Patterns & templates
- Consistent hashing with sorted virtual nodes: use bisect_left over ring positions; O(log V) lookup, O(V) storage, where V is the total number of virtual nodes.
- Quorum reasoning: require R + W > N so that every read quorum overlaps the most recent write quorum (read-after-write visibility); handle ties, stale replicas, and unavailable nodes.
- Idempotency-key pattern: store request_id -> result/status; retries return the cached outcome, preventing double execution after an ambiguous timeout.
- Token bucket / leaky bucket rate limiting: track tokens and last_refill_ts; allow() is O(1), but elapsed time must come from a monotonic clock.
- Logical clocks: use Lamport timestamps (counter, node_id) for a total order with deterministic tie-breaking; vector clocks detect concurrency at O(nodes) space per timestamp.
- Leader election / heartbeat simulation: model timeout, term, and last_seen; avoid assuming synchronized clocks or instant failure detection.
- Replication log merge: compare (term, index) or (timestamp, node_id); define deterministic conflict resolution before writing code.
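The consistent hashing pattern from the list above can be sketched roughly as follows. This is a minimal illustration, not a production ring: the class name HashRing, the helper _hash, the "node#i" virtual-node naming, and the default of 100 virtual nodes are all assumptions made for the example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a stable 64-bit ring position (MD5 is used only for spread)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


class HashRing:
    """Hypothetical ring: sorted virtual-node positions plus a position -> node map."""

    def __init__(self, vnodes: int = 100):
        self.vnodes = vnodes
        self._positions = []  # sorted list of ring positions
        self._owner = {}      # ring position -> physical node name

    def add_node(self, node: str) -> None:
        for i in range(self.vnodes):
            pos = _hash(f"{node}#{i}")
            if pos in self._owner:
                continue  # position already taken; keep the existing owner
            bisect.insort(self._positions, pos)
            self._owner[pos] = node

    def remove_node(self, node: str) -> None:
        self._positions = [p for p in self._positions if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def get_node(self, key: str) -> str:
        if not self._positions:
            raise LookupError("ring is empty")
        # First virtual node at or after the key's position, wrapping to index 0.
        idx = bisect.bisect_left(self._positions, _hash(key))
        if idx == len(self._positions):
            idx = 0
        return self._owner[self._positions[idx]]
```

Lookup is a binary search over V positions. Removing a node only reassigns the keys that node owned, which is the property interviewers usually ask you to demonstrate.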
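For the quorum item above, a small sketch can make the overlap argument concrete. The function names quorums_overlap and quorum_read, and the (version, value) reply shape with None for an unavailable replica, are assumptions for illustration. The sketch also assumes each version number maps to exactly one value, so it ignores concurrent writers.

```python
def quorums_overlap(n: int, r: int, w: int) -> bool:
    """True when any R replicas must share at least one replica with any W replicas."""
    return r + w > n


def quorum_read(replies, r: int):
    """Return the newest (version, value) among the replicas that answered.

    `replies` holds one entry per contacted replica; None marks an unavailable one.
    """
    answered = [rep for rep in replies if rep is not None]
    if len(answered) < r:
        raise RuntimeError("read quorum not reached")
    # Stale replicas are outvoted by version number, not by majority of values.
    return max(answered, key=lambda rep: rep[0])
```

With N=3, W=2, R=2, any two replicas include at least one that saw the latest write, so the read returns it. With R=1 the overlap check fails and a stale value can be returned.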
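The token bucket item above might look like the following sketch. The class name TokenBucket and the injectable clock parameter are assumptions added so the limiter can be tested with a fake clock; tokens and last_refill_ts follow the names used in the list.

```python
import time


class TokenBucket:
    """Hypothetical limiter: refills `rate` tokens per second up to `capacity`."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity          # start full
        self._clock = clock             # monotonic by default, never wall-clock time
        self.last_refill_ts = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self._clock()
        # Guard against a misbehaving injected clock going backwards.
        elapsed = max(0.0, now - self.last_refill_ts)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last_refill_ts = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Refilling lazily inside allow() keeps each call O(1) and avoids a background timer. Capping at capacity bounds the burst size after a long idle period.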
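The logical clocks item above can be illustrated with a short sketch. LamportClock and compare_vector_clocks are hypothetical names, and vector clocks are represented here as plain dicts from node id to counter, with missing entries treated as 0.

```python
class LamportClock:
    """Hypothetical Lamport clock; timestamps are (counter, node_id) tuples."""

    def __init__(self, node_id: str):
        self.node_id = node_id
        self.counter = 0

    def tick(self):
        """Local event or send: advance the counter and stamp the event."""
        self.counter += 1
        return (self.counter, self.node_id)

    def receive(self, remote_counter: int):
        """On receive: jump past the sender's counter, then stamp the event."""
        self.counter = max(self.counter, remote_counter) + 1
        return (self.counter, self.node_id)


def compare_vector_clocks(a: dict, b: dict) -> str:
    """Classify a relative to b: 'before', 'after', 'equal', or 'concurrent'."""
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"
```

Python compares tuples lexicographically, so (counter, node_id) pairs sort by counter first and fall back to node_id on ties. Lamport order is total but cannot tell causally related events from concurrent ones; the vector clock comparison can.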
Common pitfalls
- Pitfall: Treating network timeout as failure certainty; in distributed systems, timeout means “unknown,” so retries must be safe.
- Pitfall: Forgetting deterministic tie-breakers, causing different nodes to choose different winners for the same conflict.
- Pitfall: Explaining CAP/Paxos abstractly instead of implementing the requested state transitions, invariants, and edge cases.
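The first pitfall can be made concrete with a small simulation in which the server performs the work but the response is lost. All names here (IdempotentExecutor, Account, call_with_retries, the drops list) are hypothetical. The sketch is single-threaded, so it omits the in-progress status a real store would need for concurrent retries.

```python
class IdempotentExecutor:
    """Hypothetical server side: remembers the result for each request_id."""

    def __init__(self):
        self._results = {}  # request_id -> cached result

    def execute(self, request_id, operation):
        if request_id in self._results:
            return self._results[request_id]  # retry: do not run the operation again
        result = operation()
        self._results[request_id] = result
        return result


class Account:
    def __init__(self, balance: int):
        self.balance = balance

    def debit(self, amount: int) -> int:
        self.balance -= amount
        return self.balance


def call_with_retries(server, request_id, operation, drops):
    """Each entry in `drops` says whether that attempt's response is lost."""
    for drop_response in drops:
        result = server.execute(request_id, operation)  # the server always runs
        if drop_response:
            continue  # client sees a timeout: outcome unknown, so retry
        return result
    raise TimeoutError("all attempts timed out; outcome still unknown")
```

Usage: with account = Account(100), call_with_retries(IdempotentExecutor(), "req-1", lambda: account.debit(30), drops=[True, True, False]) returns 70. The account is debited once even though the server received the request three times.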
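For the tie-breaker pitfall, one way to keep replicas in agreement is a last-writer-wins merge keyed on (timestamp, node_id). The Versioned record, pick_winner, and merge below are illustrative names, and the sketch assumes each (timestamp, node_id) pair identifies exactly one write.

```python
from typing import NamedTuple


class Versioned(NamedTuple):
    timestamp: int
    node_id: str
    value: str


def pick_winner(a: Versioned, b: Versioned) -> Versioned:
    """Higher timestamp wins; equal timestamps fall back to node_id."""
    return a if (a.timestamp, a.node_id) >= (b.timestamp, b.node_id) else b


def merge(local: dict, remote: dict) -> dict:
    """Merge two replica states key by key without mutating either input."""
    merged = dict(local)
    for key, candidate in remote.items():
        current = merged.get(key)
        merged[key] = candidate if current is None else pick_winner(current, candidate)
    return merged
```

Because the comparison key is a total order, merge(x, y) and merge(y, x) produce the same state, so two nodes exchanging updates in either direction converge on the same winner.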
Related concepts
- Concurrency And Multithreading Fundamentals (Coding & Algorithms)
- Distributed Systems Correctness And Idempotency (System Design)
- Distributed Storage, Replication, and Consistency (System Design)
- Distributed Systems Consistency And Low-Latency Design (System Design)
- Distributed Systems Consistency, Reliability, And Observability (System Design)
- Scalable Distributed System Architecture (System Design)