System Design Interview: End-to-End Architecture Deep Dive
Task
Explain the end-to-end architecture of a production system you built or can credibly design. Use a concrete example (e.g., real-time personalized feed, event ingestion pipeline, payments, notifications). Cover the full request and data lifecycle: clients, APIs, services, storage, async infrastructure, and observability.
For Each Module, Discuss
- Purpose and responsibilities.
- How to speed up the service (latency, throughput, resource efficiency).
- Expected QPS/EPS with back-of-the-envelope estimates and assumptions.
- Data model, partitioning/sharding, and cache strategy.
- Failure modes, backpressure, and fallback behavior.
- If Kafka (or a similar log) is involved:
  - Producer, broker, and consumer configuration.
  - Delivery guarantees (at-most-once, at-least-once, exactly-once) and how they are achieved.
  - Idempotency, retries, reprocessing, DLQs, and schema evolution.
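
To make the delivery-guarantee and idempotency items concrete, here is a minimal sketch of at-least-once delivery made effectively-once by deduplicating on an event ID, with bounded retries and a dead-letter queue. It uses no Kafka client. The names `Event`, `IdempotentConsumer`, and `max_retries` are hypothetical, and the in-memory set and list stand in for a durable dedup store and a dead-letter topic.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass(frozen=True)
class Event:
    event_id: str   # hypothetical unique key assigned by the producer
    payload: dict


@dataclass
class IdempotentConsumer:
    handler: Callable[[Event], None]
    max_retries: int = 3
    seen: Set[str] = field(default_factory=set)     # stand-in for a durable dedup store
    dlq: List[Event] = field(default_factory=list)  # stand-in for a dead-letter topic

    def process(self, event: Event) -> str:
        # Redelivered events (at-least-once delivery) are skipped, not re-applied.
        if event.event_id in self.seen:
            return "duplicate"
        for _ in range(self.max_retries):
            try:
                self.handler(event)
            except Exception:
                continue  # a real consumer would back off before retrying
            self.seen.add(event.event_id)  # mark done only after the handler succeeds
            return "processed"
        # Retries exhausted: park the event so later events are not blocked.
        self.dlq.append(event)
        return "dead-lettered"
```

In a real system the dedup record and the side effect would need to be committed atomically (or the side effect itself made idempotent); this sketch only shows the control flow an interviewer would expect you to describe.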
Constraints to State
- Latency SLOs (e.g., p95 under 100 ms for reads; an explicit p99 target for critical paths).
- Traffic assumptions (DAU/MAU, sessions/day, requests/session, peak factor).
- Data retention and compliance needs.
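
The traffic assumptions above translate into QPS with simple arithmetic. The sketch below shows one way to do it; the function name `estimate_qps` and every input value are illustrative assumptions, not numbers from any real system.

```python
from typing import Tuple


def estimate_qps(dau: int, sessions_per_day: float,
                 requests_per_session: float, peak_factor: float) -> Tuple[float, float]:
    """Return (average_qps, peak_qps) from daily traffic assumptions."""
    seconds_per_day = 86_400
    daily_requests = dau * sessions_per_day * requests_per_session
    avg_qps = daily_requests / seconds_per_day
    return avg_qps, avg_qps * peak_factor


# Illustrative inputs only: 10M DAU, 5 sessions/day, 20 requests/session, 3x peak.
avg, peak = estimate_qps(10_000_000, 5, 20, 3.0)
print(f"average ~{avg:,.0f} QPS, peak ~{peak:,.0f} QPS")
```

With these inputs the daily volume is 10^9 requests, which works out to roughly 11.6k QPS on average and about 34.7k QPS at peak.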
Deliverables
- High-level architecture diagram (describe in words if you can’t draw).
- Module-by-module walkthrough with the points above.
- Capacity planning math (QPS, partitions, cache sizes) and key configuration choices.
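
As a rough illustration of the capacity planning math, the sketch below sizes a partition count and a cache from stated assumptions. The helper names, the 2x headroom, the 1.3x memory overhead, and all input numbers are hypothetical; per-partition throughput in particular should come from a measurement of your own consumers.

```python
import math


def partitions_needed(peak_eps: float, per_partition_eps: float,
                      headroom: float = 2.0) -> int:
    """Partitions required to absorb peak events/sec with a safety margin."""
    return math.ceil(peak_eps * headroom / per_partition_eps)


def cache_size_gb(hot_keys: int, bytes_per_entry: int,
                  overhead: float = 1.3) -> float:
    """Approximate cache memory in GB, including an assumed per-entry overhead."""
    return hot_keys * bytes_per_entry * overhead / 1e9


# Illustrative inputs only: 35k events/sec at peak, 5k events/sec per partition,
# 2M hot keys at about 2 KB each.
partitions = partitions_needed(35_000, 5_000)
cache_gb = cache_size_gb(2_000_000, 2_000)
print(f"{partitions} partitions, ~{cache_gb:.1f} GB cache")
```

Under these assumptions the result is 14 partitions and about 5.2 GB of cache. Stating the headroom and overhead factors explicitly lets the interviewer challenge each one separately.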