Design two systems. You can assume a large-scale production environment; focus on clear APIs, data models, scaling, reliability, and trade-offs.
Part A) Design an asynchronous Job/Task system (service-oriented)
Design a service that lets clients submit background jobs and later query results.
Requirements
- Clients can create jobs and poll/subscribe for status.
- Jobs move through well-defined states (e.g., pending/running/succeeded/failed/canceled).
- Support retries with exponential backoff.
- Ensure idempotency (no duplicate execution for the same logical request).
- Choose storage (RDBMS vs NoSQL) and justify.
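As a starting point for the state and retry requirements above, here is a minimal sketch of a job state machine with exponential backoff. The transition table, the names `JobState`, `can_transition`, and `backoff_delay`, and the choice of full jitter are assumptions made for illustration; the prompt itself does not prescribe them.

```python
import random
from enum import Enum


class JobState(Enum):
    PENDING = "pending"
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"
    CANCELED = "canceled"


# Assumed transition table: FAILED -> PENDING models a scheduled retry,
# while SUCCEEDED and CANCELED are treated as terminal states.
ALLOWED = {
    JobState.PENDING: {JobState.RUNNING, JobState.CANCELED},
    JobState.RUNNING: {JobState.SUCCEEDED, JobState.FAILED, JobState.CANCELED},
    JobState.FAILED: {JobState.PENDING},
    JobState.SUCCEEDED: set(),
    JobState.CANCELED: set(),
}


def can_transition(src, dst):
    """Return True if a job may move from state src to state dst."""
    return dst in ALLOWED[src]


def backoff_delay(attempt, base=1.0, cap=60.0, rng=random.random):
    """Delay in seconds before retry number `attempt` (0-based).

    The ceiling doubles per attempt and is capped at `cap`; the result is
    drawn uniformly below that ceiling (full jitter) so that retries from
    many jobs do not synchronize.
    """
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling
```

A scheduler could reject any update for which `can_transition` is false, and move a job to a dead-letter queue once `attempt` exceeds a configured maximum.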
Discuss explicitly
- API design and state transitions
- Worker model / scheduling
- Retry + backoff, dead-letter handling
- Idempotency strategy
- Observability (metrics/logging/tracing)
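One possible idempotency strategy is a client-supplied idempotency key enforced by a unique constraint, which is also one argument for an RDBMS. The sketch below uses an in-memory SQLite database purely for illustration; the table layout and the helper names `open_store` and `submit_job` are hypothetical, not part of the prompt.

```python
import sqlite3
import uuid


def open_store():
    """Create an in-memory job table with a unique idempotency key."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobs ("
        " job_id TEXT PRIMARY KEY,"
        " idempotency_key TEXT NOT NULL UNIQUE,"
        " payload TEXT NOT NULL,"
        " state TEXT NOT NULL DEFAULT 'pending')"
    )
    return conn


def submit_job(conn, idempotency_key, payload):
    """Insert a job unless the key already exists; return the job id.

    A repeated submission with the same key is ignored by the UNIQUE
    constraint, so the caller gets back the id of the original job.
    """
    candidate_id = str(uuid.uuid4())
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO jobs (job_id, idempotency_key, payload)"
            " VALUES (?, ?, ?)",
            (candidate_id, idempotency_key, payload),
        )
    row = conn.execute(
        "SELECT job_id FROM jobs WHERE idempotency_key = ?",
        (idempotency_key,),
    ).fetchone()
    return row[0]
```

This only deduplicates job creation. Preventing duplicate execution would additionally need something like a lease or a conditional state update when a worker claims a job.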
Part B) Design a high-throughput cache system (Redis-like / in-memory layer)
Design a caching layer in front of a database to reduce latency and increase throughput.
Requirements
- Very high QPS, low latency reads, concurrent reads/writes.
- Cache key design, eviction (LRU/LFU/TTL).
- Multi-node scaling and sharding.
- Cache consistency and invalidation strategy; explain trade-offs vs DB consistency.
- Handle hot keys.
- High availability and performance under failures.
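To make the eviction requirement concrete, the following is a minimal single-node sketch that combines LRU eviction with a per-entry TTL. The class name `LruTtlCache`, the single coarse lock, and the injectable `clock` parameter are assumptions chosen for brevity. A production cache would more likely shard its locks or use a lock-free structure to sustain very high QPS.

```python
import threading
import time
from collections import OrderedDict


class LruTtlCache:
    """LRU cache with a fixed TTL per entry, guarded by one lock."""

    def __init__(self, capacity, ttl_seconds, clock=time.monotonic):
        self._capacity = capacity
        self._ttl = ttl_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._items = OrderedDict()  # key -> (value, expires_at)

    def get(self, key):
        """Return the cached value, or None on a miss or expired entry."""
        with self._lock:
            entry = self._items.get(key)
            if entry is None:
                return None
            value, expires_at = entry
            if self._clock() >= expires_at:
                del self._items[key]  # lazy expiry on read
                return None
            self._items.move_to_end(key)  # mark as most recently used
            return value

    def put(self, key, value):
        """Store a value and evict least recently used entries if full."""
        with self._lock:
            self._items[key] = (value, self._clock() + self._ttl)
            self._items.move_to_end(key)
            while len(self._items) > self._capacity:
                self._items.popitem(last=False)  # drop the LRU entry
```

Expiry here is lazy: an expired entry is removed only when it is read or evicted, which keeps reads cheap but lets stale entries occupy capacity until then.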
Discuss explicitly
- Consistency model and write/read paths
- Invalidation strategies and failure modes
- Sharding/replication and rebalancing
- Hot key mitigation and rate limiting
- Capacity planning and SLOs
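For the sharding and rebalancing discussion, consistent hashing with virtual nodes is one common option among several (fixed hash slots are another). The sketch below is illustrative only; the `HashRing` class, the SHA-256 based hash, and the default of 100 virtual nodes per server are assumptions.

```python
import bisect
import hashlib


def _hash(value):
    """Map a string to a 64-bit integer position on the ring."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=100):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (position, node)
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        """Place `vnodes` points for this node on the ring."""
        for i in range(self._vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove_node(self, node):
        """Remove every point that belongs to this node."""
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def node_for(self, key):
        """Return the node owning the first point at or after the key."""
        if not self._ring:
            raise ValueError("ring has no nodes")
        # ("" sorts before any node name, so this finds position >= hash.)
        idx = bisect.bisect_left(self._ring, (_hash(key), ""))
        return self._ring[idx % len(self._ring)][1]
```

When a node is removed, only the keys it owned move to other nodes; keys owned by the surviving nodes keep their placement, which limits cache misses during rebalancing.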