Design a Real-Time Suggestions Service
System Design: Real-Time Typeahead/Autocomplete
Context
Design a real-time typeahead/autocomplete service for a consumer-facing web and mobile application. Users see suggestion updates on each keystroke. Assume global traffic, multiple locales, and both anonymous and signed-in users.
Requirements
Design the system and cover:
- API design and request/response schemas
- Data modeling and indexing (e.g., prefix trees, FST, inverted indexes)
- Ranking signals and personalization
- Latency targets and budgets (e.g., P95 < 100 ms)
- Caching layers (edge, in-memory, distributed)
- Scalability and capacity planning
- Freshness and backfill pipelines (batch + streaming)
- A/B experimentation hooks and telemetry
- Failure handling and graceful degradation
State assumptions where needed and justify trade-offs.
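To make the indexing item concrete, here is a minimal sketch of a prefix tree that returns the top-k completions for a prefix. It is an illustration under assumptions, not a reference answer: the names `Node`, `SuggestionTrie`, `insert`, and `top_k` are hypothetical, the scores stand in for whatever popularity or ranking signal the design chooses, and the subtree walk at query time is deliberately naive. A production design would more likely precompute top-k lists per node or use a compact structure such as an FST.

```python
import heapq


class Node:
    """One trie node; score is set only where a complete suggestion ends."""
    __slots__ = ("children", "score")

    def __init__(self):
        self.children = {}
        self.score = None


class SuggestionTrie:
    """Hypothetical in-memory prefix index (illustrative names and scores)."""

    def __init__(self):
        self.root = Node()

    def insert(self, phrase, score):
        # Walk or create one node per character, then mark the end of the phrase.
        node = self.root
        for ch in phrase:
            node = node.children.setdefault(ch, Node())
        node.score = score

    def top_k(self, prefix, k=5):
        # Descend to the node for the prefix; no node means no suggestions.
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        # Naive step: collect every completion under the prefix, then keep the best k.
        found = []
        stack = [(node, prefix)]
        while stack:
            cur, text = stack.pop()
            if cur.score is not None:
                found.append((cur.score, text))
            for ch, child in cur.children.items():
                stack.append((child, text + ch))
        return [text for _, text in heapq.nlargest(k, found)]


trie = SuggestionTrie()
for phrase, score in [("new york", 90), ("news", 120), ("netflix", 200), ("nba", 150)]:
    trie.insert(phrase, score)

print(trie.top_k("ne", k=2))
```

The query cost here grows with the size of the subtree under the prefix, which is the main reason short prefixes are usually served from precomputed or cached results.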
Constraints & Assumptions
- Preserve the scope, facts, inputs, and requested outputs from the prompt above.
- If the prompt leaves a detail unspecified, state a reasonable assumption before relying on it.
- Keep the answer interview-ready: concise enough to present, but concrete enough to implement or evaluate.
Clarifying Questions to Ask
- Clarify users, core use cases, read/write patterns, scale, latency, availability, and data retention.
- State explicit assumptions before making sizing or architecture decisions.
- Prioritize the functional path first, then address reliability, security, observability, and rollout.
What a Strong Answer Covers
- A scoped requirements summary with concrete non-goals and success metrics.
- Coverage of the API, data model, architecture, consistency, capacity, and operations.
- Reasoned trade-offs between simple and scalable designs, including bottlenecks and failure modes.
- A validation, monitoring, migration, and launch plan appropriate for the risk level.
Follow-up Questions
- What breaks first at 10x traffic or data volume?
- How would you degrade gracefully during dependency failures?
- What metrics and alerts would prove the design is healthy after launch?
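For the graceful-degradation question, one possible shape is to put a deadline on the personalized ranking path and fall back to precomputed popular completions when the ranker is slow or fails. The sketch below is only an illustration under assumptions: `suggest_with_fallback`, `POPULAR`, the ranker callables, and the 50 ms default budget are all hypothetical and not part of the prompt.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

# Hypothetical fallback data: precomputed popular completions per prefix.
POPULAR = {"ne": ["netflix", "news", "new york"]}

_pool = ThreadPoolExecutor(max_workers=4)


def suggest_with_fallback(prefix, ranker, budget_s=0.05):
    """Return (suggestions, source), degrading to popular results on timeout or error."""
    future = _pool.submit(ranker, prefix)
    try:
        return future.result(timeout=budget_s), "personalized"
    except FutureTimeout:
        # The ranker blew the latency budget; serve the cheap answer instead.
        # cancel() only helps if the task has not started running yet.
        future.cancel()
        return POPULAR.get(prefix, []), "fallback"
    except Exception:
        # The ranker itself failed; suggestions are best-effort, so do not surface the error.
        return POPULAR.get(prefix, []), "fallback"


def fast_ranker(prefix):
    # Stand-in for a healthy personalized ranker.
    return [prefix + "ws"]


def slow_ranker(prefix):
    # Stand-in for a dependency that exceeds the budget.
    time.sleep(0.3)
    return ["too late"]


def broken_ranker(prefix):
    # Stand-in for a dependency that raises.
    raise RuntimeError("ranker unavailable")


print(suggest_with_fallback("ne", fast_ranker, budget_s=1.0))
print(suggest_with_fallback("ne", slow_ranker))
print(suggest_with_fallback("ne", broken_ranker))
```

Returning the `source` label alongside the results gives telemetry a way to count how often the fallback path is served, which ties into the metrics and alerts question.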