Scenario
Design an end-to-end search system for a consumer product (e.g., an e-commerce marketplace or content platform) where users type queries and expect relevant, personalized results.
Requirements
Functional
- Given a query string, return a ranked list of results (items/documents/videos/etc.).
- Support:
  - Keyword matching (lexical search).
  - Semantic matching (synonyms/paraphrases via embeddings).
  - Filters/sorts (category, price range, recency, etc.).
  - Autocomplete/suggestions (optional but preferred).
- Handle frequent content updates (new/edited documents should become searchable quickly).
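
One possible way to combine the keyword and semantic matching requirements above is to run both retrievers and fuse their ranked lists. The sketch below uses reciprocal rank fusion (RRF), which is one option among several and is not prescribed by this prompt. The function name, the document IDs, and the constant k=60 are illustrative assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc IDs into a single ranked list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical outputs of a lexical retriever and an embedding retriever.
lexical = ["d1", "d2", "d3"]
semantic = ["d3", "d1", "d4"]
fused = reciprocal_rank_fusion([lexical, semantic])
print(fused)
```

Documents that appear in both lists (here `d1` and `d3`) rise above documents found by only one retriever. RRF needs only ranks, so the lexical and embedding scores do not have to be calibrated against each other.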
Non-functional (assume reasonable targets)
- Latency: p95 < 200 ms for the online request path.
- Scale: 10k QPS peak; corpus size ~100M documents.
- Reliability: graceful degradation if ML components fail.
- Observability: logging for debugging, offline training, and online evaluation.
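
The reliability requirement can be illustrated with a minimal sketch: call the ML ranker under a time budget and fall back to the retrieval order if it is slow or raises. Everything named here (`rank_with_fallback`, the 50 ms budget, the thread-pool size, the source labels) is a hypothetical choice for illustration, not part of the stated requirements.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

# Shared pool for ranker calls; the size is an arbitrary assumption.
_POOL = ThreadPoolExecutor(max_workers=4)


def rank_with_fallback(candidates, ml_ranker, budget_s=0.05):
    """Return (results, source), where source records which path served.

    candidates: list already ordered by the retrieval stage.
    ml_ranker:  callable taking the candidates and returning a reordered list.
    budget_s:   time budget for the ranker in seconds (assumed value).
    """
    future = _POOL.submit(ml_ranker, candidates)
    try:
        return future.result(timeout=budget_s), "ml"
    except FutureTimeout:
        # Best effort only: a call that is already running is not interrupted.
        future.cancel()
        return candidates, "fallback_timeout"
    except Exception:
        # Any ranker failure degrades to the retrieval order.
        return candidates, "fallback_error"


results, source = rank_with_fallback(["a", "b", "c"], lambda c: c[::-1], budget_s=1.0)
print(results, source)
```

Returning the source label alongside the results lets the logging layer count how often the degraded path serves traffic, which ties the reliability and observability requirements together.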
What to cover
- High-level architecture (offline pipelines + online serving).
- Retrieval (candidate generation) and ranking.
- Indexing strategy and freshness.
- Model training data, labels, evaluation metrics.
- Personalization and cold-start.
- A/B testing and launch plan.
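
For the evaluation-metrics item, one commonly used offline ranking metric is NDCG@k. The prompt does not name a metric, so this is only an example of what an answer might compute. The sketch assumes graded relevance labels (0 = irrelevant, higher = better) and the exponential gain form. As a simplification, it builds the ideal ordering from the labels of the returned list only.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k graded relevance labels."""
    return sum(
        (2 ** rel - 1) / math.log2(position + 2)
        for position, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances, k):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Hypothetical labels for one query, in the order the system ranked them.
print(ndcg_at_k([3, 2, 1], k=3))  # already in ideal order
print(ndcg_at_k([0, 1], k=2))     # relevant item ranked second
```

Averaging this value over a held-out query set gives a single offline number to compare rankers before an A/B test.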