System Design: Real-Time Recommendation ML System
Context
You are tasked with designing an end-to-end machine-learning system that serves real-time recommendations in a consumer-facing product (e.g., a feed, product listings, or videos). The system must handle high read traffic and adapt as content and user behavior evolve.
Assumptions (you may refine during the interview):
- Traffic: ~10k QPS; p95 latency target ≤ 150 ms for recommendation API
- Inventory: 10M items; daily new/expiring items
- Feedback: clicks, likes, purchases; implicit and explicit signals
- Privacy: user consent, PII minimization, right-to-erasure compliance
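To make these assumptions concrete, a back-of-envelope sizing can help anchor the discussion. The sketch below uses only the traffic, latency, and inventory figures stated above; the embedding dimension and the per-stage latency split are illustrative assumptions, not part of the problem statement, and would be refined during the interview.

```python
# Back-of-envelope sizing from the stated assumptions.
QPS = 10_000              # from the problem statement
P95_BUDGET_MS = 150       # from the problem statement
NUM_ITEMS = 10_000_000    # from the problem statement

# ASSUMPTION: 64-dimensional float32 item embeddings (not given in the prompt).
EMBEDDING_DIM = 64
BYTES_PER_FLOAT32 = 4

# ASSUMPTION: one possible split of the 150 ms budget across stages.
STAGE_BUDGET_MS = {
    "network_and_gateway": 25,
    "online_feature_fetch": 30,
    "candidate_generation": 30,
    "ranking": 50,
    "re_ranking": 15,
}


def requests_per_day(qps: int) -> int:
    """Total requests per day at a constant request rate."""
    return qps * 86_400


def in_flight_requests(qps: int, latency_ms: float) -> float:
    """Little's law: concurrency = arrival rate * time in system.

    Using the p95 latency here gives a conservative (high) estimate.
    """
    return qps * latency_ms / 1000


def embedding_table_bytes(num_items: int, dim: int, bytes_per_value: int) -> int:
    """Raw size of a dense item-embedding table, excluding index overhead."""
    return num_items * dim * bytes_per_value


if __name__ == "__main__":
    print("requests/day:", requests_per_day(QPS))
    print("in-flight requests:", in_flight_requests(QPS, P95_BUDGET_MS))
    table_gb = embedding_table_bytes(NUM_ITEMS, EMBEDDING_DIM, BYTES_PER_FLOAT32) / 1e9
    print("embedding table (GB):", table_gb)
    print("budget used (ms):", sum(STAGE_BUDGET_MS.values()), "of", P95_BUDGET_MS)
```

Under these assumptions the item embedding table is small enough to hold in memory on a single serving node, which is one reason an in-memory nearest-neighbor index is a common starting point for candidate generation.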
Requirements
Explain and justify the design for each of the following:
- Data collection and event pipeline
- Feature engineering and feature store (offline and online)
- Model training, labeling, and retraining strategy
- Online serving architecture (candidate generation, ranking, re-ranking)
- Monitoring, alerting, and experimentation
- Scalability, reliability, and cost considerations
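As a starting point for the online serving discussion, the three stages named above (candidate generation, ranking, re-ranking) can be sketched as a funnel. This is a toy illustration and not a reference answer: the `Item` type, the scoring weights, the per-category cap, and the brute-force dot-product retrieval are all hypothetical stand-ins. A production system would more likely use an approximate nearest-neighbor index for retrieval and a learned model for ranking.

```python
import heapq
import random
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class Item:
    """Hypothetical catalog item used only for this sketch."""
    item_id: int
    category: str
    embedding: Tuple[float, ...]
    popularity: float  # assumed to be normalized to [0, 1]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def generate_candidates(user_vec: Sequence[float], items: List[Item], k: int) -> List[Item]:
    """Stage 1: cheap retrieval of the top-k items by embedding similarity.

    Brute force for clarity; an ANN index would replace this scan at 10M items.
    """
    return heapq.nlargest(k, items, key=lambda it: dot(user_vec, it.embedding))


def rank(user_vec: Sequence[float], candidates: List[Item]) -> List[Item]:
    """Stage 2: score each candidate with a richer (here, hand-weighted) function.

    ASSUMPTION: the 0.8 / 0.2 weights stand in for a learned ranking model.
    """
    def score(it: Item) -> float:
        return 0.8 * dot(user_vec, it.embedding) + 0.2 * it.popularity

    return sorted(candidates, key=score, reverse=True)


def rerank(ranked: List[Item], n: int, max_per_category: int) -> List[Item]:
    """Stage 3: apply a simple diversity rule, capping items per category."""
    counts: Dict[str, int] = {}
    result: List[Item] = []
    for it in ranked:
        if counts.get(it.category, 0) >= max_per_category:
            continue
        counts[it.category] = counts.get(it.category, 0) + 1
        result.append(it)
        if len(result) == n:
            break
    return result


def recommend(user_vec: Sequence[float], items: List[Item],
              k: int = 200, n: int = 10, max_per_category: int = 3) -> List[Item]:
    """Run the full funnel: retrieve k, rank them, return at most n diverse items."""
    return rerank(rank(user_vec, generate_candidates(user_vec, items, k)), n, max_per_category)


def make_toy_catalog(num_items: int = 1000, dim: int = 8, seed: int = 0) -> List[Item]:
    """Build a small random catalog for demonstration purposes only."""
    rng = random.Random(seed)
    categories = ["news", "sports", "music", "tech", "travel"]
    return [
        Item(
            item_id=i,
            category=rng.choice(categories),
            embedding=tuple(rng.uniform(-1, 1) for _ in range(dim)),
            popularity=rng.random(),
        )
        for i in range(num_items)
    ]


if __name__ == "__main__":
    catalog = make_toy_catalog()
    user = [0.5] * 8
    for item in recommend(user, catalog):
        print(item.item_id, item.category)
```

The funnel shape is the design point: each stage sees fewer items than the one before it, so the more expensive scoring runs only on a small candidate set, which is what keeps the request within the latency target.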