System Design: Two-Stage Recommender for a New Content App
Context
You are designing recommendations for a new content app with sparse interactions and frequent new items. The system must be robust to user and item cold-start, and meet strict latency SLAs. You will propose a two-stage recommendation system, define models, losses, sampling, features, exploration strategy, metrics, and an A/B testing plan that addresses selection bias. Finally, you will specify how to meet end-to-end latency targets.
Requirements
- Two-stage architecture:
  - Stage 1 (Candidate Generation/Retrieval): propose concrete model(s) (e.g., matrix factorization, two-tower retrieval), loss function(s), negative sampling strategy, and how to handle user/item cold-start.
  - Stage 2 (Ranking): propose concrete model(s) (e.g., gradient-boosted trees, DNN ranker), loss function(s), negative sampling strategy, and key features.
- Features: list the key user, item, and context features used in both stages, including any multimodal content features (text, image) relevant for new items.
- Exploration vs. exploitation: specify the strategy (e.g., Thompson sampling, epsilon-greedy) and how you will cap propensities to control variance.
- Evaluation:
  - Offline metrics (by stage).
  - Online KPIs.
  - A/B test design that controls for selection bias using IPS/SNIPS or doubly robust estimators. Write the estimator you would compute.
  - Guardrails for long-term outcomes.
- Latency and serving:
  - Show how to meet p50 < 50 ms and p99 < 200 ms end-to-end latency.
  - Include feature store access patterns, caching, ANN retrieval, and fallback strategies.
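
To make the exploration and estimator requirements concrete, the following is a minimal sketch, not a prescribed answer. It assumes an epsilon-greedy logging policy over a fixed candidate set and reads "capping propensities" as clipping the importance weight at a fixed cap, which is one common interpretation; clipping lowers variance at the cost of some bias. The `LoggedImpression` schema, the function names, and the default cap of 10 are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class LoggedImpression:
    """One logged impression (hypothetical schema, for illustration only)."""
    context: str       # e.g. a user or request identifier
    action: str        # the item that was shown
    reward: float      # observed outcome, e.g. 1.0 for a click
    propensity: float  # probability the logging policy showed `action`


def epsilon_greedy_propensity(
    action: str, greedy_action: str, n_actions: int, epsilon: float
) -> float:
    """Probability that epsilon-greedy over `n_actions` items picks `action`."""
    explore = epsilon / n_actions  # uniform exploration mass per item
    if action == greedy_action:
        return explore + (1.0 - epsilon)
    return explore


def clipped_weight(target_prob: float, propensity: float, cap: float) -> float:
    """Importance weight target/logging, clipped at `cap` to limit variance."""
    if propensity <= 0.0:
        raise ValueError("logging propensity must be positive")
    return min(target_prob / propensity, cap)


def ips_and_snips(
    logs: Sequence[LoggedImpression],
    target_policy: Callable[[str, str], float],
    cap: float = 10.0,  # assumed cap; tune against the bias/variance trade-off
) -> Tuple[float, float]:
    """Return (clipped IPS, clipped SNIPS) estimates of mean reward.

    `target_policy(context, action)` is the probability that the policy
    being evaluated would show `action` in `context`.
    """
    if not logs:
        raise ValueError("need at least one logged impression")
    weights = [
        clipped_weight(target_policy(e.context, e.action), e.propensity, cap)
        for e in logs
    ]
    weighted_reward = sum(w * e.reward for w, e in zip(weights, logs))
    ips = weighted_reward / len(logs)          # normalizes by sample count
    total_weight = sum(weights)
    snips = weighted_reward / total_weight if total_weight > 0 else 0.0
    return ips, snips
```

IPS divides by the number of logged impressions, while SNIPS divides by the sum of the weights, which typically reduces variance when weights are large or uneven. A doubly robust estimator would additionally combine these weights with a learned reward model, which this sketch omits.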