Recommender And Ranking System Design
Asked of: Data Scientist

What's being tested
Ability to design scalable, low‑latency recommendation and ranking pipelines that balance short‑term engagement against long‑term value, using appropriate algorithms, metrics, and evaluation (both offline and online). Expect to discuss tradeoffs across candidate generation, scoring, feature freshness, and experimentation.
Core knowledge
- Typical pipeline: candidate generation → coarse scoring → fine-grained ranking → re-ranking/filters.
- Algorithms: two‑tower (DSSM), matrix factorization/ALS, BPR, pairwise ranking, GBDTs, neural ranking (YouTube DNN).
- Key metrics: precision@k, recall@k, NDCG, MRR (offline ranking quality); CTR and DAU/retention (online); plus calibration checks and position‑bias adjustments when interpreting them.
- Exploration vs exploitation: contextual bandits, Thompson sampling, epsilon‑greedy for serendipity and cold start.
- Engineering constraints: end‑to‑end latency budget (often around 100–300 ms), embedding table memory, feature freshness, incremental model updates.
- Bias correction: propensity scoring, inverse propensity weighting, debiasing for logged feedback.
- A/B testing: power, slicing, and guardrail metrics, complemented by off‑policy (counterfactual) estimators on logged data (IPW, doubly robust).
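The offline ranking metrics listed above can be made concrete with a small sketch. This is a minimal illustration under assumptions, not a reference implementation: the function names, the item IDs, and the binary gains are all hypothetical, and NDCG here uses the common linear-gain, log2-discount form.

```python
import math

def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked items that are relevant."""
    return sum(1 for item in ranked[:k] if item in relevant) / k

def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant items that appear in the top k."""
    if not relevant:
        return 0.0
    return sum(1 for item in ranked[:k] if item in relevant) / len(relevant)

def reciprocal_rank(ranked, relevant):
    """1 / position of the first relevant item (0 if none); MRR averages this over users."""
    for pos, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / pos
    return 0.0

def ndcg_at_k(ranked, gains, k):
    """NDCG@k with linear gains; `gains` maps item -> graded relevance."""
    dcg = sum(gains.get(item, 0.0) / math.log2(pos + 1)
              for pos, item in enumerate(ranked[:k], start=1))
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(pos + 1) for pos, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0

# Hypothetical example: one user's ranked list, with items "b" and "d" relevant.
ranked = ["a", "b", "c", "d"]
gains = {"b": 1.0, "d": 1.0}
relevant = set(gains)

p2 = precision_at_k(ranked, relevant, 2)   # 0.5
r2 = recall_at_k(ranked, relevant, 2)      # 0.5
rr = reciprocal_rank(ranked, relevant)     # 0.5
ndcg4 = ndcg_at_k(ranked, gains, 4)
print(p2, r2, rr, round(ndcg4, 3))
```

In an interview it is usually enough to state which of these you would report and why: NDCG rewards putting relevant items near the top, while recall@k is the natural metric for the candidate generation stage.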
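To illustrate the exploration bullet, here is a minimal sketch of Thompson sampling in its simplest, non-contextual form (a Beta-Bernoulli bandit over a few items). It is a simplification of the contextual bandits mentioned above, and the click-through rates, function name, and round count are hypothetical values chosen for the simulation.

```python
import random

def thompson_sampling(true_ctrs, rounds, seed=0):
    """Simulate Beta-Bernoulli Thompson sampling; returns pulls per arm.

    true_ctrs are hidden from the policy and only used to simulate clicks.
    """
    rng = random.Random(seed)
    n = len(true_ctrs)
    successes = [0] * n   # observed clicks per arm
    failures = [0] * n    # observed non-clicks per arm
    pulls = [0] * n
    for _ in range(rounds):
        # Sample a plausible CTR for each arm from its Beta(1 + s, 1 + f) posterior.
        samples = [rng.betavariate(successes[i] + 1, failures[i] + 1)
                   for i in range(n)]
        arm = max(range(n), key=lambda i: samples[i])
        pulls[arm] += 1
        # Simulated user feedback for the shown item.
        if rng.random() < true_ctrs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return pulls

# Hypothetical CTRs: the third item is clearly best.
pulls = thompson_sampling([0.05, 0.10, 0.50], rounds=2000, seed=0)
print(pulls)
```

Arms with little data have wide posteriors and so are still sampled occasionally, which is what gives new (cold-start) items a chance without a fixed exploration rate like epsilon-greedy.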
Worked example — "Design a social feed recommender"
Start by scoping: user scale, latency budget, and the primary metric (e.g., 7‑day retention vs immediate CTR). Sketch the pipeline: retrieval (user/item embeddings, interest graph) → lightweight scorer to reduce candidates → heavy neural ranker with cross features → business/IA filters. Enumerate features (recent activity, social graph, content embeddings, time decay), offline metrics (NDCG, offline CTR), and the online strategy (A/B tests vs bandit experiments). Finally, list operational constraints: embedding storage, incremental retraining, and safe-fail experiments.
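The staged pipeline described above can be sketched as a toy in a few lines. This is only an illustration under assumptions: the two scoring stages are collapsed into one stand-in ranker (embedding affinity times an exponential time decay), and the post fields, author names, embeddings, and half-life are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def retrieve(user_vec, items, n):
    """Stage 1: cheap retrieval, keep the n items closest to the user embedding."""
    return sorted(items, key=lambda it: dot(user_vec, it["emb"]), reverse=True)[:n]

def rank(user_vec, candidates, half_life_hours=24.0):
    """Stage 2: stand-in for the heavy ranker, affinity weighted by time decay."""
    def score(it):
        decay = 0.5 ** (it["age_hours"] / half_life_hours)
        return dot(user_vec, it["emb"]) * decay
    return sorted(candidates, key=score, reverse=True)

def apply_filters(ranked, blocked_authors, k):
    """Stage 3: business-rule filtering, then truncate to the feed size."""
    return [it for it in ranked if it["author"] not in blocked_authors][:k]

def build_feed(user_vec, items, blocked_authors, n_candidates=3, k=2):
    candidates = retrieve(user_vec, items, n_candidates)
    ranked = rank(user_vec, candidates)
    return apply_filters(ranked, blocked_authors, k)

# Hypothetical posts with 2-d content embeddings.
posts = [
    {"id": "p1", "emb": [1.0, 0.0], "age_hours": 48, "author": "ann"},
    {"id": "p2", "emb": [0.8, 0.5], "age_hours": 2, "author": "bob"},
    {"id": "p3", "emb": [0.0, 1.0], "age_hours": 1, "author": "cat"},
    {"id": "p4", "emb": [0.9, 0.1], "age_hours": 5, "author": "spam"},
]
feed = build_feed([1.0, 0.2], posts, blocked_authors={"spam"})
feed_ids = [it["id"] for it in feed]
print(feed_ids)  # ['p2', 'p1']
```

The point of the staging is cost: retrieval touches the whole corpus with a cheap score, so the expensive ranker only sees a handful of candidates. In a real system the retrieval step would typically be an approximate nearest-neighbor lookup rather than a full sort.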
A common pitfall
Candidates often optimize immediate engagement (CTR) without modeling long‑term outcomes, which produces clickbait and degrades retention. Another tempting error is treating offline AUC as a proxy for online impact, ignoring position/exposure bias and distributional shift relative to the training logs. Always tie objectives to business and long‑term user value, and plan for debiasing and online validation.
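A small sketch shows why position bias matters when reading logged clicks. It assumes a position-based examination model (a click requires the user to examine the slot, and examination probability depends only on position) and assumes the propensities are already known, for example from a randomization experiment. The log entries and propensity values are hypothetical.

```python
def naive_ctr(logs):
    """Raw clicks / impressions, ignoring where the item was shown."""
    return sum(clicked for _, clicked in logs) / len(logs)

def ipw_relevance(logs, propensity, max_weight=None):
    """Inverse-propensity-weighted click estimate.

    logs: list of (position, clicked) pairs for one item.
    propensity: dict position -> assumed probability the slot is examined.
    max_weight: optional clip on the weights to limit variance.
    """
    total = 0.0
    for position, clicked in logs:
        weight = 1.0 / propensity[position]
        if max_weight is not None:
            weight = min(weight, max_weight)
        total += clicked * weight
    return total / len(logs)

# Hypothetical examination propensities per feed position.
propensity = {1: 1.0, 2: 0.7, 3: 0.5}
# One item: 4 impressions at position 1 (2 clicks), 4 at position 3 (1 click).
logs = [(1, 1), (1, 1), (1, 0), (1, 0), (3, 1), (3, 0), (3, 0), (3, 0)]

naive = naive_ctr(logs)                    # 0.375
debiased = ipw_relevance(logs, propensity) # 0.5
print(naive, debiased)
```

The naive CTR understates the item's appeal because half of its impressions were in a slot that users examine only half the time. Clipping the weights (`max_weight`) trades a little bias for lower variance when some propensities are very small.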
Further reading
- Covington, Adams, and Sargin, "Deep Neural Networks for YouTube Recommendations" (RecSys 2016): production two‑stage candidate generation and ranking architecture.
- Rendle, Freudenthaler, Gantner, and Schmidt-Thieme, "BPR: Bayesian Personalized Ranking from Implicit Feedback" (UAI 2009): pairwise loss for implicit‑feedback ranking.
Related concepts
- Recommender, Ranking, And Ads Systems
- Ranking, Recommender, And Personalization Systems
- Recommender And Ranking Systems (Machine Learning)
- Ranking, Recommendation, And Feedback Systems (ML System Design)
- Recommender, Ranking, And Ads ML Systems
- Recommender Systems, Feed Ranking, And Marketplace Metrics (Machine Learning)