Design an ads ranking ML system
Company: Snapchat
Role: Machine Learning Engineer
Category: ML System Design
Difficulty: medium
Interview Round: Onsite
## Prompt
You are designing an **ads ranking** system for a large consumer app (feed/search entry point). For each request, the system receives a user context and a set of eligible ads/candidates and must return a ranked list of ads.
### Requirements
- Primary business goal: maximize long-term value (e.g., revenue) while maintaining good user experience.
- Must handle multiple optimization targets such as **CTR**, **CVR**, and **expected revenue**.
- Latency budget: tens of milliseconds for the ranking stage (assume candidate generation is separate).
- Strong emphasis on ML aspects: **feature design**, **training data**, **modeling choices**, and **multi-task learning**.
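
To make the multi-target requirement concrete, the sketch below shows one common way to fold CTR, CVR, and a bid into a single expected-revenue score. It assumes the advertiser pays per conversion; the `ScoredAd` type, its field names, and the example numbers are hypothetical and not part of the prompt.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ScoredAd:
    """Hypothetical container for one candidate ad and its model outputs."""
    ad_id: str
    p_click: float             # predicted P(click | impression), i.e. pCTR
    p_conv_given_click: float  # predicted P(conversion | click), i.e. pCVR
    bid_per_conversion: float  # assumed pay-per-conversion bid


def expected_value(ad: ScoredAd) -> float:
    """Expected revenue per impression under the pay-per-conversion assumption."""
    return ad.p_click * ad.p_conv_given_click * ad.bid_per_conversion


def rank_ads(ads: List[ScoredAd]) -> List[ScoredAd]:
    """Sort candidates by expected value, highest first."""
    return sorted(ads, key=expected_value, reverse=True)


candidates = [
    ScoredAd("a", p_click=0.10, p_conv_given_click=0.02, bid_per_conversion=50.0),
    ScoredAd("b", p_click=0.02, p_conv_given_click=0.20, bid_per_conversion=40.0),
    ScoredAd("c", p_click=0.05, p_conv_given_click=0.05, bid_per_conversion=20.0),
]
ranked_ids = [ad.ad_id for ad in rank_ads(candidates)]
```

Note that ad `b` outranks ad `a` despite a much lower click probability, which is why a pure CTR ranker is usually not enough here. A real system would typically also add user-experience terms or guardrails to this score rather than rank on revenue alone.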
### Questions to answer
1. What features would you build (user/ad/context/cross features) and how would you generate and keep them fresh?
2. How would you define labels and build training data given bias from the existing ranker (position bias / selection bias)?
3. Propose a ranking model and explain how you’d combine multiple objectives (e.g., CTR + CVR + value). If using **multi-task learning**, describe the architecture and loss.
4. How would you evaluate the model offline and online? What key metrics and guardrails would you use?
5. How would you handle practical issues: cold start, delayed conversions, distribution shift, and feature leakage?
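
As a starting point for question 3, here is a minimal sketch of one possible multi-task loss, in the style of entire-space modeling: the CTR head is trained on all impressions, and the CVR head is trained indirectly through the product pCTR × pCVR against the click-and-convert label, so it never sees a clicked-only sample. This is one option among several, not a prescribed answer; the function names, dictionary keys, and task weights are assumptions for illustration.

```python
import math
from typing import Dict, List


def bce(label: int, p: float, eps: float = 1e-7) -> float:
    """Binary cross-entropy for a single example, with clipping for stability."""
    p = min(max(p, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def multitask_loss(
    examples: List[Dict[str, float]],
    w_ctr: float = 1.0,
    w_ctcvr: float = 1.0,
) -> float:
    """Average weighted loss over impressions.

    Each example is an impression with hypothetical keys:
      click, conversion  -> observed 0/1 labels
      p_ctr, p_cvr       -> outputs of the two model heads
    """
    total = 0.0
    for ex in examples:
        # P(click and convert | impression) = pCTR * pCVR
        p_ctcvr = ex["p_ctr"] * ex["p_cvr"]
        total += w_ctr * bce(int(ex["click"]), ex["p_ctr"])
        total += w_ctcvr * bce(int(ex["click"] * ex["conversion"]), p_ctcvr)
    return total / len(examples)


loss = multitask_loss(
    [{"click": 1, "conversion": 1, "p_ctr": 0.5, "p_cvr": 0.5}]
)
```

In a full answer, the two heads would usually share an embedding layer or expert towers, and the task weights would be tuned or learned rather than fixed at 1.0.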
Quick Answer: This question evaluates a candidate's ability to design a low-latency ads ranking machine learning system. It covers feature engineering and freshness, training-data construction under position and selection bias, multi-objective modeling and multi-task architectures, and production issues such as cold start, delayed conversions, distribution shift, and feature leakage. It is commonly asked in ML System Design interviews for Machine Learning Engineer roles to assess how a candidate trades off long-term business metrics against user experience, chooses evaluation metrics and guardrails for offline and online experiments, and applies both conceptual understanding and practical judgment to engineering and modeling decisions.