Design and evaluate Snap recommendations
Company: Snapchat
Role: Technical Program Manager
Category: Product Design & Strategy
Difficulty: medium
Interview Round: Onsite
You are interviewing for a Technical Program Manager, ML Platform role at Snap.
Explain the key ML evaluation and experimentation concepts a TPM should understand: recall, ROC curve, AUC, p-value, hypothesis testing, and Type I/Type II errors.
Then outline the end-to-end ML lifecycle for a Snap content recommendation service, including major system components, stakeholders, launch metrics, and post-launch monitoring.
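As a quick reference for the evaluation concepts named above, the following is a minimal Python sketch, not a production implementation. The function names and the toy labels, scores, and conversion counts used with it are illustrative assumptions, and the significance test shown is a standard two-proportion z-test chosen for simplicity; the question itself does not prescribe a particular test.

```python
import math


def recall(y_true, y_pred):
    """Share of actual positives that were predicted positive: TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


def roc_points(y_true, scores):
    """(FPR, TPR) pairs from sweeping the threshold over each distinct score.

    Assumes y_true contains at least one positive and one negative label.
    """
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve using the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for H0: control (a) and treatment (b) share one rate.

    Rejecting a true H0 is a Type I error (false positive launch signal);
    failing to reject a false H0 is a Type II error (missed real effect).
    """
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (conv_b / n_b - conv_a / n_a) / se
    # 2 * (1 - Phi(|z|)) expressed through the complementary error function.
    return math.erfc(abs(z) / math.sqrt(2))
```

With the toy labels `[1, 0, 1, 1, 0, 0]` and scores `[0.9, 0.8, 0.7, 0.6, 0.3, 0.2]`, thresholding at 0.65 gives a recall of 2/3, and the ROC sweep gives an AUC of 7/9 (about 0.78), which can be read as the probability that a randomly chosen positive item outscores a randomly chosen negative one.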
### Constraints & Assumptions
- The recommendation service should balance user satisfaction, content quality, safety, latency, and infrastructure cost.
- A TPM does not need to design every model, but should connect ML metrics to launch decisions and operational reliability.
- Discuss offline evaluation and online experimentation separately.
- Include stakeholders such as product, ML engineering, data science, infra, trust and safety, and creators.
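One way to make "connect ML metrics to launch decisions" concrete is an explicit launch gate over guardrail metrics. The sketch below is an assumption-heavy illustration: the `Guardrail` class, the metric names, and every threshold are hypothetical placeholders, not values Snap uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Guardrail:
    name: str
    limit: float
    higher_is_better: bool

    def passes(self, observed: float) -> bool:
        # A metric passes when it sits on the acceptable side of its limit.
        if self.higher_is_better:
            return observed >= self.limit
        return observed <= self.limit


# Hypothetical guardrails covering satisfaction, safety, latency, and cost.
GUARDRAILS = [
    Guardrail("retention_delta_pct", -0.1, higher_is_better=True),
    Guardrail("policy_violation_rate", 0.001, higher_is_better=False),
    Guardrail("p99_latency_ms", 200.0, higher_is_better=False),
    Guardrail("cost_per_1k_requests_usd", 0.05, higher_is_better=False),
]


def launch_decision(observed: dict) -> tuple:
    """Return (ok_to_launch, failed_guardrail_names).

    A guardrail with no observed value counts as a failure, so missing
    data blocks the launch instead of silently passing.
    """
    failures = [
        g.name
        for g in GUARDRAILS
        if g.name not in observed or not g.passes(observed[g.name])
    ]
    return (len(failures) == 0, failures)
```

Treating a missing metric as a failure is a deliberate choice in this sketch: it forces the owning team to confirm the measurement exists before the launch review.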
### Clarifying Questions to Ask
- Which recommendation surface is in scope: Spotlight, Discover, Stories, Lenses, or another feed?
- What is the primary product goal: retention, watch time, creator ecosystem health, or safety?
- What latency and cost constraints exist?
- What guardrails must be met before launch?
### What a Strong Answer Covers
- Correct definitions of recall, ROC, AUC, p-value, hypothesis testing, Type I error, and Type II error.
- Why offline metrics do not guarantee online success.
- ML lifecycle from problem framing through data, features, training, evaluation, serving, experimentation, launch, and monitoring.
- Candidate generation, ranking, filtering, feature store, model registry, deployment, rollback, and observability.
- Launch metrics and guardrails for user experience, safety, latency, cost, drift, and data quality.
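The serving path named above (candidate generation, ranking, filtering) can be sketched as a small pipeline. Everything here is hypothetical scaffolding for discussion: `Candidate`, `recommend`, the demo generator, the fixed scores, and the safety flag stand in for real retrieval systems, ranking models, and trust and safety classifiers.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Candidate:
    content_id: str
    score: float = 0.0
    is_safe: bool = True


def recommend(user_id, generate, rank, filters, k):
    """Generate candidates, score them, drop disallowed items, return top k."""
    candidates = generate(user_id)
    ranked = sorted(
        (rank(user_id, c) for c in candidates),
        key=lambda c: c.score,
        reverse=True,
    )
    allowed = [c for c in ranked if all(f(c) for f in filters)]
    return allowed[:k]


# Hypothetical stand-ins for the real components.
DEMO_SCORES = {"a": 0.2, "b": 0.9, "c": 0.7, "d": 0.5}


def demo_generate(user_id):
    # A real system would pull from retrieval indexes or a feature store.
    return [
        Candidate("a"),
        Candidate("b", is_safe=False),
        Candidate("c"),
        Candidate("d"),
    ]


def demo_rank(user_id, candidate):
    # A real ranker would call a model fetched from a model registry.
    return replace(candidate, score=DEMO_SCORES[candidate.content_id])


def safety_filter(candidate):
    return candidate.is_safe
```

Calling `recommend("u1", demo_generate, demo_rank, [safety_filter], k=2)` returns items `c` and `d`: item `b` scores highest but is removed by the safety filter. In practice, cheap filters are often applied before ranking to save compute; the order here simply mirrors the list above.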
### Follow-up Questions
- How would you explain AUC to a non-technical stakeholder?
- What if offline AUC improves but retention drops online?
- How would you diagnose model drift?
- How would you decide whether an experiment is underpowered?
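For the drift and underpowered-experiment follow-ups, two small calculations are often useful talking points. The sketch below assumes a two-proportion test with the standard normal-approximation sample-size formula, and uses the population stability index (PSI) as one common drift measure among several. Function names, default values, and the example inputs are illustrative assumptions.

```python
import math
from statistics import NormalDist


def required_sample_size_per_arm(baseline_rate, min_detectable_lift,
                                 alpha=0.05, power=0.8):
    """Approximate users needed per arm to detect an absolute lift.

    alpha is the tolerated Type I error rate; 1 - power is the tolerated
    Type II error rate.
    """
    p1 = baseline_rate
    p2 = baseline_rate + min_detectable_lift
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def is_underpowered(users_per_arm, baseline_rate, min_detectable_lift,
                    alpha=0.05, power=0.8):
    """True when the experiment has fewer users per arm than the estimate."""
    needed = required_sample_size_per_arm(
        baseline_rate, min_detectable_lift, alpha, power
    )
    return users_per_arm < needed


def population_stability_index(expected_fracs, actual_fracs, eps=1e-6):
    """PSI between two binned distributions given as per-bin fractions.

    expected_fracs: bin shares in the training or reference window.
    actual_fracs:   bin shares in the current serving window.
    """
    psi = 0.0
    for e, a in zip(expected_fracs, actual_fracs):
        e, a = max(e, eps), max(a, eps)  # avoid log(0) on empty bins
        psi += (a - e) * math.log(a / e)
    return psi
```

For example, with a 10% baseline rate and a one-point absolute lift, the estimate comes out at roughly fifteen thousand users per arm, so an experiment with 5,000 per arm would be flagged as underpowered. For PSI, identical distributions give 0, and a frequently quoted rule of thumb treats values above about 0.2 as a shift worth investigating; that cutoff is a heuristic, not a statistical guarantee.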
Quick Answer: Prepare a Snap ML Platform TPM answer on recall, ROC, AUC, p-values, hypothesis testing, Type I/II errors, and the end-to-end lifecycle for a content recommendation service.