This question evaluates a data scientist's ability to choose and justify a primary product metric, reason about distributional properties and guardrail metrics, and design end-to-end A/B tests for a short-video recommender feed.

You are a data scientist preparing metrics and an A/B test plan to launch a new short‑video recommender feed within Instagram. The goal is to measure whether the ranking algorithm improves user value and is safe to ship.
(a) Which single metric would you use to evaluate the recommendation system’s success, and why?
(b) Sketch or describe the expected distribution of that metric, labeling the median, mode, and 95th percentile.
(c) Metric A rises while Metric B falls—what do you do?
(d) List the end‑to‑end A/B‑testing steps for this launch.
Login required