Decide and justify product metrics amid trade-offs
Company: Snowflake
Role: Data Scientist
Category: Analytics & Experimentation
Difficulty: hard
Interview Round: Onsite
You are launching a new 'Smart Sort' ranking for a content feed that is expected to increase relevance but may reduce short-term ad impressions. Choose a primary success metric and a set of guardrail metrics, then design an experiment and a decision framework that handles conflicting metric movements.
Sub-questions:
1) Metric choice: Given candidates {7-day retention, session minutes per DAU, revenue per DAU, CTR, creator supply health}, pick a single primary metric and 2–3 guardrails. Justify each choice using statistical properties (variance, sensitivity to bots and heavy tails, stability across days of the week, susceptibility to Simpson’s paradox). Define precise formulas and aggregation levels (per-user vs. per-session), and state whether you would use winsorization or log transforms (an illustrative per-user aggregation sketch follows the sub-questions).
2) Experiment design: Outline the A/B design, including the unit of randomization (user vs. session), expected interference risks for ranking changes, how you’d mitigate them (e.g., sticky bucketing, ghost ads), and the test duration. Compute the required sample size for an MDE of a 1.0% relative change in the primary metric at 90% power and alpha = 0.05; state the inputs you’d need and how you’d use CUPED or stratification to reduce variance (a sample-size and CUPED sketch follows the sub-questions). Assume the baseline is measured over 2025-08-18 to 2025-08-31 and the experiment starts on 2025-09-01.
3) Decision framework: If the primary metric is up +0.8% but a guardrail (e.g., creator payout per impression) is down −2.5% with p < 0.05, describe a principled trade-off method (e.g., an LTV delta combining engagement and revenue, constrained optimization, or multi-metric decision rules); see the decision-rule sketch after the sub-questions. Include how you’d treat novelty effects, ramp strategies, sequential monitoring, and heterogeneous treatment effects across countries.
4) Post-launch: Define the on-call dashboards and alert thresholds you’d set for the first week after a 10% rollout (a threshold-config sketch follows below).
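
Illustrative sketch for sub-question 1: one way to aggregate events per user and then winsorize the heavy-tailed metrics before analysis. The column names (user_id, session_minutes, revenue) and the 99.5th-percentile cap are assumptions for illustration, not a prescribed schema.

```python
import numpy as np
import pandas as pd

def per_user_metrics(events: pd.DataFrame, cap_quantile: float = 0.995) -> pd.DataFrame:
    """Aggregate raw events to one row per user, then winsorize heavy-tailed metrics.

    Assumes columns user_id, session_minutes, revenue (illustrative schema only).
    """
    per_user = (
        events.groupby("user_id", as_index=False)
              .agg(session_minutes=("session_minutes", "sum"),
                   revenue=("revenue", "sum"))
    )
    # Winsorize at the chosen quantile to blunt heavy tails and bot traffic.
    for col in ["session_minutes", "revenue"]:
        cap = per_user[col].quantile(cap_quantile)
        per_user[col] = per_user[col].clip(upper=cap)
    # Optional log transform for strongly skewed engagement distributions.
    per_user["log_minutes"] = np.log1p(per_user["session_minutes"])
    return per_user
```

Per-user aggregation keeps the unit of analysis aligned with user-level randomization, which avoids the Simpson’s-paradox risk of pooling sessions with very different activity levels.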
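Illustrative sketch for sub-question 2: the standard two-sided, two-sample z-test sample-size arithmetic for a relative MDE, plus a CUPED adjustment. The baseline mean and standard deviation fed in at the bottom are placeholders a candidate would replace with values estimated from the 2025-08-18 to 2025-08-31 baseline window.

```python
import numpy as np
from scipy import stats

def n_per_arm(baseline_mean: float, sd: float, rel_mde: float = 0.01,
              alpha: float = 0.05, power: float = 0.90) -> int:
    """Per-arm sample size for a two-sided, two-sample z-test on a per-user mean."""
    delta = rel_mde * baseline_mean          # absolute MDE implied by the relative target
    z_alpha = stats.norm.ppf(1 - alpha / 2)
    z_beta = stats.norm.ppf(power)
    n = 2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2
    return int(np.ceil(n))

def cuped_adjust(y: np.ndarray, x_pre: np.ndarray) -> np.ndarray:
    """CUPED: residualize the in-experiment metric y on its pre-period covariate x_pre."""
    theta = np.cov(y, x_pre)[0, 1] / np.var(x_pre, ddof=1)
    return y - theta * (x_pre - x_pre.mean())

# Placeholder inputs (not Snowflake numbers): a binary 7-day retention rate of ~0.42
# has per-user SD of sqrt(0.42 * 0.58) ~= 0.49.
print(n_per_arm(baseline_mean=0.42, sd=0.49))
```

CUPED reduces variance by roughly the squared correlation between the metric and its pre-period covariate, shrinking the required sample size or test duration by the same factor.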
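Illustrative sketch for sub-question 3: a multi-metric decision rule that blocks on statistically significant guardrail breaches and otherwise ships on a weighted LTV delta. The weights and the guardrail floor are assumed values for illustration, not company policy.

```python
from dataclasses import dataclass

@dataclass
class MetricDelta:
    name: str
    rel_change: float   # relative lift, e.g. +0.008 for +0.8%
    p_value: float

def ship_decision(primary: MetricDelta, guardrails: list[MetricDelta],
                  ltv_weights: dict[str, float],
                  guardrail_floor: float = -0.01) -> str:
    """Block on any significant guardrail breach beyond the floor;
    otherwise ship only if the weighted LTV delta is positive."""
    for g in guardrails:
        if g.p_value < 0.05 and g.rel_change < guardrail_floor:
            return f"hold: guardrail '{g.name}' breached ({g.rel_change:+.1%})"
    ltv_delta = sum(ltv_weights.get(m.name, 0.0) * m.rel_change
                    for m in [primary, *guardrails])
    return "ship" if ltv_delta > 0 else "iterate"

# The scenario from the prompt: primary +0.8%, creator payout per impression -2.5% (p < 0.05).
print(ship_decision(
    MetricDelta("7d_retention", +0.008, 0.01),
    [MetricDelta("creator_payout_per_impression", -0.025, 0.03)],
    ltv_weights={"7d_retention": 0.7, "creator_payout_per_impression": 0.3},
))  # -> "hold": forces an explicit trade-off or ramp discussion rather than an automatic ship
```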
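Illustrative sketch for sub-question 4: one way to encode first-week alert thresholds for a 10% rollout. Every metric name, window, and threshold here is a hypothetical placeholder.

```python
# Hypothetical first-week alert spec for a 10% rollout: each entry names the
# comparison window and the relative change that should page the on-call.
ALERTS = {
    "crash_free_sessions":           {"window": "1h",  "max_drop": 0.005},
    "ad_impressions_per_dau":        {"window": "6h",  "max_drop": 0.03},
    "creator_payout_per_impression": {"window": "24h", "max_drop": 0.02},
    "p95_feed_latency_ms":           {"window": "1h",  "max_rise": 0.10},
}

def should_page(metric: str, rel_change: float) -> bool:
    """Page when the observed relative change breaches the configured threshold."""
    spec = ALERTS[metric]
    if "max_drop" in spec and rel_change < -spec["max_drop"]:
        return True
    if "max_rise" in spec and rel_change > spec["max_rise"]:
        return True
    return False
```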
Quick Answer: This question evaluates proficiency in product metric selection, statistical experiment design, causal inference, and post-launch monitoring for feed-ranking trade-offs, covering metrics engineering, A/B testing, variance reduction, and decision-framework reasoning within the Analytics & Experimentation domain for a Data Scientist role. It is commonly asked to gauge whether a candidate can justify metric choices under competing business objectives, design defensible randomized experiments, and specify monitoring and alerting strategies. It tests both practical application (hands-on experiment setup and monitoring) and conceptual understanding (statistical properties, bias, heterogeneity, and trade-off reasoning).