Design an A/B test for pre-roll ads
Company: Twitch
Role: Data Scientist
Category: Analytics & Experimentation
Difficulty: hard
Interview Round: Technical Screen
Design and analyze an A/B test for reducing pre-roll ad frequency on mobile live streams by 20%. Answer all parts precisely.
1) Randomization: Choose the unit (viewer-level, creator-level, geo, or hybrid). Justify your choice so that interference is minimized when viewers switch streams and creators share audiences. Define what counts as an exposure and ensure assignment stickiness across days.
2) Metrics: Pick a single primary metric that captures value (e.g., watch_time_per_viewer_day) and list at least three guardrails (e.g., crash_rate, rebuffer_ratio, ad_impressions_viewer, retention_day1). Explain heavy-tail handling (e.g., log-transform, winsorize at 99.5%, or quantile metrics) and how that affects inference.
3) Sample size: Assume baseline mean daily watch time = 36 minutes, standard deviation = 60 minutes per viewer-day, intra-user correlation induces a design effect of 1.3, total eligible daily mobile viewers = 5,000,000. For a two-sided α=0.05, 1−β=0.8, detect a +2% relative lift in the primary metric. Compute the required per-variant sample size in viewer-days after applying the design effect. Show formulas and a numeric answer.
4) Variance reduction: Describe how you would use CUPED with pre-experiment watch time and device to reduce variance, including the exact regression you would fit and how to apply theta.
5) Novelty and ramp: Propose a 2-week ramp with sequential monitoring that controls type I error (e.g., group sequential or alpha-spending). Specify decision boundaries or stopping rules at interim checks and how you’d adjust for peeking.
6) Integrity: Detect and mitigate bot/afk traffic and creator-led raids that could bias results. Include filters and post-stratification or cluster-robust SEs when clustering by creator/day. Explain how you’d check for spillovers and, if detected, switch to a cluster-randomized test by creator with power implications.
Quick Answer: This question evaluates core online-experimentation competencies in the Analytics & Experimentation domain: randomization under interference, metric selection aligned with business value, sample-size calculation, variance-reduction techniques (e.g., CUPED), sequential monitoring with type I error control, and traffic-integrity safeguards.