Context: You are designing an online controlled experiment for a new threaded comments UI on a large consumer platform. The goal is to increase weekly unique commenters without harming key user experience metrics. Assume a logged-in ecosystem with some logged-out traffic and multi-device usage.
Objectives
- Primary objective: increase the weekly unique commenter rate (share of users who comment at least once in a week) by 5% relative.
- Guardrail: do not reduce median session length by more than 1% relative.
Tasks
(a) Define precise hypotheses for the primary and guardrail metrics, including:
- Measurement windows and denominators.
- Unit of randomization (user vs. session).
- Handling of cross-device users and logged-out traffic.
- How to prevent contamination/crossover.
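A user-level assignment can be made concrete with a deterministic bucketing sketch. Hashing a stable, logged-in user ID (never a session or device ID) gives the same person the same arm on every device, which addresses both the unit-of-randomization and cross-device questions; logged-out traffic has no stable ID and is excluded from the analysis population. The function name and experiment salt below are illustrative assumptions, not part of the prompt:

```python
import hashlib

def assign_variant(user_id: str, experiment_salt: str = "threaded_comments_v1") -> str:
    """Deterministically assign a logged-in user to an experiment arm.

    Keying the hash on the stable user ID means assignment is consistent
    across sessions and devices, preventing crossover. The salt isolates
    this experiment's buckets from other concurrent experiments.
    """
    digest = hashlib.sha256(f"{experiment_salt}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 1000  # 1000 fine-grained buckets for flexible ramps
    return "treatment" if bucket < 500 else "control"
```

Because assignment is a pure function of the ID and salt, it can be recomputed anywhere (client, server, analysis pipeline) without a shared assignment store.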
(b) Given: baseline weekly commenter rate = 8% per user; target relative lift = 5%; two-sided α = 0.05; power = 0.8; expected weekly user sample = 2M.
- Outline the sample-size formula and the inputs you would use (no need to compute an exact N).
- State whether you would use CUPED or other pre-experiment covariates to reduce variance, which covariates you would choose, and why.
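As a sketch under the stated inputs (baseline 8%, 5% relative lift, two-sided α = 0.05, power = 0.8), the standard two-proportion z-test sample-size formula and a minimal CUPED adjustment look like the following; function names are illustrative, and a production analysis would use a vetted stats library rather than this hand-rolled version:

```python
from statistics import NormalDist

def two_proportion_n(p_base: float, rel_lift: float,
                     alpha: float = 0.05, power: float = 0.80) -> float:
    """Per-arm sample size for a two-sided two-proportion z-test:
    n = (z_{1-a/2} + z_{1-b})^2 * (p1(1-p1) + p2(1-p2)) / (p2 - p1)^2.
    """
    z = NormalDist()
    z_a = z.inv_cdf(1 - alpha / 2)
    z_b = z.inv_cdf(power)
    p1, p2 = p_base, p_base * (1 + rel_lift)
    var = p1 * (1 - p1) + p2 * (1 - p2)
    return (z_a + z_b) ** 2 * var / (p2 - p1) ** 2

def cuped_adjust(y: list[float], x: list[float]) -> list[float]:
    """CUPED: subtract theta * (x - mean(x)) from y, where x is a
    pre-experiment covariate (e.g. last month's commenting rate) and
    theta = cov(y, x) / var(x). Mean of y is preserved; variance shrinks
    in proportion to the squared correlation between y and x.
    """
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (len(x) - 1)
    var = sum((xi - mx) ** 2 for xi in x) / (len(x) - 1)
    theta = cov / var
    return [yi - theta * (xi - mx) for xi, yi in zip(x, y)]

n_per_arm = two_proportion_n(0.08, 0.05)  # roughly 74k users per arm
```

With ~2M weekly users, the unadjusted requirement of roughly 74k per arm is easily met, so the practical value of CUPED here is faster ramps and tighter guardrail intervals rather than feasibility.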
(c) Propose a ramp plan with sequential monitoring that controls Type I error (e.g., group sequential or alpha spending). Describe stopping rules for efficacy/futility and how you would handle novelty effects and learning effects over time.
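One way to control Type I error across interim looks is a Lan-DeMets O'Brien-Fleming-type alpha-spending function, which spends almost no alpha at early looks (when novelty effects are strongest) and the bulk at the final analysis. The sketch below computes only the cumulative alpha spent at each information fraction; converting increments into actual z-boundaries requires the joint correlation structure of the test statistics and is better done with a dedicated package (an assumption here, not something the prompt specifies):

```python
from math import sqrt
from statistics import NormalDist

def obf_spend(t: float, alpha: float = 0.05) -> float:
    """Cumulative alpha spent at information fraction t (0 < t <= 1)
    under the O'Brien-Fleming-type spending function:
    f(t) = 2 * (1 - Phi(z_{1-alpha/2} / sqrt(t))).
    At t = 1 this equals alpha exactly.
    """
    z = NormalDist()
    return 2.0 * (1.0 - z.cdf(z.inv_cdf(1 - alpha / 2) / sqrt(t)))

# Four planned looks during the ramp, e.g. at 25/50/75/100% of target sample.
looks = [0.25, 0.50, 0.75, 1.00]
spent = [obf_spend(t) for t in looks]  # monotonically increasing, ends at 0.05
```

Because early boundaries are extreme, a strong early "win" during a novelty-prone period rarely triggers a stop, while futility rules can still retire clearly failing treatments.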
(d) List at least three guardrails (e.g., abuse reports, latency p95, session length) and define actionable thresholds. Explain what you would do if the primary metric improves but a guardrail degrades.
(e) If network effects (users replying across treatments) create interference, describe a cluster- or geo-based randomization design and its trade-offs versus individual-level randomization.
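The main statistical cost of cluster randomization can be quantified with the design effect, DE = 1 + (m - 1) * ICC, where m is the average cluster size and ICC is the intracluster correlation of the outcome; the effective sample size is n / DE. A minimal sketch, with the example numbers chosen for illustration only:

```python
def design_effect(cluster_size: float, icc: float) -> float:
    """Variance inflation from randomizing clusters of average size m
    with intracluster correlation ICC: DE = 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc

def effective_n(n_users: int, cluster_size: float, icc: float) -> float:
    """Number of independent users the clustered sample is 'worth'."""
    return n_users / design_effect(cluster_size, icc)

# e.g. 2M users in communities of ~500 with ICC = 0.01:
# DE = 5.99, so the sample behaves like ~334k independent users.
```

This makes the trade-off concrete: clusters absorb reply-level interference, but even a modest ICC can shrink effective power by several-fold, which should feed back into the part (b) sample-size calculation.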