This question evaluates a data scientist's competency in product analytics, experimental design, causal inference, and launch validation. It requires defining primary and guardrail metrics, choosing the test unit and accounting for spillover, planning power and duration, stating expected behavioral signals, proposing quasi-experimental alternatives, and setting scale-readiness criteria. It is commonly asked in Analytics & Experimentation interviews because organizations need assurance that a pilot can be scaled safely and reliably. It tests practical application of experimental methods alongside conceptual understanding of statistical power, spillover risk, diagnostic checks, and operational concerns such as privacy and abuse monitoring.

Context: You are a data scientist evaluating whether a limited-market pilot of Facebook Dating is ready to scale. Design a rigorous validation plan covering metrics, experimentation, power, behavioral expectations, alternatives to RCTs, and go/no-go criteria.
Define:
Propose either a market-level rollout test (geo-ramp) or user-level randomization. Justify:
Provide a back-of-the-envelope sample size and duration plan under realistic baseline rates and minimum detectable effects (MDE). Describe:
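As a rough illustration of the back-of-the-envelope calculation, the sketch below uses the standard normal-approximation formula for comparing two proportions under user-level randomization. The 10% baseline rate, 5% relative MDE, and 10,000 eligible users per day are hypothetical placeholders, not figures from the pilot, and the function names are made up for this example.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(baseline, relative_mde, alpha=0.05, power=0.80):
    """Approximate users per arm for a two-sided two-proportion z-test."""
    p1 = baseline
    p2 = baseline * (1 + relative_mde)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for target power
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance_sum / (p2 - p1) ** 2
    return math.ceil(n)


def duration_days(n_per_arm, daily_eligible_users, arms=2):
    """Days needed to enroll all arms, rounded up to whole weeks
    so that every day of the week is represented equally."""
    raw_days = math.ceil(arms * n_per_arm / daily_eligible_users)
    return math.ceil(raw_days / 7) * 7


# Hypothetical inputs: 10% baseline rate, 5% relative MDE,
# 10,000 newly eligible users per day across both arms.
n = sample_size_per_arm(baseline=0.10, relative_mde=0.05)
days = duration_days(n, daily_eligible_users=10_000)
print(f"users per arm: {n}, planned duration: {days} days")
```

A geo-ramp design would need far more observations than this suggests, because the effective sample size is driven by the number of markets rather than the number of users. Rounding up to whole weeks is a common guard against day-of-week effects, and longer-horizon metrics such as retention may dictate the duration instead.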
Describe expected shapes for:
If an RCT is infeasible, propose a quasi-experimental approach (e.g., synthetic controls or staggered DiD with pre-trend checks). Detail diagnostics required before trusting the estimate.
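A minimal sketch of the difference-in-differences idea with a placebo pre-trend check follows. It assumes one treated and one control market with weekly metric values; the numbers are toy values invented for illustration, and the function names are hypothetical. A real analysis would use many markets, regression with clustered standard errors, and an event-study plot.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Classic 2x2 difference-in-differences on period means."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


def placebo_pretrend(treated_pre, control_pre):
    """Placebo DiD inside the pre-period: split it in half and pretend
    the launch happened at the midpoint. A value far from zero suggests
    the parallel-trends assumption is doubtful."""
    half = len(treated_pre) // 2
    return did_estimate(
        treated_pre[:half], treated_pre[half:],
        control_pre[:half], control_pre[half:],
    )


# Toy weekly values of a hypothetical engagement rate (illustrative only).
treated_pre = [0.20, 0.21, 0.20, 0.21]
control_pre = [0.18, 0.19, 0.18, 0.19]
treated_post = [0.25, 0.26]
control_post = [0.19, 0.20]

effect = did_estimate(treated_pre, treated_post, control_pre, control_post)
placebo = placebo_pretrend(treated_pre, control_pre)
print(f"DiD estimate: {effect:.3f}, placebo pre-trend: {placebo:.3f}")
```

The placebo check is only one diagnostic. Before trusting the estimate, the pre-period fit and the stability of the result under different control markets also deserve scrutiny.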
Define quantitative and qualitative gates to expand, pause, or roll back. Include considerations for privacy readiness (e.g., DPIA/SOC 2 for any third-party services), abuse tooling, and on-call load.