This question evaluates a data scientist's mastery of A/B testing fundamentals: statistical power and sample-size calculations, effect-size and variance considerations, allocation strategies, variance-reduction methods such as CUPED, and experiment diagnostics including sample ratio mismatch (SRM), instrumentation audits, and imbalance and segmentation checks.
You are analyzing a two-proportion (binary conversion) A/B test with independent users, no clustering/spillover, and equal exposure eligibility per day unless specified. Answer all parts concisely and show calculations where requested.
Define statistical power for a two-proportion A/B test and list the primary levers that increase power, ranking them by typical practical impact (largest to smallest). Briefly explain trade-offs. Include:
Given:
Compute:
Show formulas and numeric results.
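As a hedged sketch of the kind of calculation requested, one common normal-approximation formula for a two-sided two-proportion test is n per variant = (z_{1-α/2} + z_{1-β})² · [p1(1 − p1) + p2(1 − p2)] / (p2 − p1)², with duration = total required users / daily eligible users. The function names and the numeric inputs below (10% baseline, 11% treatment rate, 5,000 eligible users per day) are illustrative placeholders, not the values given in the question.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(p_control, p_treatment, alpha=0.05, power=0.80):
    """Per-variant n for a two-sided two-proportion z-test (unpooled variance,
    normal approximation, equal allocation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the target power
    variance_sum = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    delta = p_treatment - p_control                # absolute effect size
    return math.ceil((z_alpha + z_beta) ** 2 * variance_sum / delta ** 2)


def duration_days(n_per_variant, daily_eligible_users, n_variants=2):
    """Whole days needed to collect n_per_variant users in every variant."""
    return math.ceil(n_variants * n_per_variant / daily_eligible_users)


# Placeholder inputs for illustration only; substitute the question's values.
n = sample_size_per_variant(p_control=0.10, p_treatment=0.11)
days = duration_days(n, daily_eligible_users=5_000)
print(f"n per variant: {n}, duration: {days} days")
```

Because n scales with 1 / (p2 − p1)², halving the detectable effect roughly quadruples the required sample, which is why the minimum detectable effect is usually the most influential input.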
Recompute (b) assuming CUPED achieves a 30% relative variance reduction (R² = 0.30). What is the new sample size per variant and duration?
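A minimal sketch of the CUPED adjustment, assuming the usual result that the adjusted metric's variance is (1 − R²) times the original, so the required sample size and the duration at fixed daily traffic shrink by the same factor. The helper name and the baseline n of 10,000 are hypothetical; the baseline n would come from the unadjusted calculation.

```python
import math


def cuped_sample_size(n_baseline, r_squared):
    """Scale an unadjusted per-variant sample size by the CUPED variance factor.

    Assumes the variance after adjustment is (1 - r_squared) times the original
    and that the effect size and test settings are unchanged.
    """
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must be in [0, 1)")
    return math.ceil(n_baseline * (1.0 - r_squared))


# Hypothetical baseline of 10,000 users per variant, with R^2 = 0.30.
n_adjusted = cuped_sample_size(10_000, 0.30)
print(f"adjusted n per variant: {n_adjusted}")
```

With R² = 0.30 the scaling factor is 0.70, so both the sample size and the run time fall by about 30%, before rounding up to whole users and whole days.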
How does switching to a 90/10 allocation (90% control, 10% treatment) affect power at fixed total traffic? Provide intuition and, if possible, a quantitative comparison to equal split.
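One way to quantify the allocation effect, sketched under a normal approximation: at fixed total traffic N with treatment fraction f, the variance of the difference in means is proportional to 1/(fN) + 1/((1 − f)N), so relative to a 50/50 split it is inflated by 1 / (4f(1 − f)) when both arms have similar per-user variance. The function names and the example inputs are illustrative assumptions, not values from the question.

```python
from statistics import NormalDist


def variance_inflation(treatment_fraction):
    """Variance of the difference relative to a 50/50 split at fixed total N
    (assumes equal per-user variance in both arms)."""
    return 1.0 / (4.0 * treatment_fraction * (1.0 - treatment_fraction))


def approx_power(p_control, p_treatment, n_total, treatment_fraction, alpha=0.05):
    """Approximate two-sided z-test power, ignoring the negligible opposite tail."""
    n_treatment = n_total * treatment_fraction
    n_control = n_total - n_treatment
    standard_error = (
        p_control * (1 - p_control) / n_control
        + p_treatment * (1 - p_treatment) / n_treatment
    ) ** 0.5
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(abs(p_treatment - p_control) / standard_error - z_alpha)


# Illustrative inputs only: 10% vs 11% conversion, 30,000 users in total.
for fraction in (0.5, 0.1):
    print(
        f"treatment share {fraction:.0%}: "
        f"variance x{variance_inflation(fraction):.2f}, "
        f"power {approx_power(0.10, 0.11, 30_000, fraction):.2f}"
    )
```

For a 90/10 split the inflation factor is 1 / 0.36, about 2.8, meaning the test behaves as if it had only about 36% of the total traffic under an equal split; the smaller arm dominates the standard error.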
Your test, run for the duration from (b), returns a statistically significant −2% lift (treatment worse), contrary to your prior expectation of +7%. Outline a step-by-step diagnostic plan before drawing conclusions. Include:
Propose an evidence-based decision tree for what to do next. Specify when to: