A/B Test Powering and Error Control (Two-Proportion Z-Test)
Context: You are planning a two-arm A/B test on sign-up conversion. The current baseline conversion rate is p0 = 0.10. You want to detect an absolute uplift of +0.01 (to p1 = 0.11) using a two-sided test.
Assumptions:
- Equal allocation to A and B (n per arm).
- Two-sided significance level α = 0.05, desired power 1 − β = 0.80.
- Normal approximation (z-test) for two independent proportions.
Tasks
- Derive the required sample size per arm to detect a 1 pp uplift (0.10 → 0.11). Show the formula and plug in the numbers.
- You will also monitor 10 secondary metrics and must control the family-wise error rate (FWER) using Bonferroni.
  - What is the per-metric α?
  - How does this change your required sample size for detecting the same 1 pp uplift, under reasonable interpretations of FWER control?
- If you plan to peek daily over 14 days, describe a valid sequential scheme (e.g., O’Brien–Fleming or Pocock alpha-spending) and how it alters the stopping boundaries relative to a fixed-sample analysis.
- Explain the practical cost of a Type II error (false negative) in this test. Provide one concrete way to reduce β without increasing α, and quantify the trade-off.
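As a starting point for the numerical parts of the tasks above, the following is a minimal Python sketch, not a reference solution. It assumes the common unpooled-variance form of the sample size formula, n = (z_{1−α/2} + z_{1−β})² · [p0(1 − p0) + p1(1 − p1)] / (p1 − p0)²; a pooled-variance variant gives a nearly identical answer here. It also assumes the Bonferroni correction is applied over the 10 secondary metrics only (α/10), which is one of several reasonable readings. The helper names `n_per_arm`, `obf_spend`, and `pocock_spend` are hypothetical, and the two spending functions are the Lan-DeMets approximations to O’Brien–Fleming and Pocock boundaries rather than the exact group-sequential boundaries.

```python
from math import ceil, e, log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def n_per_arm(p0, p1, alpha=0.05, power=0.80):
    """Per-arm n for a two-sided two-proportion z-test (unpooled variance)."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = STD_NORMAL.inv_cdf(power)           # quantile for power 1 - beta
    variance = p0 * (1 - p0) + p1 * (1 - p1)     # sum of the two Bernoulli variances
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p0) ** 2)


def obf_spend(t, alpha=0.05):
    """Lan-DeMets O'Brien-Fleming-type cumulative alpha spent at information fraction t."""
    z = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return 2 - 2 * STD_NORMAL.cdf(z / sqrt(t))


def pocock_spend(t, alpha=0.05):
    """Lan-DeMets Pocock-type cumulative alpha spent at information fraction t."""
    return alpha * log(1 + (e - 1) * t)


# Fixed-sample design from the Context and Assumptions.
n_fixed = n_per_arm(0.10, 0.11)

# Bonferroni over 10 metrics: per-metric alpha = 0.05 / 10 = 0.005 (assumed reading).
n_bonferroni = n_per_arm(0.10, 0.11, alpha=0.05 / 10)

# Lower beta at unchanged alpha: raise power from 0.80 to 0.90.
n_power_90 = n_per_arm(0.10, 0.11, power=0.90)

print("n per arm, alpha=0.05, power=0.80:", n_fixed)
print("n per arm, alpha=0.005, power=0.80:", n_bonferroni)
print("n per arm, alpha=0.05, power=0.90:", n_power_90)

# Cumulative alpha available at each of 14 equally spaced daily looks.
for day in (1, 7, 14):
    t = day / 14
    print(day, obf_spend(t), pocock_spend(t))
```

Under these assumptions the fixed-sample design needs roughly 14,750 users per arm, the Bonferroni-adjusted design roughly 25,000, and raising power to 0.90 at α = 0.05 roughly 19,750. The spending functions show the qualitative difference between the two schemes: the O’Brien–Fleming type spends almost no α at early looks, so its early boundaries are very strict and its final boundary stays close to the fixed-sample critical value, while the Pocock type spends α more evenly across looks.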