A/B Test for Rare Weekly Cancellations — Planning, Testing, and Monitoring
You are testing whether a product change affects weekly cancellation probability in a streaming service. Cancellations are rare and binary (canceled vs. not). Baseline weekly cancellation probability is p0 = 0.003; you want to detect an increase to p1 = 0.0035 with two-sided alpha = 0.05 and power = 0.80 using equal allocation.
Perform the following:
Test choice
Decide whether to use a normal-approximation z-test, Fisher's exact test, or a mid-p variant, and justify using expected counts criteria.
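A minimal sketch of the expected-count check, assuming the commonly cited rule of thumb that every expected cell of the 2x2 table should be at least 5 (the threshold `min_expected` and both function names are illustrative assumptions, not part of the problem statement). It evaluates the expected events per arm under the pooled rate for a few hypothetical arm sizes.

```python
def expected_cell_counts(n_per_arm, p_control, p_treat):
    """Expected events and non-events per arm under the pooled (null) rate."""
    pooled = (p_control + p_treat) / 2.0  # equal allocation assumed
    events = n_per_arm * pooled
    return events, n_per_arm - events


def normal_approximation_ok(n_per_arm, p_control, p_treat, min_expected=5.0):
    """True when every expected cell count reaches the rule-of-thumb minimum."""
    return min(expected_cell_counts(n_per_arm, p_control, p_treat)) >= min_expected


# Hypothetical arm sizes, using p0 = 0.003 and p1 = 0.0035 from the problem.
for n in (1_000, 10_000, 200_000):
    events, _ = expected_cell_counts(n, 0.003, 0.0035)
    print(n, round(events, 2), normal_approximation_ok(n, 0.003, 0.0035))
```

Under this rule the z-test is only doubtful for very small arms. Once each arm expects hundreds of events, Fisher's exact test and the mid-p variant should give nearly the same answer as the z-test, with Fisher's test being the more conservative of the three.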
Sample size (z-test)
Compute, or outline the computation of, the required sample size per arm for the two-sample z-test for proportions:
- Without a continuity correction (CC).
- With a continuity correction.
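One way to carry out both computations is sketched below. It assumes the usual pooled-variance formula n = (z_{1-alpha/2} * sqrt(2 * pbar * qbar) + z_{power} * sqrt(p0 * q0 + p1 * q1))^2 / (p1 - p0)^2 for the uncorrected case, and the Fleiss-type adjustment n_cc = (n / 4) * (1 + sqrt(1 + 4 / (n * |p1 - p0|)))^2 for the corrected case. The function names are illustrative.

```python
import math
from statistics import NormalDist


def n_per_arm_ztest(p0, p1, alpha=0.05, power=0.80):
    """Per-arm sample size, two-sided two-sample z-test, equal allocation, no CC."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_power = NormalDist().inv_cdf(power)
    p_bar = (p0 + p1) / 2.0
    null_sd = math.sqrt(2.0 * p_bar * (1.0 - p_bar))         # SD term under H0
    alt_sd = math.sqrt(p0 * (1.0 - p0) + p1 * (1.0 - p1))    # SD term under H1
    return (z_alpha * null_sd + z_power * alt_sd) ** 2 / (p1 - p0) ** 2


def n_per_arm_with_cc(n_uncorrected, p0, p1):
    """Fleiss-type continuity-corrected per-arm sample size."""
    delta = abs(p1 - p0)
    return n_uncorrected / 4.0 * (1.0 + math.sqrt(1.0 + 4.0 / (n_uncorrected * delta))) ** 2


n_plain = n_per_arm_ztest(0.003, 0.0035)
n_cc = n_per_arm_with_cc(n_plain, 0.003, 0.0035)
print(math.ceil(n_plain), math.ceil(n_cc))
```

With the problem's inputs, both results are on the order of 200,000 users per arm. The correction adds roughly 2 / |p1 - p0|, about 4,000 users per arm.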
CUPED variance reduction
Show how adjusting for a continuous pre-period covariate (e.g., watch-hours) via CUPED changes the variance and the effective sample size. Derive the adjustment factor 1 − R^2, where R is the correlation between the covariate and the outcome.
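The adjustment can be sketched as follows. With Y_adj = Y − θ(X − mean(X)) and θ = Cov(Y, X) / Var(X), the variance becomes Var(Y_adj) = Var(Y) − Cov(Y, X)^2 / Var(X) = Var(Y)(1 − R^2). The required sample size therefore scales by 1 − R^2, which is equivalent to an effective sample size of n / (1 − R^2). The code below checks this identity on synthetic data. The data-generating model (gamma-distributed watch-hours, with cancellation probability falling as hours rise) is purely hypothetical.

```python
import math
import random


def sample_variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def cuped_adjust(y, x):
    """Return CUPED-adjusted outcomes, theta, and the squared sample correlation."""
    n = len(y)
    mean_y, mean_x = sum(y) / n, sum(x) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    var_x, var_y = sample_variance(x), sample_variance(y)
    theta = cov_xy / var_x
    r_squared = cov_xy ** 2 / (var_x * var_y)
    adjusted = [b - theta * (a - mean_x) for a, b in zip(x, y)]
    return adjusted, theta, r_squared


# Synthetic, hypothetical data: heavier viewers cancel less often,
# with a marginal cancellation rate of roughly 0.003.
rng = random.Random(7)
watch_hours = [rng.gammavariate(2.0, 5.0) for _ in range(50_000)]
canceled = [1.0 if rng.random() < 0.00675 * math.exp(-h / 10.0) else 0.0
            for h in watch_hours]

adjusted, theta, r2 = cuped_adjust(canceled, watch_hours)
ratio = sample_variance(adjusted) / sample_variance(canceled)
print(f"theta={theta:.6f}  R^2={r2:.6f}  variance ratio={ratio:.6f}")
print(f"sample-size multiplier 1 - R^2 = {1.0 - r2:.6f}")
```

For a rare binary outcome, the correlation with a continuous covariate is often small, so the realized reduction may be modest. Estimate R^2 on pre-experiment data before counting on it in the sample-size plan.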
Confidence intervals
Provide an exact or conservative 95% confidence interval for each of the following, and state clearly how to construct it from the observed counts:
- The risk difference (p1 − p0).
- The relative risk (p1 / p0).
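One conservative construction, sketched below, is to compute an exact Clopper-Pearson interval for each arm at the 97.5% level and combine the bounds. By the Bonferroni inequality both arm intervals hold jointly with probability at least 95%, so [L1 − U0, U1 − L0] covers the risk difference and [L1 / U0, U1 / L0] covers the relative risk with at least 95% confidence. This is a simple substitute for unconditional exact methods, not the only valid choice. The counts in the example are hypothetical.

```python
import math


def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p), summed in log space for stability."""
    if x < 0:
        return 0.0
    if x >= n or p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 0.0
    log_p, log_q, log_n_fact = math.log(p), math.log1p(-p), math.lgamma(n + 1)
    total = 0.0
    for k in range(x + 1):
        log_pmf = (log_n_fact - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                   + k * log_p + (n - k) * log_q)
        total += math.exp(log_pmf)
    return min(total, 1.0)


def clopper_pearson(x, n, alpha):
    """Exact (Clopper-Pearson) interval with coverage of at least 1 - alpha."""
    def bisect(still_too_low):
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = 0.5 * (lo + hi)
            if still_too_low(mid):
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    # Lower bound solves P(X >= x | p) = alpha / 2; upper solves P(X <= x | p) = alpha / 2.
    lower = 0.0 if x == 0 else bisect(lambda p: 1.0 - binom_cdf(x - 1, n, p) < alpha / 2.0)
    upper = 1.0 if x == n else bisect(lambda p: binom_cdf(x, n, p) > alpha / 2.0)
    return lower, upper


def conservative_intervals(x_treat, n_treat, x_control, n_control, alpha=0.05):
    """Bonferroni-combined intervals for the risk difference and relative risk."""
    lt, ut = clopper_pearson(x_treat, n_treat, alpha / 2.0)
    lc, uc = clopper_pearson(x_control, n_control, alpha / 2.0)
    risk_difference = (lt - uc, ut - lc)
    relative_risk = (lt / uc, ut / lc if lc > 0.0 else math.inf)
    return risk_difference, relative_risk


# Hypothetical observed counts: 715 / 200,000 treated vs. 610 / 200,000 control.
rd, rr = conservative_intervals(715, 200_000, 610, 200_000)
print("risk difference:", rd)
print("relative risk:", rr)
```

Because the combination ignores the dependence structure of the two bounds, these intervals are wider than strictly necessary. That is the price of guaranteed coverage.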
Interim monitoring
If you perform 4 weekly looks (three interim analyses plus the final one), specify an alpha-spending function and give approximate adjusted critical values at each look.
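As one possible choice, the sketch below uses a Lan-DeMets O'Brien-Fleming-type spending function. Each side spends 2 − 2Φ(z / sqrt(t)) with z = Φ^{-1}(1 − alpha/4), so the two sides together spend alpha = 0.05 at t = 1. The code assumes equally spaced information fractions t = 0.25, 0.5, 0.75, 1 (an assumption, since weekly enrollment need not be even) and solves for symmetric boundaries by numerical integration over a midpoint grid. The function names and the grid size are illustrative.

```python
import math
from statistics import NormalDist

STD = NormalDist()


def upper_tail(x):
    """P(Z > x) for a standard normal, accurate far into the tail."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def obf_spend(t, alpha=0.05):
    """Cumulative two-sided alpha spent at information fraction t."""
    z = STD.inv_cdf(1.0 - alpha / 4.0)
    return 4.0 * upper_tail(z / math.sqrt(t))


def spending_boundaries(fractions, alpha=0.05, grid=400):
    """Symmetric two-sided z critical values at each look."""
    # Track the sub-density of the Brownian score B(t) on the continuation region.
    nodes, weights = [0.0], [1.0]
    t_prev, spent_prev, critical = 0.0, 0.0, []
    for t in fractions:
        sd = math.sqrt(t - t_prev)
        target = obf_spend(t, alpha) - spent_prev  # alpha to spend at this look

        def crossing_probability(c):
            b = c * math.sqrt(t)
            return sum(w * (upper_tail((b + u) / sd) + upper_tail((b - u) / sd))
                       for u, w in zip(nodes, weights))

        lo, hi = 0.0, 10.0
        for _ in range(60):  # bisection: crossing probability decreases in c
            mid = 0.5 * (lo + hi)
            if crossing_probability(mid) > target:
                lo = mid
            else:
                hi = mid
        c = 0.5 * (lo + hi)
        critical.append(c)

        # Propagate the sub-density to the new continuation region (-b, b).
        b = c * math.sqrt(t)
        h = 2.0 * b / grid
        new_nodes = [-b + (i + 0.5) * h for i in range(grid)]
        weights = [h * sum(w * STD.pdf((x - u) / sd) for u, w in zip(nodes, weights)) / sd
                   for x in new_nodes]
        nodes, t_prev, spent_prev = new_nodes, t, obf_spend(t, alpha)
    return critical


bounds = spending_boundaries([0.25, 0.5, 0.75, 1.0])
print([round(c, 3) for c in bounds])
```

Under these assumptions the boundaries should come out close to the commonly tabulated values of about 4.33, 2.96, 2.36, and 2.01. Early looks demand very strong evidence, and the final look is only slightly stricter than the fixed-sample 1.96.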
Bayesian alternative
Explain when a Bayesian Beta–Binomial model would be preferable. Specify reasonable priors and a sequential stopping rule (e.g., stop for harm if P(Δ > 0) > 0.95, where Δ = p_treat − p_control), and explain how to compute the posterior probability it relies on.
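The posterior probability in the stopping rule can be approximated by Monte Carlo, as in the sketch below. With a Beta(a, b) prior on each arm's cancellation probability and x cancellations among n users, the posterior is Beta(a + x, b + n − x), and P(Δ > 0) is estimated as the share of paired posterior draws in which the treatment rate exceeds the control rate. The Jeffreys prior Beta(0.5, 0.5) used as the default is an assumption. A weakly informative prior centered on the baseline, such as Beta(3, 997), would be another reasonable choice. The weekly cumulative counts are hypothetical.

```python
import random


def prob_treatment_worse(x_treat, n_treat, x_control, n_control,
                         prior_a=0.5, prior_b=0.5, draws=20_000, seed=0):
    """Monte Carlo estimate of P(p_treat > p_control | data) under Beta priors."""
    rng = random.Random(seed)
    worse = 0
    for _ in range(draws):
        p_t = rng.betavariate(prior_a + x_treat, prior_b + n_treat - x_treat)
        p_c = rng.betavariate(prior_a + x_control, prior_b + n_control - x_control)
        worse += p_t > p_c
    return worse / draws


# Hypothetical cumulative counts at each weekly look:
# (treated cancellations, treated users, control cancellations, control users)
weekly_looks = [
    (160, 50_000, 150, 50_000),
    (330, 100_000, 300, 100_000),
    (510, 150_000, 450, 150_000),
    (700, 200_000, 600, 200_000),
]
for week, (xt, nt, xc, nc) in enumerate(weekly_looks, start=1):
    posterior_prob = prob_treatment_worse(xt, nt, xc, nc)
    print(f"week {week}: P(delta > 0) = {posterior_prob:.3f}")
    if posterior_prob > 0.95:
        print("stopping rule for harm triggered")
        break
```

Checking a fixed posterior threshold at every look still raises the frequentist false-alarm rate. If that rate matters, calibrate the threshold by simulation under the null.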