A/B change in cancellation rate (before vs after)
Context: You are evaluating a small product tweak intended to reduce cancellations. Treat each trip request as a Bernoulli outcome (1 = canceled, 0 = not canceled) to start.
Observed data:
- Before: n1 = 50,000 requested trips, cancel rate p1 = 7.2%.
- After: n2 = 48,000 requested trips, cancel rate p2 = 6.6%.
Tasks:
- Hypothesis test and CI
  - Choose an appropriate test for the change in cancellation rate. State H0 and HA.
  - Compute a 95% confidence interval for the absolute rate difference Δ = p_after − p_before (that is, p2 − p1) and interpret it in practical terms.
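One way to approach the test and interval is a two-sided two-proportion z-test (H0: p2 = p1, HA: p2 ≠ p1) with a Wald interval for Δ. The sketch below is a minimal illustration under that assumption, not a prescribed solution. The cancellation counts 3,600 and 3,168 are derived from the stated rates and sample sizes, and the function name `two_proportion_test` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_test(x1, n1, x2, n2, alpha=0.05):
    """Two-sided z-test and Wald CI for the difference p2 - p1 (after minus before)."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p2 - p1

    # Test statistic uses the pooled rate, since H0 says both periods share one rate.
    pooled = (x1 + x2) / (n1 + n2)
    se_pooled = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = diff / se_pooled
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # The interval uses the unpooled standard error, since it does not assume H0.
    se_unpooled = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    ci = (diff - z_crit * se_unpooled, diff + z_crit * se_unpooled)
    return diff, z, p_value, ci


# Counts implied by the stated data: 7.2% of 50,000 and 6.6% of 48,000.
diff, z, p_value, ci = two_proportion_test(3600, 50_000, 3168, 48_000)
print(f"diff={diff:.4f}, z={z:.2f}, p={p_value:.5f}, CI=({ci[0]:.4f}, {ci[1]:.4f})")
```

With these inputs the difference is −0.6 percentage points, z is approximately −3.70, and the 95% interval runs from roughly −0.92 to −0.28 percentage points. Because this is a before/after comparison rather than a randomized split, the interval describes the observed change and does not by itself rule out time-related confounding.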
- Adjustment for confounding (city, hour-of-day)
  - Specify a logistic regression that adjusts for city and hour-of-day, and include one interaction you deem important.
  - Write the model formula, list key assumptions, and describe how you would check calibration and overdispersion.
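As one possible specification, the model could take the form below. The choice of a period × hour-of-day interaction is an assumption made for illustration (a period × city interaction would be equally defensible), and the symbols are introduced here rather than taken from the problem statement: after_i is an indicator for the post-change period, and α and γ are city and hour-of-day effects.

```latex
\operatorname{logit} \Pr(\text{cancel}_i = 1)
  = \beta_0
  + \beta_1\,\text{after}_i
  + \alpha_{\text{city}(i)}
  + \gamma_{\text{hour}(i)}
  + \delta_{\text{hour}(i)}\,\text{after}_i
```

For overdispersion, one common check on data grouped into G cells (for example city × hour × period) with n_g trips, y_g cancellations, fitted rate \hat{p}_g, and k fitted parameters is the Pearson dispersion statistic:

```latex
\hat{\phi} = \frac{1}{G - k} \sum_{g=1}^{G}
  \frac{\left(y_g - n_g \hat{p}_g\right)^2}{n_g\, \hat{p}_g \left(1 - \hat{p}_g\right)}
```

A value of \hat{\phi} well above 1 suggests more variation than the binomial model allows, which is what driver-level or city-level clustering would tend to produce.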
- Power and MDE
  - What per-arm sample size is required to detect an absolute change of 0.5 percentage points at α = 0.05 with 80% power (ignore clustering first)?
  - Then discuss how an intraclass correlation (ICC) of 0.02 at the driver level inflates the required N under a cluster-robust design. Show the design effect and the new N.
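The sample-size arithmetic can be sketched with a common normal-approximation formula, n = (z_{1−α/2} + z_{power})² · [p1(1 − p1) + p2(1 − p2)] / δ², followed by the design effect DEFF = 1 + (m − 1) · ICC. The problem does not state the average number of trips per driver m, so the value m = 20 below is purely an assumption for illustration. The baseline of 7.2% is taken from the observed data, and the function names are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(p_base, delta, alpha=0.05, power=0.80):
    """Per-arm n for a two-sided two-proportion test (unpooled-variance approximation)."""
    p_alt = p_base + delta
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance_sum = p_base * (1 - p_base) + p_alt * (1 - p_alt)
    return ceil((z_alpha + z_power) ** 2 * variance_sum / delta ** 2)


def design_effect(avg_cluster_size, icc):
    """Variance inflation from clustering: 1 + (m - 1) * ICC."""
    return 1 + (avg_cluster_size - 1) * icc


n_simple = n_per_arm(p_base=0.072, delta=-0.005)  # 7.2% -> 6.7%

# ASSUMPTION: 20 trips per driver on average; the problem statement gives no value.
assumed_trips_per_driver = 20
deff = design_effect(assumed_trips_per_driver, icc=0.02)
n_clustered = ceil(n_simple * deff)

print(n_simple, round(deff, 2), n_clustered)
```

Under these assumptions the unclustered requirement is roughly 40,600 trips per arm, the design effect is 1.38, and the clustered requirement is roughly 56,000 trips per arm. The inflation is sensitive to m: with the same ICC, 100 trips per driver would give a design effect of 2.98.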
- Nonparametric check (bootstrap)
  - Describe a bootstrap procedure to form a CI for the rate difference, stratified by city (stratified resampling).
  - Explain when this CI might be preferable to the z-approximation.
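A stratified percentile bootstrap could be sketched as follows: within each city and each period, resample trips with replacement at the original stratum size, recompute the overall rate difference, and take percentiles of the resampled differences. The city names and counts below are hypothetical, since the problem gives no per-city breakdown, and the sample sizes are kept small so the pure-Python example runs quickly.

```python
import random


def stratified_bootstrap_ci(before, after, n_boot=300, alpha=0.05, seed=0):
    """Percentile CI for (after rate - before rate), resampling within each city.

    before/after map city -> list of 0/1 outcomes (1 = canceled).
    """
    rng = random.Random(seed)

    def resampled_rate(arm):
        cancels = total = 0
        for outcomes in arm.values():
            # Resample within the stratum, keeping the stratum size fixed.
            draw = rng.choices(outcomes, k=len(outcomes))
            cancels += sum(draw)
            total += len(draw)
        return cancels / total

    diffs = sorted(
        resampled_rate(after) - resampled_rate(before) for _ in range(n_boot)
    )
    # Approximate percentile positions in the sorted bootstrap distribution.
    lo = diffs[int((alpha / 2) * n_boot)]
    hi = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


def make_arm(spec):
    """Expand {city: (n_trips, n_cancels)} into per-trip 0/1 outcomes."""
    return {city: [1] * c + [0] * (n - c) for city, (n, c) in spec.items()}


# HYPOTHETICAL per-city data, for illustration only.
before = make_arm({"city_a": (1500, 120), "city_b": (1000, 70), "city_c": (500, 30)})
after = make_arm({"city_a": (1500, 90), "city_b": (1000, 50), "city_c": (500, 25)})

lo, hi = stratified_bootstrap_ci(before, after)
print(f"95% bootstrap CI for the rate difference: ({lo:.4f}, {hi:.4f})")
```

This version resamples individual trips, so it still treats trips as independent within a city. If driver-level correlation matters, resampling whole drivers within each city (a cluster bootstrap) would be the more appropriate variant.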