A/B Test: Active Minutes After Two Weeks
Scenario
You ran a two-week A/B experiment on a new algorithm. The primary metric is user active minutes. Assume the unit of randomization is the user, and each user is assigned to control or treatment for the full duration. You will analyze the per-user total (or average) active minutes over the two weeks.
Tasks
- Choose the most appropriate statistical test to compare mean active minutes between control and treatment. State the assumptions and how you would validate them.
- Show how to compute the p-value and a 95% confidence interval for the difference in means; interpret both in business terms.
- Explain Type I and Type II errors for this experiment, and how you would adjust for multiple comparisons if tracking five secondary metrics.
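As a starting point for the first two tasks, here is a minimal sketch of a Welch-style comparison of means using only the Python standard library. The function name welch_compare and the toy inputs are hypothetical, not part of the exercise. The p-value and interval use a normal approximation to the t distribution, which is an assumption that is reasonable only when each arm has many users; with SciPy available, scipy.stats.ttest_ind(treatment, control, equal_var=False) gives the exact Welch p-value.

```python
import math
from statistics import NormalDist, mean, variance


def welch_compare(control, treatment, alpha=0.05):
    """Compare mean active minutes (treatment minus control), unequal variances.

    Inputs are per-user totals (or averages) over the experiment window.
    ASSUMPTION: the p-value and CI use the normal approximation to the
    t distribution, which is adequate only for large samples per arm.
    """
    n_c, n_t = len(control), len(treatment)
    v_c, v_t = variance(control), variance(treatment)  # sample variances (n - 1)

    diff = mean(treatment) - mean(control)
    se = math.sqrt(v_c / n_c + v_t / n_t)  # Welch standard error
    t_stat = diff / se

    # Welch-Satterthwaite degrees of freedom (reported for reference; an exact
    # t-based p-value would use this value instead of the normal approximation).
    df = (v_c / n_c + v_t / n_t) ** 2 / (
        (v_c / n_c) ** 2 / (n_c - 1) + (v_t / n_t) ** 2 / (n_t - 1)
    )

    z = NormalDist()
    p_value = 2 * (1 - z.cdf(abs(t_stat)))  # two-sided
    crit = z.inv_cdf(1 - alpha / 2)
    ci = (diff - crit * se, diff + crit * se)

    return {"diff": diff, "se": se, "t": t_stat, "df": df,
            "p_value": p_value, "ci": ci}


if __name__ == "__main__":
    # Toy, made-up numbers purely to show the call shape.
    control = [30.0, 42.5, 18.0, 55.0, 61.0, 27.5]
    treatment = [35.0, 48.0, 22.0, 58.5, 70.0, 31.0]
    print(welch_compare(control, treatment))
```

In business terms, the interval is the range of plausible lifts in active minutes per user over the two weeks; if it excludes zero at the 95% level, the two-sided p-value is below 0.05.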
Hints
- Consider Welch's t-test vs. non-parametric options (Mann–Whitney U, permutation test) and bootstrap CIs.
- For multiple comparisons, consider Holm–Bonferroni or Benjamini–Hochberg.
- Discuss power calculations and minimal detectable effect (MDE).
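The multiple-comparison and MDE hints can be sketched as follows, again with the standard library only. The function names holm_adjust, bh_adjust, and mde_two_sample are hypothetical. The MDE formula assumes a two-sided test, equal group sizes, a common standard deviation, and a normal approximation; treat it as a planning approximation rather than an exact power calculation.

```python
import math
from statistics import NormalDist


def holm_adjust(p_values):
    """Holm-Bonferroni adjusted p-values (controls family-wise error rate)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # ascending p
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        adj = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, adj)  # enforce monotonicity
        adjusted[idx] = running_max
    return adjusted


def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values (controls false discovery rate)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m - 1, -1, -1):  # walk from the largest p downward
        idx = order[rank]
        adj = p_values[idx] * m / (rank + 1)
        running_min = min(running_min, adj)  # enforce monotonicity
        adjusted[idx] = running_min
    return adjusted


def mde_two_sample(sd, n_per_group, alpha=0.05, power=0.80):
    """Approximate minimal detectable difference in means.

    ASSUMPTIONS: two-sided test, equal group sizes, common standard
    deviation `sd`, normal approximation.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return (z_alpha + z_power) * sd * math.sqrt(2 / n_per_group)


if __name__ == "__main__":
    # Made-up p-values standing in for five secondary metrics.
    secondary = [0.01, 0.04, 0.03, 0.005, 0.20]
    print("Holm:", holm_adjust(secondary))
    print("BH:  ", bh_adjust(secondary))
    # Hypothetical planning inputs: sd of 60 minutes, 10,000 users per arm.
    print("MDE (minutes):", mde_two_sample(sd=60.0, n_per_group=10_000))
```

Holm controls the chance of any false positive across the five secondary metrics (the Type I error rate for the family), while Benjamini-Hochberg controls the expected share of false positives among the metrics declared significant, so it is less conservative. The MDE ties Type II error to sample size: a true effect smaller than the MDE will be missed more often than the chosen power allows.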