This question evaluates understanding of hypothesis testing (Type I/II errors), cost-sensitive decision theory, sample size calculation for proportion tests, and multiple-testing corrections within the Statistics & Math domain for a Data Scientist role.
You are deciding whether to ship a new dispatch algorithm based on an A/B test. The business outcome of interest is gross bookings improvement. You will use a binary decision: ship or do not ship.
Assume an A/B test on a proportion metric (e.g., conversion), with equal allocation per arm.
(a) Define Type I (false positive, alpha) and Type II (false negative, beta) errors in this ship/no-ship context. Which error is worse in expectation and why?
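(b) Derive the decision rule that minimizes expected loss in terms of alpha and beta. Show explicitly how p (taken here as the prior probability that the new algorithm truly improves the metric), C_FP (the cost of a false positive), and C_FN (the cost of a false negative) enter the expression.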
(c) Choose concrete values for alpha and power (1 − beta) that are consistent with your cost-sensitive rule and justify them.
(d) Using your chosen alpha and power, compute the minimum sample size per arm for a two-sided two-sample test of the difference in proportions, with baseline conversion rate = 10% and minimum detectable effect (MDE) = +1 percentage point absolute (i.e., 10% vs 11%). State any standard approximations you use.
(e) If you must simultaneously test 10 metrics, explain how your alpha/power and sample size change under:
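For part (b), one way to set up the expected loss is sketched below. It assumes p is the prior probability that the new algorithm is truly better, that a false positive (shipping a non-improvement) costs C_FP, that a false negative (not shipping a real improvement) costs C_FN, and that correct decisions cost nothing. These interpretations are assumptions, since the prompt does not define the symbols.

```latex
\begin{aligned}
\mathbb{E}[L] &= (1-p)\,\alpha\,C_{FP} + p\,\beta\,C_{FN} \\
\frac{d\,\mathbb{E}[L]}{d\alpha} &= (1-p)\,C_{FP} + p\,C_{FN}\,\frac{d\beta}{d\alpha} = 0
\quad\Longrightarrow\quad
\frac{d\beta}{d\alpha} = -\frac{(1-p)\,C_{FP}}{p\,C_{FN}}
\end{aligned}
```

At a fixed sample size, beta falls as alpha rises, so under these assumptions the loss-minimizing operating point is where the slope of that tradeoff curve equals the negative cost-weighted prior odds. By the same expression, a false positive is the worse error in expectation when (1 − p)·alpha·C_FP exceeds p·beta·C_FN, and a false negative is worse otherwise.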
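For part (d), the calculation can be sketched with the usual normal approximation for a two-sided two-proportion z-test with equal allocation. The function name `sample_size_per_arm` is hypothetical, and alpha = 0.05 with power = 0.80 are illustrative defaults only, not values given in the prompt; part (c) asks you to choose and justify your own.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_arm(p1, p2, alpha=0.05, power=0.80):
    """Approximate n per arm for a two-sided two-proportion z-test.

    Assumes equal allocation and the normal approximation to the binomial.
    alpha and power defaults are illustrative assumptions.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = std_normal.inv_cdf(power)           # quantile for power = 1 - beta
    p_bar = (p1 + p2) / 2                        # pooled rate under the null
    # Null uses the pooled variance; alternative uses each arm's own variance.
    sd_null = sqrt(2 * p_bar * (1 - p_bar))
    sd_alt = sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    n = ((z_alpha * sd_null + z_beta * sd_alt) / (p2 - p1)) ** 2
    return ceil(n)


# Baseline 10% vs 11% (MDE = +1 percentage point absolute).
n = sample_size_per_arm(0.10, 0.11)
print(n)
```

With these illustrative settings the result is roughly 14,750 users per arm. Lowering alpha (for example, to account for several simultaneous tests) or raising power increases the required sample size, which the same function can show by changing its arguments.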