Differentiate Type I vs II errors under costs

This question evaluates understanding of hypothesis testing (Type I/II errors), cost-sensitive decision theory, sample size calculation for proportion tests, and multiple-testing corrections within the Statistics & Math domain for a Data Scientist role.

Question

Ship/No-Ship Decision: Type I/II Errors, Cost-Sensitive Testing, Sample Size, and Multiple Testing

Context

You are deciding whether to ship a new dispatch algorithm based on an A/B test. The business outcome of interest is gross bookings improvement. You will use a binary decision: ship or do not ship.

Beneficial threshold (effect of interest): at least +1% absolute improvement in the primary metric.
Prior probability that the true effect is beneficial (≥ +1%): p = 0.30.
Costs:
- False Positive (Type I): ship when the algorithm is not beneficial (effect < +1%); expected cost C_FP = $500,000.
- False Negative (Type II): do not ship when the algorithm is beneficial (effect ≥ +1%); expected cost C_FN = $50,000.

Assume an A/B test on a proportion metric (e.g., conversion), with equal allocation per arm.
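As a rough illustration of how the prior and the costs combine, the sketch below computes expected loss under a simplifying assumption that is not stated in the problem: alpha is treated as the probability of shipping when the algorithm is not beneficial, and beta as the probability of not shipping when it is. The function names `expected_loss` and `error_weights` are hypothetical, and the alpha = 0.05, beta = 0.20 inputs are conventional placeholders rather than recommended choices.

```python
def expected_loss(alpha, beta, p=0.30, c_fp=500_000, c_fn=50_000):
    """Expected loss of a test with error rates alpha and beta.

    Assumes (simplification) that alpha is P(ship | not beneficial)
    and beta is P(do not ship | beneficial).
    """
    return (1 - p) * alpha * c_fp + p * beta * c_fn


def error_weights(p=0.30, c_fp=500_000, c_fn=50_000):
    """Dollar weight attached to one unit of alpha and one unit of beta."""
    return (1 - p) * c_fp, p * c_fn


w_alpha, w_beta = error_weights()
print(f"weight on alpha: {w_alpha:,.0f}")
print(f"weight on beta:  {w_beta:,.0f}")
print(f"ratio alpha/beta weight: {w_alpha / w_beta:.1f}")
print(f"expected loss at alpha=0.05, beta=0.20: {expected_loss(0.05, 0.20):,.0f}")
```

Under these assumptions a unit of alpha carries roughly 23 times the expected cost of a unit of beta, which is the quantitative reason to favor a stricter alpha over higher power in this setting.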

Tasks

(a) Define Type I (false positive, alpha) and Type II (false negative, beta) errors in this ship/no-ship context. Which error is worse in expectation and why?

(b) Derive the decision rule that minimizes expected loss in terms of alpha and beta. Show explicitly how p, C_FP, and C_FN enter the expression.

(c) Choose concrete values for alpha and power (1 − beta) that are consistent with your cost-sensitive rule and justify them.

(d) Using your chosen alpha and power, compute the minimum sample size per arm for a two-sided difference-in-means test on a proportion metric with baseline conversion rate = 10% and minimum detectable effect (MDE) = +1% absolute (i.e., 10% vs 11%). State any standard approximations you use.

(e) If you must simultaneously test 10 metrics, explain how your alpha/power and sample size change under:

- Family-Wise Error Rate (FWER) control via Bonferroni.
- False Discovery Rate (FDR) control via Benjamini–Hochberg (BH).
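The sample-size arithmetic in tasks (d) and (e) can be sketched as follows. This is a minimal example assuming the standard normal approximation for a two-sided two-proportion z-test with unpooled variance and equal allocation; the helper name `sample_size_per_arm` is hypothetical, and alpha = 0.05 with power = 0.80 are placeholder values, not the answer to task (c). Only the Bonferroni case is shown, because BH thresholds depend on the observed p-values and do not reduce to a single planning alpha.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(p1, p2, alpha, power):
    """Per-arm n for a two-sided two-proportion z-test.

    Normal approximation, unpooled variance, equal allocation:
    n = (z_{1-alpha/2} + z_{power})^2 * (p1(1-p1) + p2(1-p2)) / (p2-p1)^2
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_power = std_normal.inv_cdf(power)
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_power) ** 2 * variance_sum / (p2 - p1) ** 2)


# Placeholder design values (assumptions for illustration only).
alpha, power, n_metrics = 0.05, 0.80, 10

n_single = sample_size_per_arm(0.10, 0.11, alpha, power)
# Bonferroni: each of the 10 metrics is tested at alpha / 10.
n_bonferroni = sample_size_per_arm(0.10, 0.11, alpha / n_metrics, power)

print(f"single metric, alpha={alpha}: {n_single} per arm")
print(f"Bonferroni, alpha={alpha / n_metrics}: {n_bonferroni} per arm")
```

With these placeholder inputs the single-metric requirement is on the order of 15,000 users per arm, and the Bonferroni-adjusted requirement is roughly 25,000 per arm. A stricter alpha chosen to reflect the cost asymmetry would raise both figures.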
