This question evaluates understanding of A/B test interpretation, statistical power and confidence intervals, asymmetric loss-aware decision-making, and experimental design adjustments in two-sample hypothesis testing.

You ran a two-sample A/B test on a primary mean metric (two-sided t-test). The original design targeted α = 0.05 and 80% power for a minimum detectable effect (MDE) of +0.6 units, assuming a baseline mean of 10.0 and standard deviation σ = 5.0.
After 14 days, you observe an estimated treatment effect Δ = +0.35 with a 95% confidence interval (CI) of [−0.05, +0.75].
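As a rough check on the numbers in this setup, the sketch below back-solves the per-arm sample size implied by the original design, the standard error implied by the reported CI, and the corresponding two-sided p-value. It uses a normal approximation in place of the t-distribution (reasonable at these sample sizes), assumes equal allocation and equal variance in both arms, and the function names are illustrative rather than taken from any particular library.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used as a large-sample stand-in for t


def n_per_arm(mde, sigma, alpha=0.05, power=0.80):
    """Per-arm sample size for a two-sided, two-sample test of means.

    Assumes equal allocation and a common standard deviation `sigma`.
    """
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = STD_NORMAL.inv_cdf(power)
    return math.ceil(2 * (sigma * (z_alpha + z_beta) / mde) ** 2)


def se_from_ci(lower, upper, level=0.95):
    """Standard error implied by a symmetric normal-theory confidence interval."""
    z = STD_NORMAL.inv_cdf(1 - (1 - level) / 2)
    return (upper - lower) / (2 * z)


def two_sided_p(estimate, se):
    """Two-sided p-value for H0: effect = 0 under a normal approximation."""
    z = abs(estimate) / se
    return 2 * (1 - STD_NORMAL.cdf(z))


if __name__ == "__main__":
    planned_n = n_per_arm(mde=0.6, sigma=5.0)
    se = se_from_ci(-0.05, 0.75)
    p = two_sided_p(0.35, se)
    # Effect size the achieved precision can detect at 80% power
    achieved_mde = (STD_NORMAL.inv_cdf(0.975) + STD_NORMAL.inv_cdf(0.80)) * se
    print(f"planned n per arm: {planned_n}")
    print(f"implied SE: {se:.3f}, p-value: {p:.3f}")
    print(f"MDE at achieved precision: {achieved_mde:.2f}")
```

Under these assumptions the design calls for roughly 1,090 users per arm, and the CI half-width of 0.40 implies a standard error near 0.20. The observed +0.35 is therefore smaller than the effect the test was sized to detect, which is the arithmetic behind part (a).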
(a) Why can this result be non-significant despite having a large sample size (N)?
(b) Should you compute post‑hoc power? If not, what should you report instead, and why?
(c) Suppose your loss function weights a false positive (FP) at twice the cost of a false negative (FN). What is your decision now, and what are your next steps (extend sample, reduce variance, or stop)?
(d) How would you update the experimental design (MDE, variance reduction plan, duration/traffic) for the next iteration?
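For part (d), one way to explore the trade-off between MDE, variance reduction, and traffic is to recompute the required sample size under different assumptions. The sketch below is illustrative only: it assumes a covariate-adjustment scheme (such as CUPED-style regression on a pre-experiment metric) that scales the metric variance by (1 − ρ²), where `rho` is a hypothetical correlation between the covariate and the outcome. Neither the technique nor any value of `rho` is given in the question.

```python
import math
from statistics import NormalDist


def n_per_arm_adjusted(mde, sigma, rho=0.0, alpha=0.05, power=0.80):
    """Per-arm sample size after covariate adjustment.

    `rho` is an assumed correlation between a pre-experiment covariate and
    the outcome; adjustment scales the variance by (1 - rho**2).
    Uses a normal approximation with equal allocation across arms.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_beta = std_normal.inv_cdf(power)
    adjusted_var = sigma ** 2 * (1 - rho ** 2)
    return math.ceil(2 * adjusted_var * (z_alpha + z_beta) ** 2 / mde ** 2)


if __name__ == "__main__":
    # Compare the original MDE with a smaller one near the observed effect,
    # across a few hypothetical covariate correlations.
    for mde in (0.6, 0.35):
        for rho in (0.0, 0.5, 0.7):
            n = n_per_arm_adjusted(mde=mde, sigma=5.0, rho=rho)
            print(f"MDE={mde:<4} rho={rho:<3} -> n per arm = {n}")
```

Because the required sample size scales with 1/MDE², lowering the target from 0.6 to 0.35 roughly triples the traffic needed, while a covariate with ρ = 0.5 would recover about a quarter of that. Whether such a covariate exists for this metric is something to verify on historical data before committing to the plan.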