Interpreting a Two-Sided A/B Test p-value of 0.03
You ran a two-sided A/B test on conversion rate. Results:
- Baseline (control) conversion: 5.0%
- Observed difference (treatment − control): +0.4 percentage points (pp)
- Pooled standard error (SE) of the difference: 0.18 pp
- Two-sided p-value: 0.03
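As a sanity check, the test statistic and two-sided p-value can be recomputed from the stated difference and standard error (a minimal sketch assuming a large-sample normal approximation; variable names are illustrative):

```python
import math

diff_pp = 0.4  # observed treatment - control difference, in pp
se_pp = 0.18   # pooled standard error of the difference, in pp

# Test statistic under H0 (true difference = 0)
z = diff_pp / se_pp

def phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Two-sided p-value: probability of a |z| at least this extreme under H0
p = 2.0 * (1.0 - phi(abs(z)))
print(f"z = {z:.2f}, two-sided p = {p:.3f}")  # z is about 2.22, p just under 0.03
```

This confirms the reported numbers are internally consistent: a z of roughly 2.22 gives a two-sided p of about 0.03.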
Explain to a product manager what a p-value of 0.03 actually means and, critically, what it does NOT mean. Use the concrete numbers above and cover:
- Null hypothesis, test statistic, sampling distribution, and the “extremeness under the null” interpretation.
- Why p ≠ Pr(H0 | data) and why p is not the false positive rate for this single result.
- The relationship between p-values, confidence intervals, alpha, and statistical power (and why a small p does not imply a large effect).
- How optional stopping/peeking and multiple looks inflate Type I error, and how to communicate this risk.
- A non-technical 2–3 sentence explanation you would actually say to the PM, and a contrasting Bayesian framing for the same result.
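The confidence-interval bullet can be made concrete with the numbers given: a 95% CI for the difference is estimate ± 1.96 × SE, and p < 0.05 corresponds exactly to that interval excluding zero (a sketch under the same normal approximation; 1.96 is the standard two-sided 95% critical value):

```python
diff_pp = 0.4
se_pp = 0.18
z_crit = 1.96  # two-sided 95% critical value for the standard normal

lo = diff_pp - z_crit * se_pp
hi = diff_pp + z_crit * se_pp
print(f"95% CI for the lift: [{lo:.3f}, {hi:.3f}] pp")  # [0.047, 0.753]

# p = 0.03 < 0.05 is equivalent to this 95% CI excluding 0 -- but the
# interval is wide: effects from ~0.05 pp to ~0.75 pp are all compatible
# with the data, which is why a small p does not imply a large effect.
```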
Note: “pp” = percentage points (e.g., 5.0% to 5.4% is +0.4 pp).
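For the peeking bullet, a small Monte Carlo illustrates the inflation: simulate data with no true effect, test at several interim looks, and stop at the first p < 0.05. The look schedule, sample sizes, and seed below are illustrative assumptions, not taken from the scenario above:

```python
import math
import random

def phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def peeked_test(n_per_look, n_looks, rng):
    """Simulate H0 true (mean 0); return True if ANY look 'rejects' at 0.05."""
    total, n = 0.0, 0
    for _ in range(n_looks):
        for _ in range(n_per_look):
            total += rng.gauss(0.0, 1.0)
            n += 1
        z = total / math.sqrt(n)  # z-statistic for the running mean
        p = 2.0 * (1.0 - phi(abs(z)))
        if p < 0.05:
            return True  # stop early and declare a "winner"
    return False

rng = random.Random(0)
trials = 5000
false_pos = sum(peeked_test(100, 5, rng) for _ in range(trials))
print(f"Type I error with 5 peeks: {false_pos / trials:.3f}")  # well above the nominal 0.05
```

With five looks the realized false positive rate lands far above the nominal 5%, which is the core message to communicate: each peek is another chance to reject a true null.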
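For the p ≠ Pr(H0 | data) and Bayesian bullets, the distinction can be shown with simple arithmetic: the chance a "significant" result is a false positive depends on the prior probability of a real effect and on power, not on p alone. The 10% prior and 80% power below are illustrative assumptions:

```python
alpha = 0.05       # significance threshold
power = 0.80       # assumed probability of detecting a real effect
prior_real = 0.10  # assumed prior: 10% of tested ideas have a real effect

# Among many experiments, what fraction of alpha-level "wins" are false?
true_pos = power * prior_real
false_pos = alpha * (1.0 - prior_real)
false_positive_risk = false_pos / (true_pos + false_pos)
print(f"Share of significant results that are false: {false_positive_risk:.0%}")
```

Under these assumptions roughly a third of significant results are false positives, so even with p = 0.03 the probability that H0 is true given the data is nowhere near 3%.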