Statistics Interview Task (Onsite)
You are evaluating a product experiment and related analytics questions. Answer precisely, showing calculations and interpretation.
(a) Experiment after 5 interim looks
- Control: n_c = 12,000 users, x_c = 1,380 conversions
- Treatment: n_t = 12,200 users, x_t = 1,512 conversions
Tasks:
- Compute the two-sided p-value for equality of proportions and a 95% confidence interval (CI) for the difference in proportions p_t − p_c.
- Adjust your inference for sequential monitoring using an O’Brien–Fleming (OBF) alpha-spending approach. If you don’t have the exact spending schedule, explain whether the result would still be significant and why.
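One possible starting point for part (a), offered as a sketch rather than a model answer: a pooled two-proportion z-test, an unpooled (Wald) 95% CI, and the Lan-DeMets OBF-type spending function α(t) = 2 − 2Φ(z_{α/2}/√t). The function names `two_proportion_test` and `obf_alpha_spent` are illustrative, and the choice of pooled versus unpooled standard errors is an assumption the prompt does not dictate. The spending function only gives cumulative alpha at information fraction t; per-look critical values need numerical integration or published tables, which this sketch does not attempt.

```python
from math import erfc, sqrt
from statistics import NormalDist


def two_proportion_test(x_c, n_c, x_t, n_t, level=0.95):
    """Return (z, two-sided p-value, CI for p_t - p_c).

    Assumption: pooled SE for the test, unpooled (Wald) SE for the CI.
    """
    p_c, p_t = x_c / n_c, x_t / n_t
    diff = p_t - p_c

    # Pooled standard error under H0: p_t == p_c.
    p_pool = (x_c + x_t) / (n_c + n_t)
    se_pooled = sqrt(p_pool * (1 - p_pool) * (1 / n_c + 1 / n_t))
    z = diff / se_pooled
    p_value = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))

    # Unpooled standard error for the interval estimate.
    se_unpooled = sqrt(p_c * (1 - p_c) / n_c + p_t * (1 - p_t) / n_t)
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    ci = (diff - z_crit * se_unpooled, diff + z_crit * se_unpooled)
    return z, p_value, ci


def obf_alpha_spent(t, alpha=0.05):
    """Cumulative two-sided alpha spent at information fraction t in (0, 1],
    using the Lan-DeMets O'Brien-Fleming-type spending function."""
    z_half = NormalDist().inv_cdf(1 - alpha / 2)
    return erfc(z_half / sqrt(t) / sqrt(2))  # 2 * (1 - Phi(z_half / sqrt(t)))


z, p_value, ci = two_proportion_test(1380, 12000, 1512, 12200)
print(f"z = {z:.3f}, p = {p_value:.4f}, CI = ({ci[0]:.5f}, {ci[1]:.5f})")

# Assumption: five equally spaced looks, so t = 0.2, 0.4, ..., 1.0.
for k in range(1, 6):
    print(f"look {k}: cumulative alpha spent = {obf_alpha_spent(k / 5):.6f}")
```

Because OBF-type spending releases very little alpha at early looks, the unadjusted p-value should be compared against the boundary for the look at which the analysis is actually performed, not against 0.05.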
(b) Multiple testing
You track 5 metrics with sorted p-values {0.001, 0.012, 0.019, 0.070, 0.300}. Apply Benjamini–Hochberg (BH) at FDR q = 0.05. State which metrics are discoveries, and compare to Bonferroni at familywise α = 0.05.
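The two procedures in part (b) can be sketched as follows. This is a minimal illustration, assuming the standard BH step-up rule (reject the k smallest p-values, where k is the largest rank with p_(k) ≤ (k/m)·q) and plain Bonferroni (reject when p ≤ α/m). The function names are illustrative.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans (True = discovery) in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k whose p-value sits under the BH line (k/m) * q.
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * q:
            k_max = rank

    # Step-up rule: everything ranked at or below k_max is rejected.
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k_max:
            reject[i] = True
    return reject


def bonferroni(p_values, alpha=0.05):
    """Reject when p <= alpha / m (controls the familywise error rate)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


p_values = [0.001, 0.012, 0.019, 0.070, 0.300]
print("BH discoveries:       ", benjamini_hochberg(p_values, q=0.05))
print("Bonferroni rejections:", bonferroni(p_values, alpha=0.05))
```

The comparison worth discussing is that BH controls the expected proportion of false discoveries, while Bonferroni controls the probability of any false rejection, so BH can only reject as many or more hypotheses at the same nominal level.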
(c) Bot detection (Bayes)
Prevalence of bots is 5%. A model flags a user as a bot with false positive rate (FPR) = 2% and false negative rate (FNR) = 10%. If a user is flagged, what is the posterior probability they are truly a bot?
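Part (c) is a direct application of Bayes' rule, sketched below. The only derived quantity is sensitivity = 1 − FNR; the function name `posterior_bot_given_flag` is illustrative.

```python
def posterior_bot_given_flag(prevalence, fpr, fnr):
    """P(bot | flagged) via Bayes' rule."""
    sensitivity = 1 - fnr                      # P(flag | bot)
    true_flags = prevalence * sensitivity      # P(flag and bot)
    false_flags = (1 - prevalence) * fpr       # P(flag and not bot)
    return true_flags / (true_flags + false_flags)


posterior = posterior_bot_given_flag(prevalence=0.05, fpr=0.02, fnr=0.10)
print(f"P(bot | flagged) = {posterior:.4f}")
```

A useful interpretation point is how strongly the low base rate pulls the posterior below the model's sensitivity, even with a small false positive rate.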
(d) Robust AOV analysis
Average Order Value (AOV) is right-skewed (mean = 32, sd = 15, heavy tail). Propose a robust approach for outlier handling and inference:
- Compare IQR rule, z-scores, and MAD/Huber M-estimators.
- Recommend a log-transform with a small offset δ, i.e. log(AOV + δ), and explain how to report effects back on the original scale using a smearing estimator.
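The ideas in part (d) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the cutoffs (1.5 × IQR, |z| > 3, modified z > 3.5), the Huber tuning constant k = 1.345 with scale fixed at 1.4826 × MAD, the offset `delta`, and the toy order values, which are made up and not from any real data set. The smearing function follows Duan's approach for a model with only a group indicator: back-transform the group mean of log(AOV + δ) and multiply by the average of the exponentiated pooled residuals.

```python
import math
import statistics as st


def outlier_flags(values, z_cut=3.0, mad_cut=3.5):
    """Flag outliers under three rules; returns dict of boolean lists."""
    q1, _, q3 = st.quantiles(values, n=4)
    iqr = q3 - q1
    mean, sd = st.fmean(values), st.stdev(values)
    med = st.median(values)
    mad = st.median([abs(v - med) for v in values])  # assumed nonzero here
    return {
        "iqr": [v < q1 - 1.5 * iqr or v > q3 + 1.5 * iqr for v in values],
        "zscore": [abs(v - mean) / sd > z_cut for v in values],
        "mad": [0.6745 * abs(v - med) / mad > mad_cut for v in values],
    }


def huber_location(values, k=1.345, tol=1e-8, max_iter=100):
    """Huber M-estimate of location via iteratively reweighted means."""
    med = st.median(values)
    scale = 1.4826 * st.median([abs(v - med) for v in values])
    mu = med
    for _ in range(max_iter):
        # Full weight inside k * scale, down-weighted proportionally outside.
        weights = [
            1.0 if abs(v - mu) <= k * scale else k * scale / abs(v - mu)
            for v in values
        ]
        new_mu = sum(w * v for w, v in zip(weights, values)) / sum(weights)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu


def smeared_means(groups, delta=1.0):
    """Back-transform group means of log(AOV + delta) with Duan's smearing.

    groups: dict mapping group name -> list of order values.
    Assumes log-scale residuals share one distribution across groups.
    """
    logs = {g: [math.log(v + delta) for v in vals] for g, vals in groups.items()}
    mu = {g: st.fmean(xs) for g, xs in logs.items()}
    resid = [x - mu[g] for g, xs in logs.items() for x in xs]
    smear = st.fmean([math.exp(e) for e in resid])  # Duan smearing factor
    return {g: math.exp(mu[g]) * smear - delta for g in groups}


# Made-up toy data: one extreme order in an otherwise tight sample.
orders = [20, 22, 25, 27, 30, 31, 33, 35, 38, 400]
flags = outlier_flags(orders)
print({rule: sum(f) for rule, f in flags.items()})
print("mean:", st.fmean(orders), "huber:", huber_location(orders))
print(smeared_means({"control": orders[:5], "treatment": orders[5:9]}))
```

The toy sample shows the masking problem with z-scores: a single extreme value inflates the mean and standard deviation enough that it may not exceed the cutoff, while the median-based rules are unaffected. On the original scale, the treatment effect is then reported as the difference (or ratio) of the smeared group means rather than as exp of the log-scale difference alone.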