This question evaluates a data scientist's competency in experimental design, variance reduction techniques, and regression-based covariate adjustment (CUPED/ANCOVA) within randomized A/B tests, including understanding of statistical power and diagnostic checks.

You are running a randomized A/B test with outcome Y. You also have a pre-period covariate X (measured before any treatment exposure) that explains 36% of the variance in Y (i.e., R² = 0.36) when regressing Y on X.
Answer the following:
(a) Derive how CUPED (a.k.a. regression/ANCOVA adjustment using a pre-period covariate) changes the variance of the treatment effect estimator Δ, and quantify the expected sample-size savings and MDE change given R² = 0.36.
(b) List assumptions and failure modes that would invalidate or undermine CUPED (e.g., post‑treatment leakage, mis‑timed covariates), and diagnostics you would run to detect problems.
(c) For this setting, would you prefer stratified randomization or covariate adjustment, and why?
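
As a starting point for part (a), the arithmetic can be sketched in a few lines of Python. This is a minimal illustration, not a full answer. It assumes the standard CUPED adjustment Y_adj = Y − θ(X − mean(X)) with a pooled θ = Cov(X, Y)/Var(X), under which Var(Y_adj) = (1 − R²)·Var(Y). The function names `cuped_summary` and `simulate_variance_ratio` are hypothetical. The simulated data are synthetic Gaussian draws chosen only so that the population R² equals 0.36.

```python
import numpy as np


def cuped_summary(r_squared: float) -> dict:
    """Closed-form effect of CUPED given the R^2 between Y and pre-period X."""
    variance_ratio = 1.0 - r_squared  # Var(adjusted) / Var(unadjusted)
    return {
        "variance_ratio": variance_ratio,
        # Required n scales with variance at fixed power and fixed MDE.
        "sample_size_savings": 1.0 - variance_ratio,
        # MDE scales with the standard error at fixed n, i.e. sqrt(variance).
        "mde_ratio": variance_ratio ** 0.5,
    }


def simulate_variance_ratio(r_squared: float = 0.36,
                            n: int = 200_000,
                            seed: int = 0) -> float:
    """Empirical check on synthetic data (assumed Gaussian, no treatment effect)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    noise = rng.standard_normal(n)
    # Construct Y so that corr(X, Y)^2 equals r_squared in the population.
    y = np.sqrt(r_squared) * x + np.sqrt(1.0 - r_squared) * noise
    # Pooled CUPED coefficient: theta = Cov(X, Y) / Var(X).
    theta = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
    y_adj = y - theta * (x - x.mean())
    return float(np.var(y_adj, ddof=1) / np.var(y, ddof=1))


summary = cuped_summary(0.36)
simulated = simulate_variance_ratio()
print(summary)
print(f"simulated variance ratio: {simulated:.3f}")
```

Under these assumptions, R² = 0.36 leaves 64% of the original variance, so roughly 36% fewer users are needed for the same power and MDE, or the MDE shrinks to about 0.8 of its unadjusted value (a 20% reduction) at the same sample size. The simulated ratio should land near 0.64 but will differ slightly from run to run with the seed and sample size.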