A/B Test Design and Power Analysis
Asked of: Data Scientist

-
What it is
A/B test design is how you plan an online experiment: who to randomize, what to measure, and how to analyze outcomes. Power analysis is the math that tells you how much traffic or time you need to reliably detect the minimum effect you care about.
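As a rough illustration of that math, the standard normal-approximation sample size for a two-arm, equal-allocation comparison of means is shown below. The symbols are introduced here for illustration: \sigma^2 is the per-user variance of the metric, \delta is the minimum detectable effect (MDE) in absolute units, and z_q is the q-th quantile of the standard normal distribution.

```latex
n_{\text{per arm}} \approx \frac{2\,\sigma^{2}\,\left(z_{1-\alpha/2} + z_{1-\beta}\right)^{2}}{\delta^{2}}
```

Because \delta is squared in the denominator, halving the MDE roughly quadruples the required sample, which is why the choice of MDE dominates test duration.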
-
Why interviewers ask about it
At companies like Meta, shipping fast while avoiding regressions depends on running sensitive, trustworthy experiments. Strong candidates can translate product goals into a valid design, pick an MDE that matches business value, and use power analysis and variance reduction to shorten test duration without inflating false positives.
-
Core ideas to know
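- Power = 1 − β, the probability of detecting a true effect of at least the MDE; choose α, target power, the baseline rate (or metric variance), and the MDE, then compute the required sample size.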
- Randomize and analyze at the same unit (user, session, cluster); check for interference and spillovers.
- Predefine primary metric(s) and guardrails; control multiple testing if you monitor many metrics.
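- Run sample ratio mismatch (SRM) checks; a statistically significant deviation from the planned allocation, even a small one, signals logging or assignment bugs.
- Variance reduction (e.g., CUPED/ANCOVA) lowers required N by using pre-experiment covariates; with a single covariate, variance shrinks by roughly a factor of 1 − ρ², where ρ is the covariate's correlation with the metric.
- Do not “peek” at fixed-sample tests: repeatedly checking significance and stopping early inflates the false-positive rate. If you must monitor early, use sequential/alpha-spending designs.
- For triggered features, choose intention-to-treat (ITT) vs. “triggered” analysis carefully; misalignment can bias estimates or waste power.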
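Two of the checks above, the SRM test and a CUPED-style adjustment, can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not a production implementation: the function names (`srm_p_value`, `cuped_adjust`) and the synthetic data are invented for this example, and a real analysis would estimate the CUPED coefficient on data pooled across all arms.

```python
import math
import random


def srm_p_value(n_control, n_treatment, expected_treatment_share=0.5):
    """Chi-square goodness-of-fit test (1 degree of freedom) for SRM."""
    total = n_control + n_treatment
    exp_t = total * expected_treatment_share
    exp_c = total - exp_t
    chi2 = (n_control - exp_c) ** 2 / exp_c + (n_treatment - exp_t) ** 2 / exp_t
    # With 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def mean(xs):
    return sum(xs) / len(xs)


def sample_variance(xs):
    m = mean(xs)
    return sum((v - m) ** 2 for v in xs) / (len(xs) - 1)


def cuped_adjust(y, x):
    """Return y - theta * (x - mean(x)), with theta = cov(x, y) / var(x).

    x is a pre-experiment covariate (e.g., the same metric measured
    before the test started); y is the in-experiment metric.
    """
    mx, my = mean(x), mean(y)
    cov_xy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    theta = cov_xy / var_x
    return [b - theta * (a - mx) for a, b in zip(x, y)]


# Hypothetical counts: a planned 50/50 split that came out 50,000 vs 51,000.
p_srm = srm_p_value(50_000, 51_000)
print(f"SRM p-value: {p_srm:.4f}")

# Synthetic data (an assumption for illustration): the in-experiment metric
# is correlated with its pre-experiment value.
rng = random.Random(0)
pre = [rng.gauss(10, 3) for _ in range(5000)]
post = [0.8 * p + rng.gauss(0, 2) for p in pre]

adjusted = cuped_adjust(post, pre)
variance_ratio = sample_variance(adjusted) / sample_variance(post)
print(f"Variance after CUPED / before: {variance_ratio:.2f}")
```

The adjustment leaves the mean of the metric unchanged and only removes variance explained by the covariate, so the treatment-effect estimate stays comparable while the required sample size falls in proportion to the variance ratio.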
-
A common pitfall
Candidates often reverse-engineer sample size from whatever traffic they have, then claim the test is “powered.” Interviewers expect you to set the MDE from business impact (e.g., +0.2 pp conversion = +$X/day), then compute the required N and timeline, and to say no if it’s infeasible. Another trap is ignoring interference or choosing the wrong randomization unit (e.g., pageview randomization for a social graph effect), which invalidates the test. Finally, many “wins” disappear because of peeking or multiple comparisons without correction; describe the safeguards you’d use in production.
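The "MDE first, then N and timeline" workflow can be sketched as follows. This is a hedged example using the usual normal approximation for a two-sided two-proportion test with equal allocation; the 5% baseline, the 40,000 eligible users per day, and the function names are hypothetical inputs chosen for illustration, and the duration estimate assumes each day brings new, unique users.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(baseline, mde_abs, alpha=0.05, power=0.80):
    """Users per arm for a two-sided two-proportion z-test (unpooled variance).

    baseline: control conversion rate, e.g. 0.05
    mde_abs:  minimum detectable effect in absolute terms, e.g. 0.002 = 0.2 pp
    """
    p1 = baseline
    p2 = baseline + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance_sum / mde_abs ** 2)


def days_needed(n_per_arm, daily_eligible_users, arms=2):
    """Days to fill all arms, assuming every eligible user is enrolled once."""
    return math.ceil(arms * n_per_arm / daily_eligible_users)


# Hypothetical scenario: 5% baseline conversion, MDE of +0.2 pp,
# 40,000 newly eligible users per day split evenly across two arms.
n = sample_size_per_arm(baseline=0.05, mde_abs=0.002)
days = days_needed(n, daily_eligible_users=40_000)
print(f"Required users per arm: {n:,}")
print(f"Estimated duration: {days} days")
```

Under these assumed inputs the requirement comes out on the order of 190,000 users per arm, or about 10 days of traffic. If the resulting duration is longer than the business can wait, the honest options are a larger MDE, variance reduction, or not running the test, rather than declaring the available traffic sufficient.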
-
Further reading
- Trustworthy Online Controlled Experiments (Kohavi, Tang, Xu) — the industry handbook on design, metrics, pitfalls, and platform practices. Cambridge University Press. (cambridge.org)
- Microsoft Research: Deep Dive Into Variance Reduction (CUPED) — clear explanation of using pre-experiment data to increase sensitivity and shorten tests. (microsoft.com)
- Amazon Science: Leveraging covariate adjustments at scale in online A/B testing — modern, large-scale perspective on regression adjustment to improve power. (assets.amazon.science)