Ratio Metric Variance
Asked of: Data Scientist

What's being tested
Ability to reason about uncertainty when the metric is a ratio (e.g., CTR = clicks / impressions): recognize the sources of variance, choose an appropriate estimator, and justify an inference method (delta method, bootstrap, GLM, or Fieller).
Core knowledge
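- Delta method: for random variables X, Y with means μX, μY, Var(X/Y) ≈ (1/μY^2)Var(X) − (2μX/μY^3)Cov(X,Y) + (μX^2/μY^4)Var(Y). For a ratio of sample means over n independent units, X and Y are the sample means, so each per-unit variance and covariance is divided by n.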
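- Binomial CTR (per-impression): if the N impressions are independent Bernoulli trials with a common click probability p, Var(p̂)=p(1−p)/N; not valid when clicks are correlated within a user or the number of impressions per user is itself random.
- Per-user aggregation: sum clicks and impressions to the randomization unit (usually the user) so that observations are independent. Ratio of means (total clicks / total impressions) weights every impression equally; mean of ratios (average of per-user CTRs) weights every user equally. They estimate different quantities, so pick the one that matches the business metric.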
- Bootstrap (user-level) provides robust SEs when assumptions fail; resample users, not impressions, to preserve dependency.
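- Fieller’s theorem gives exact CIs for a ratio of means when the numerator and denominator estimates are jointly normal; useful when the denominator's uncertainty isn't negligible. The interval can be unbounded if the denominator is not clearly separated from zero.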
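- Modeling alternative: a binomial GLM with logit link (clicks as successes out of impressions as trials), or a Poisson GLM with log link and offset log(impressions), gives SEs for the treatment effect directly. Use cluster-robust (user-level) SEs, and note the coefficient is a log odds ratio or log rate ratio, not a difference in CTR.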
- Watch denominators near zero: the delta method's linear approximation breaks down when the denominator mean is within a few standard errors of zero; work on the log scale or use bootstrap/Fieller instead.
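The delta-method bullet can be sketched in code as follows. This is a minimal illustration, assuming per-user click and impression counts are already aggregated and that users are independent; the function name `delta_ratio_se` and the toy arrays are made up for the example.

```python
import numpy as np

def delta_ratio_se(clicks, impressions):
    """Delta-method SE for sum(clicks) / sum(impressions).

    Assumes each array entry is one independent user (hypothetical helper).
    """
    x = np.asarray(clicks, dtype=float)
    y = np.asarray(impressions, dtype=float)
    n = x.size
    mx, my = x.mean(), y.mean()
    cov = np.cov(x, y, ddof=1)            # 2x2 per-user sample covariance matrix
    var_x, var_y, cov_xy = cov[0, 0], cov[1, 1], cov[0, 1]
    ratio = mx / my                       # ratio of means = total clicks / total impressions
    # Three-term delta formula, divided by n because X and Y are sample means.
    var_ratio = (var_x / my**2
                 - 2 * mx * cov_xy / my**3
                 + mx**2 * var_y / my**4) / n
    # Guard against tiny negative values from floating-point cancellation.
    return ratio, np.sqrt(max(var_ratio, 0.0))

# Toy per-user data (illustrative only).
clicks = np.array([1, 0, 3, 2, 5])
impressions = np.array([10, 5, 20, 30, 25])
ratio, se = delta_ratio_se(clicks, impressions)
print(f"CTR = {ratio:.4f}, delta-method SE = {se:.4f}")
```

The same quantity can be obtained by taking the sample variance of the linearized residuals (x_i − R·y_i)/ȳ and dividing by n, which is often easier to compute at scale.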
Worked example — "Estimate variance of CTR (clicks/impressions) in an A/B test"
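First decide the aggregation unit: use per-user clicks (X) and impressions (Y), so that within-user dependence between impressions stays inside a single observation. Within each arm, compute the sample means μ̂X, μ̂Y and the sample variances and covariance across the n users. Apply the delta-method formula to the ratio of means: Var(CTR̂) ≈ (1/n)[(1/μ̂Y^2)Var(X) − (2μ̂X/μ̂Y^3)Cov(X,Y) + (μ̂X^2/μ̂Y^4)Var(Y)]. The arms are independent, so the variance of the CTR difference is the sum of the two arm variances. As a robustness check, run a user-level bootstrap to confirm the SE and CI; if bootstrap and delta disagree, or the denominator is small or heterogeneous, switch to a GLM (binomial with logit link and impressions as the number of trials, with user-clustered SEs) or report a Fieller CI.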
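The delta-versus-bootstrap check described above might look like the sketch below for a single arm. The data are simulated (the distributions and parameters are assumptions chosen only to produce heterogeneous users), and `delta_se` and `bootstrap_ratio` are hypothetical helper names.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated per-user data for one arm (assumed distributions, illustration only).
n_users = 2000
impressions = rng.poisson(rng.gamma(shape=2.0, scale=10.0, size=n_users)) + 1
clicks = rng.binomial(impressions, rng.beta(2.0, 38.0, size=n_users))

def delta_se(x, y):
    """Delta-method SE of sum(x)/sum(y) with users as independent units."""
    n = x.size
    mx, my = x.mean(), y.mean()
    c = np.cov(x, y, ddof=1)
    var = (c[0, 0] / my**2 - 2 * mx * c[0, 1] / my**3 + mx**2 * c[1, 1] / my**4) / n
    return np.sqrt(max(var, 0.0))

def bootstrap_ratio(x, y, n_boot=2000):
    """Resample users (not impressions) with replacement; return bootstrap CTRs."""
    idx = rng.integers(0, x.size, size=(n_boot, x.size))
    return x[idx].sum(axis=1) / y[idx].sum(axis=1)

ctr = clicks.sum() / impressions.sum()
se_delta = delta_se(clicks, impressions)
boot = bootstrap_ratio(clicks, impressions)
se_boot = boot.std(ddof=1)
ci_boot = np.percentile(boot, [2.5, 97.5])   # percentile bootstrap CI

print(f"CTR={ctr:.4f}  delta SE={se_delta:.5f}  bootstrap SE={se_boot:.5f}")
print(f"bootstrap 95% CI: [{ci_boot[0]:.4f}, {ci_boot[1]:.4f}]")
```

With a few thousand users the two SEs should be close; a large gap is a signal to inspect heavy-tailed users or a small denominator before trusting either number.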
A common pitfall
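Using p(1−p)/N with N = total impressions while users have varying numbers of impressions or clicks are clustered by user. That underestimates the SE because impressions from the same user are correlated, so the effective sample size is closer to the number of users than to the number of impressions. Similarly, treating numerator and denominator as independent drops the covariance term and misstates the variance, and averaging per-user CTRs (mean of ratios) when the target is overall CTR (ratio of means) estimates a different, user-weighted quantity.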
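A small simulation can make the pitfall concrete. The sketch below assumes users have their own click rates (drawn from a Beta distribution picked only for illustration) and compares the naive binomial SE with a user-level SE computed from linearized residuals, which is algebraically the same as the delta-method formula. A two-user toy case then shows how far mean of ratios can sit from ratio of means.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated clustered data: each user has their own click probability (assumption).
n_users = 5000
impressions = rng.poisson(20.0, size=n_users) + 1
user_ctr = rng.beta(2.0, 38.0, size=n_users)
clicks = rng.binomial(impressions, user_ctr)

ctr = clicks.sum() / impressions.sum()

# Naive SE: treats every impression as an independent Bernoulli trial.
naive_se = np.sqrt(ctr * (1 - ctr) / impressions.sum())

# User-level SE: variance of linearized residuals across users, divided by n.
resid = (clicks - ctr * impressions) / impressions.mean()
user_se = np.sqrt(resid.var(ddof=1) / n_users)

print(f"naive SE = {naive_se:.5f}, user-level SE = {user_se:.5f}")

# Toy case: one light user with a high CTR, one heavy user with a low CTR.
toy_clicks = np.array([1.0, 10.0])
toy_impr = np.array([2.0, 1000.0])
ratio_of_means = toy_clicks.sum() / toy_impr.sum()    # 11 / 1002
mean_of_ratios = (toy_clicks / toy_impr).mean()       # (0.5 + 0.01) / 2
print(f"ratio of means = {ratio_of_means:.4f}, mean of ratios = {mean_of_ratios:.4f}")
```

Under these assumptions the user-level SE comes out larger than the naive one, because between-user variation in click rate inflates the variance beyond the binomial value.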
Further reading
- Efron & Tibshirani, "An Introduction to the Bootstrap" (1994): bootstrap standard errors and confidence intervals, the basis for user-level resampling.
- Casella & Berger, "Statistical Inference": delta method and Fieller’s theorem discussion.