Ad-Ranking A/B Test: Interpreting Heterogeneous CTR Lifts
Context
You ran a standard A/B experiment for a new ad-ranking algorithm. The primary metric is CTR (clicks ÷ impressions). The experiment shows:
- Overall lift: +5% relative CTR
- Specific segment (Indian males, age 18–24): +100% relative CTR
Assume randomization at the user level, with typical ad auction dynamics and repeated exposures per user.
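As a quick sanity check on the arithmetic, the two numbers are not contradictory: if impression shares are the same in both arms, the overall relative lift is the average of the segment lifts weighted by each segment's share of control clicks. The sketch below illustrates this with made-up numbers. The 2% impression share, the 1% baseline CTR, and the +3.06% lift for everyone else are assumptions chosen for illustration, not values from the experiment.

```python
def overall_relative_lift(segments):
    """Combine per-segment lifts into an overall relative CTR lift.

    segments: list of (impression_share, control_ctr, relative_lift).
    Assumes impression shares are identical in control and treatment.
    """
    control = sum(share * ctr for share, ctr, _ in segments)
    treatment = sum(share * ctr * (1 + lift) for share, ctr, lift in segments)
    return treatment / control - 1


# Hypothetical inputs: a small segment that doubles its CTR,
# and a large remainder with a modest lift.
segments = [
    (0.02, 0.010, 1.00),    # target segment: +100%
    (0.98, 0.010, 0.0306),  # everyone else: about +3%
]
lift = overall_relative_lift(segments)
print(f"overall relative lift: {lift:.2%}")
```

Under these assumed shares the overall lift comes out at roughly +5%, which shows that a small segment can double its CTR while moving the aggregate only slightly.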
Questions
- Hypotheses: What could explain an overall +5% lift while a specific demographic shows +100%?
- Statistical validity: How would you validate that the segment lift is statistically significant and not due to random noise or confounding?
- Pre-rollout diligence: What additional metrics, slices, and checks would you examine before a global rollout?
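For the statistical validity question, one possible approach (a sketch, not the only valid method) is a user-level bootstrap. Because randomization is at the user level and each user sees many impressions, impressions are not independent, so resampling whole users keeps the interval from being too narrow. The function names, the simulated data, and the click probabilities below are all hypothetical and stand in for real per-user logs from the segment.

```python
import random


def ctr(users):
    """Pooled CTR over a list of (clicks, impressions) per user."""
    clicks = sum(c for c, _ in users)
    impressions = sum(i for _, i in users)
    return clicks / impressions


def bootstrap_lift_ci(control, treatment, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for relative CTR lift, resampling users."""
    rng = random.Random(seed)
    lifts = []
    for _ in range(n_boot):
        c = rng.choices(control, k=len(control))      # resample users with replacement
        t = rng.choices(treatment, k=len(treatment))
        lifts.append(ctr(t) / ctr(c) - 1)
    lifts.sort()
    lower = lifts[int((alpha / 2) * n_boot)]
    upper = lifts[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def simulate_users(n_users, click_prob, rng):
    """Hypothetical data: each user gets 1 to 20 impressions."""
    users = []
    for _ in range(n_users):
        impressions = rng.randint(1, 20)
        clicks = sum(rng.random() < click_prob for _ in range(impressions))
        users.append((clicks, impressions))
    return users


rng = random.Random(42)
control_users = simulate_users(1000, 0.02, rng)    # assumed 2% baseline CTR
treatment_users = simulate_users(1000, 0.04, rng)  # assumed doubled CTR
lo, hi = bootstrap_lift_ci(control_users, treatment_users)
print(f"95% CI for relative lift: [{lo:.2f}, {hi:.2f}]")
```

An interval that excludes zero supports a real segment effect, but if this segment was picked after scanning many slices, the significance threshold should be adjusted for multiple comparisons before trusting it.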