Design and analyze an ads A/B test this week
Company: Capital One
Role: Data Scientist
Category: Analytics & Experimentation
Difficulty: hard
Interview Round: Take-home Project
You manage an online ads platform that is testing a new ad scheduling policy (B) against the status quo (A). Today is 2025-09-01. The planned readout covers the last 7 days ending today: 2025-08-26 to 2025-09-01 inclusive.
Constraints and context:
- Users can be exposed across multiple platforms (web, iOS, Android) and time slots; some users see multiple impressions per day.
- Primary KPI candidate: watch_time_per_impression (seconds). Guardrails: CTR, skip_rate, daily_active_users, and complaint_rate.
- Known seasonality by day-of-week and time-of-day; some creatives are long-form videos.
- Offline conversions (site visits) arrive with a 24–48h delay.
Write a precise test plan and analysis procedure that addresses the following, with justifications:
1) Randomization unit and exposure control: user-level, device-level, or impression-level? How will you cap exposures and prevent cross-contamination across platforms and time slots? State your hashing/bucketing key.
2) Stratification and variance reduction: specify strata (e.g., platform × day-of-week × time-slot) and whether you will use CUPED with a pre-period (give exact pre-period dates). Define the CUPED covariate and show the adjusted estimator formula.
3) Metric definitions: formalize the primary KPI and guardrails (numerators/denominators), and state whether you will analyze on an intent-to-treat basis. Handle zeros and outliers (e.g., winsorization rules) explicitly.
4) Tail choice and test: state whether your hypothesis warrants one-tailed or two-tailed testing for the primary KPI and why. Choose an appropriate test (e.g., Welch’s t-test, stratified difference-in-means, or a permutation test) and show the test statistic you will use.
5) Sample size and stopping: compute or outline the calculation for required sample size per variant for a +5% relative lift in mean watch_time_per_impression given a baseline mean of 42s and SD of 55s, alpha=0.05 (two-sided) and power=0.8. Describe any sequential monitoring rule (e.g., always-valid methods) if you intend interim looks.
6) Readout: define the exact 95% CI you will report, how you will pool across strata, and how you will adjust for multiple guardrails (e.g., Holm). Include at least two diagnostic checks for randomization balance and two for seasonality/novelty effects.
7) Sensitivity: describe how you would re-run the analysis if offline conversions are incomplete for the last 48 hours, and how that affects the readout window.
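For the CUPED part of item 2, one common form of the adjusted estimator is Y_adj = Y - theta * (X - mean(X)), with theta = Cov(X, Y) / Var(X) estimated on data pooled across both arms. The sketch below is illustrative only. It assumes X is each user's mean watch time per impression over a pre-period such as 2025-08-19 to 2025-08-25 (a hypothetical choice that mirrors the test window's days of the week), and the numbers are made up to show the mechanics.

```python
from statistics import fmean, variance


def cuped_adjust(y, x):
    """Return (theta, y_adj) with y_adj = y - theta * (x - mean(x)).

    y: in-experiment metric per user (e.g. mean watch time per impression).
    x: the same metric per user measured in the pre-period (assumed covariate).
    """
    x_bar, y_bar = fmean(x), fmean(y)
    # Sample covariance of x and y (n - 1 denominator, matching variance()).
    cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (len(x) - 1)
    theta = cov_xy / variance(x)
    y_adj = [yi - theta * (xi - x_bar) for xi, yi in zip(x, y)]
    return theta, y_adj


# Made-up per-user values, for illustration only.
pre = [30.0, 45.0, 50.0, 38.0, 60.0, 41.0]
post = [33.0, 47.0, 55.0, 36.0, 63.0, 44.0]

theta, post_adj = cuped_adjust(post, pre)
print(f"theta={theta:.3f}")
print(f"variance before={variance(post):.2f}, after={variance(post_adj):.2f}")
```

Because the covariate is centered, the adjustment leaves the overall mean unchanged and only shrinks the variance. The treatment effect is then the difference in adjusted means between B and A, computed within each stratum and pooled.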
Quick Answer: This question evaluates experimental design and applied statistical analysis skills for A/B testing, covering randomization, exposure control, stratification, variance reduction methods, metric formalization, hypothesis testing, sample sizing, sequential monitoring, diagnostics, and handling delayed offline conversions.
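The sample-size calculation in item 5 can be outlined with the standard normal-approximation formula for a two-sample comparison of means, n per variant = 2 * (z_{1-alpha/2} + z_{power})^2 * SD^2 / delta^2, where delta = 0.05 * 42 = 2.1 seconds. The sketch below is a minimal version that assumes equal variances in both arms and independent observations. The function name is hypothetical, and only the Python standard library is used.

```python
from math import ceil
from statistics import NormalDist


def n_per_variant(baseline_mean, sd, rel_lift, alpha=0.05, power=0.80):
    """Per-arm sample size for a two-sided, two-sample z-test on means.

    Assumes equal SD in both arms and independent observations.
    """
    delta = baseline_mean * rel_lift               # absolute effect in seconds
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for power = 0.80
    return ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2)


# Inputs from the prompt: baseline 42s, SD 55s, +5% relative lift.
n = n_per_variant(42.0, 55.0, 0.05)
print(n)
```

With the prompt's inputs this gives roughly 10,800 independent observations per variant. If randomization is at the user level and users contribute several correlated impressions, that figure would need to be inflated by a design effect, or the calculation redone on per-user means.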