Design a rigorous A/B test and causal analysis
Company: Pinterest
Role: Data Scientist
Category: Statistics & Math
Difficulty: hard
Interview Round: Onsite
Answer all parts with formulas, numeric results, and assumptions:
A) Sample size: Baseline conversion p0=0.045, target MDE=+7% relative (p1=0.045*1.07=0.04815), two-sided alpha=0.05, power=0.90. Compute the per-variant sample size for a standard two-proportion z-test. Show the z-scores used and the pooled variance assumption.
B) Duration: With 1.2M daily visitors, 60/40 traffic split (A/B), and 80% eligibility, how many calendar days are required to reach the sample size from (A)? State any adjustments for repeat visitors and overlap with other experiments.
C) Variance reduction: If a pre-experiment covariate has R^2=0.20 with the outcome, quantify the effective MDE or sample-size reduction when using CUPED. Explain when CUPED increases bias (e.g., covariate shift).
D) Sequential testing: You plan daily peeks for 21 days. Propose an alpha-spending or group-sequential design (e.g., Pocock or O’Brien-Fleming). Specify spending function and the final critical z. Explain pros/cons vs always-valid sequential methods (SPRT/e-values).
E) Interference and clustering: When randomizing by user causes cross-unit spillovers, propose a cluster design (e.g., geo or traffic-bucket). Compute design effect for ICC=0.02 with average cluster size m=5 and m=50. How does this change the sample size?
F) SRM (sample ratio mismatch) check: On day 3 you observe 110,000 users in A and 90,000 in B (a 60/40 split of the 200,000 eligible users was expected). Perform a chi-square goodness-of-fit test and report the p-value. What actions do you take if SRM is significant?
G) Causal inference: The team ran an observational study with a strong pre-period trend. Sketch a DAG, choose an identification strategy (DID, IV, or RDD), list required assumptions (e.g., exclusion restriction for IV; continuity for RDD), and propose concrete robustness checks (placebo tests, pre-trend tests, sensitivity to unobserved confounding).
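The arithmetic in parts A, B, C, and E can be sketched in a few lines of Python. This is one possible reading of the question, not a reference solution. It assumes equal allocation for the sample-size formula, treats every eligible visitor as a new unique user (no repeat-visitor or experiment-overlap adjustment), and takes the smaller 40% arm as the binding constraint for duration. All function names are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_two_proportions(p0, p1, alpha=0.05, power=0.90):
    """Per-variant n for a two-sided two-proportion z-test (equal allocation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # power quantile
    p_bar = (p0 + p1) / 2                          # pooled rate under H0
    sd_null = math.sqrt(2 * p_bar * (1 - p_bar))   # pooled-variance term
    sd_alt = math.sqrt(p0 * (1 - p0) + p1 * (1 - p1))
    n = (z_alpha * sd_null + z_beta * sd_alt) ** 2 / (p1 - p0) ** 2
    return math.ceil(n), z_alpha, z_beta


def days_to_reach(n_per_variant, daily_visitors, eligible_share, smallest_arm_share):
    """Days until the smallest arm reaches n, assuming all visitors are unique."""
    daily_in_smallest_arm = daily_visitors * eligible_share * smallest_arm_share
    return n_per_variant / daily_in_smallest_arm


def cuped_factors(r_squared):
    """Variance (and sample-size) multiplier, and MDE multiplier, under CUPED."""
    variance_factor = 1 - r_squared
    return variance_factor, math.sqrt(variance_factor)


def design_effect(icc, cluster_size):
    """Kish design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc


n, z_a, z_b = sample_size_two_proportions(0.045, 0.045 * 1.07)
print("z_alpha/2, z_beta:", round(z_a, 4), round(z_b, 4))
print("per-variant n:", n)
print("days (40% arm):", days_to_reach(n, 1_200_000, 0.80, 0.40))
print("CUPED variance and MDE factors:", cuped_factors(0.20))
for m in (5, 50):
    deff = design_effect(0.02, m)
    print("m =", m, "design effect:", deff, "inflated n:", math.ceil(n * deff))
```

Under these assumptions the 40% arm reaches the required sample in less than one day, so a full answer would still argue for running at least one or two complete weekly cycles, and would note that the 60/40 split is slightly less efficient than 50/50 for the same total traffic.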
Quick Answer: This question evaluates a data scientist's competency in experimental design, sample-size and power calculations, variance-reduction methods (e.g., CUPED), sequential testing and alpha spending, clustering and interference effects, SRM checks, and causal identification strategies such as DID, IV, and RDD within the Statistics & Math domain.
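For the SRM check and the alpha-spending part, a minimal standard-library sketch might look like the following. It assumes a two-arm goodness-of-fit test (1 degree of freedom) and uses the Lan-DeMets O'Brien-Fleming-type spending function as one possible choice for the 21 daily looks. It only reports cumulative alpha spent per look; the exact final critical z depends on the joint distribution of the sequential statistics and needs numerical integration or a dedicated group-sequential package, so it is not computed here. Function names are hypothetical.

```python
import math
from statistics import NormalDist


def srm_chi_square(observed, expected_shares):
    """Chi-square goodness-of-fit test for a two-arm traffic split (1 df)."""
    total = sum(observed)
    expected = [total * share for share in expected_shares]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def obf_type_alpha_spent(t, alpha=0.05):
    """Cumulative two-sided alpha spent at information fraction t in (0, 1].

    Lan-DeMets O'Brien-Fleming-type function: 2 - 2 * Phi(z_{alpha/2} / sqrt(t)),
    written with erfc to stay accurate for very small tail probabilities.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return math.erfc(z / math.sqrt(2 * t))


chi2, p = srm_chi_square([110_000, 90_000], [0.6, 0.4])
print("SRM chi-square:", round(chi2, 2), "p-value:", p)

looks = 21  # assumes equal information accrues each day
for k in (1, 7, 14, 21):
    print("look", k, "cumulative alpha spent:", obf_type_alpha_spent(k / looks))
```

The chi-square statistic here is above 2,000, so the p-value is effectively zero and the split is clearly off. The spending schedule shows why an O'Brien-Fleming-type design is conservative early: almost no alpha is spent in the first week, leaving the final look close to the fixed-sample threshold.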