Recover causal effect without a control group
Company: Pinterest
Role: Data Scientist
Category: Analytics & Experimentation
Difficulty: hard
Interview Round: Onsite
An intern launched an A/B experiment but forgot to allocate a control group; all eligible users received Treatment for 5 days (T period). You have 4 weeks of pre-period data (P period) with the same eligibility rules and a stable product. Primary metric: 1-day retention; guardrails: crashes/session, latency p95, purchase conversion.
Tasks:
1) Propose and compare at least two identification strategies to estimate the treatment effect using observational methods: (a) Pre-post comparison with CUPED-style covariate adjustment; (b) Synthetic-control-style comparison group built by matching or propensity-score weighting (PSW) against ineligible-but-similar users or delayed-exposure users; (c) Difference-in-differences using a holdout geography. For (b), specify covariates, overlap checks, and diagnostics (standardized mean differences (SMD), empirical CDF (eCDF) comparisons, weight trimming).
2) State the assumptions required for each method (e.g., parallel trends, no interference, ignorability) and design falsification/placebo tests to probe them.
3) Explain how you would compute the average treatment effect on the treated (ATT) versus the average treatment effect (ATE), handle calendar effects and novelty/seasonality, and quantify uncertainty (cluster-robust SEs or bootstrap under weighting).
4) List pitfalls of the original A/B setup that led to this failure and propose a prevention plan (exposure checks, invariant metrics, automated power and allocation validation).
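The weighting and balance diagnostics named in tasks 1 and 3 can be sketched as follows. This is a minimal illustration, not a reference implementation: the function names `ipw_weights` and `weighted_smd`, the clipping bounds, and the simulated `sessions` covariate are all assumptions introduced here. Propensity scores are clipped before weights are formed, which is one simple form of weight truncation; the SMD uses the unweighted pooled standard deviation, a common but not universal convention.

```python
import numpy as np

def ipw_weights(propensity, treated, estimand="ATT", clip=(0.05, 0.95)):
    """Inverse-probability weights from estimated propensity scores.

    ATT: treated units get weight 1, comparison units get e / (1 - e).
    ATE: treated units get 1 / e, comparison units get 1 / (1 - e).
    Scores are clipped to `clip` first so that no weight explodes.
    """
    e = np.clip(np.asarray(propensity, dtype=float), clip[0], clip[1])
    t = np.asarray(treated, dtype=bool)
    if estimand == "ATT":
        return np.where(t, 1.0, e / (1.0 - e))
    if estimand == "ATE":
        return np.where(t, 1.0 / e, 1.0 / (1.0 - e))
    raise ValueError("estimand must be 'ATT' or 'ATE'")

def weighted_smd(x, treated, weights):
    """Standardized mean difference of covariate x after weighting."""
    x = np.asarray(x, dtype=float)
    t = np.asarray(treated, dtype=bool)
    w = np.asarray(weights, dtype=float)
    mean_t = np.average(x[t], weights=w[t])
    mean_c = np.average(x[~t], weights=w[~t])
    # Pooled SD from the unweighted groups (assumed convention).
    pooled_sd = np.sqrt((x[t].var(ddof=1) + x[~t].var(ddof=1)) / 2.0)
    return (mean_t - mean_c) / pooled_sd

# Simulated illustration: exposure depends on a hypothetical covariate.
rng = np.random.default_rng(0)
n = 5000
sessions = rng.normal(0.0, 1.0, n)            # standardized pre-period sessions
e_true = 1.0 / (1.0 + np.exp(-sessions))      # known only because this is simulated
treated = rng.random(n) < e_true
w_att = ipw_weights(e_true, treated, "ATT")
print("SMD before weighting:", weighted_smd(sessions, treated, np.ones(n)))
print("SMD after ATT weighting:", weighted_smd(sessions, treated, w_att))
```

In practice the propensity would be estimated from pre-period covariates rather than known, and the SMD would be reported for every covariate, with values near zero after weighting taken as evidence of balance.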
Quick Answer: This question evaluates a candidate's competence in causal inference, observational estimation, and experiment analytics, specifically identification strategies, causal assumptions, validation/placebo tests, diagnostics, and uncertainty quantification after an accidental full rollout.
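A placebo-in-time check with a user-level cluster bootstrap is one way to combine the validation and uncertainty steps mentioned above. The sketch below is illustrative only: `weighted_diff`, `cluster_bootstrap_ci`, and the simulated data are assumptions made here. It holds the weights fixed across resamples; a fuller version would refit the propensity model inside each replicate, so this simplification likely understates uncertainty.

```python
import numpy as np

def weighted_diff(y, group, w):
    """Weighted mean of y where group is True minus where it is False."""
    return (np.average(y[group], weights=w[group])
            - np.average(y[~group], weights=w[~group]))

def cluster_bootstrap_ci(y, group, w, cluster, n_boot=500, alpha=0.05, seed=0):
    """Percentile CI that resamples whole clusters (e.g. users) with replacement."""
    y = np.asarray(y, dtype=float)
    group = np.asarray(group, dtype=bool)
    w = np.asarray(w, dtype=float)
    ids, inverse = np.unique(np.asarray(cluster), return_inverse=True)
    rows_by_cluster = [np.flatnonzero(inverse == k) for k in range(len(ids))]
    rng = np.random.default_rng(seed)
    draws = []
    for _ in range(n_boot):
        picked = rng.integers(0, len(ids), size=len(ids))
        rows = np.concatenate([rows_by_cluster[k] for k in picked])
        g = group[rows]
        if g.all() or not g.any():
            continue  # skip resamples that lost one of the two groups
        draws.append(weighted_diff(y[rows], g, w[rows]))
    lo, hi = np.quantile(draws, [alpha / 2.0, 1.0 - alpha / 2.0])
    return weighted_diff(y, group, w), lo, hi

# Placebo in time: pretend the launch happened inside the pre-period.
# Simulated data with no true effect; two rows per hypothetical user.
rng = np.random.default_rng(1)
n_users = 400
user = np.repeat(np.arange(n_users), 2)
fake_post = np.tile([False, True], n_users)   # placebo "treatment" window
retained = rng.binomial(1, 0.4, size=user.size).astype(float)
est, lo, hi = cluster_bootstrap_ci(retained, fake_post, np.ones(user.size), user)
print(f"placebo effect: {est:.3f}, 95% CI: [{lo:.3f}, {hi:.3f}]")
```

If the placebo interval excludes zero, the pre-post or comparison-group design is picking up something other than the treatment (calendar effects, drift), and the real estimate should be treated with corresponding caution.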