Estimate Portal’s causal lift on video-call usage
Company: Meta
Role: Data Scientist
Category: Statistics & Math
Difficulty: Medium
Interview Round: Technical Screen
Define and estimate the causal effect of purchasing a Meta Portal on users’ video‑calling behavior. Today is 2025-09-01, and only a small fraction of users purchase Portal.
1) Estimand: choose a primary outcome (e.g., weekly call minutes on any device per user) and state the causal estimand (the average treatment effect on the treated, ATT, i.e., the effect among buyers). Specify pre/post windows relative to each buyer’s purchase date, e.g., pre: [−28, −1] days; post: [+1, +28] days.
2) Observational design: propose a staggered‑adoption difference‑in‑differences/event‑study with user and time fixed effects. Write the regression specification, list assumptions (parallel trends, no anticipation), and describe pre‑trend and placebo checks.
3) Selection control: design a propensity model using pre‑purchase behavior (call frequency, recipient diversity, device modality, country, age buckets), do matching/weighting (exact on country, caliper on propensity, overlap trimming), and compare top‑decile matching vs full IPW in terms of bias/variance.
4) Small treated share: perform a back‑of‑envelope power analysis assuming 1% of users are treated, a baseline weekly mean of 30 minutes (SD 60), and a minimum detectable effect (MDE) of 10% relative, i.e., 3 minutes per week. Show how CUPED using pre‑period outcomes reduces variance and affects the required sample size.
5) Robustness: address survivorship and novelty effects, seasonality, clustering by household, and measurement noise in durations. Include sensitivity analysis (e.g., Rosenbaum bounds) and negative‑control outcomes.
6) Reporting: define uncertainty (cluster‑robust CIs), heterogeneity (by prior usage and country), and decision thresholds for launch/scale.
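For part 2, one common way to write the event‑study specification is sketched below. The notation is an assumption introduced here for illustration and does not come from the prompt: Y_{it} is the outcome for user i in calendar week t, E_i is user i's purchase week, \alpha_i and \lambda_t are user and week fixed effects, and k indexes event time with k = -1 as the omitted reference period. Never‑buyers have all event‑time indicators set to zero.

$$
Y_{it} = \alpha_i + \lambda_t + \sum_{k \neq -1} \beta_k \, \mathbf{1}[\, t - E_i = k \,] + \varepsilon_{it}
$$

Under parallel trends and no anticipation, the pre‑period coefficients \beta_k (k < -1) should be close to zero, which is what the pre‑trend check tests, and an ATT over the post window can be summarized as the average of the post‑period \beta_k. With staggered adoption and heterogeneous effects, a plain two‑way fixed‑effects fit can mix in already‑treated users as controls, so a candidate may prefer a cohort‑based estimator (for example, Callaway and Sant'Anna or Sun and Abraham) as a cross‑check.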
Quick Answer: This question evaluates applied causal inference and statistical analysis skills, including defining estimands, designing staggered-adoption difference‑in‑differences/event‑study regressions with fixed effects, propensity‑based selection control and matching/weighting, power analysis for low treated share, and robustness and sensitivity checks.
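For part 4, the back‑of‑envelope arithmetic can be sketched as follows. This is a minimal illustration, not a definitive answer: it assumes a simple two‑sample z‑test on means with equal variance in both groups, a two‑sided alpha of 0.05, and 80% power, none of which the prompt specifies. The function name `required_total_n` and the pre/post correlation `rho = 0.7` are hypothetical; CUPED is modeled only through its standard variance factor (1 − rho²).

```python
from statistics import NormalDist


def required_total_n(sd, mde_abs, treated_share, alpha=0.05, power=0.80, rho=0.0):
    """Total users needed for a two-sided, two-sample z-test on means.

    Var(difference in means) = sd^2 / (n * p * (1 - p)) for treated share p.
    rho is the ASSUMED correlation between the pre-period covariate and the
    outcome; CUPED scales the outcome variance by (1 - rho^2).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    adjusted_var = sd ** 2 * (1 - rho ** 2)
    p = treated_share
    return (z_alpha + z_beta) ** 2 * adjusted_var / (p * (1 - p) * mde_abs ** 2)


# Inputs taken from the prompt: 1% treated, mean 30, SD 60, MDE 10% relative.
baseline_mean, sd, treated_share = 30.0, 60.0, 0.01
mde_abs = 0.10 * baseline_mean  # 3 minutes per week

n_plain = required_total_n(sd, mde_abs, treated_share)
n_cuped = required_total_n(sd, mde_abs, treated_share, rho=0.7)  # rho is assumed

print(f"Without CUPED: total n ~ {n_plain:,.0f}, treated ~ {n_plain * treated_share:,.0f}")
print(f"With CUPED (rho=0.7): total n ~ {n_cuped:,.0f}, treated ~ {n_cuped * treated_share:,.0f}")
```

Under these assumptions the required sample is on the order of a few hundred thousand users in total, with a few thousand buyers, and CUPED shrinks it in proportion to (1 − rho²). Household clustering, matching or weighting, and trimming for overlap would all reduce the effective sample size, so the figure is best read as a lower bound.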