Design experiments and observational alternatives
Company: Meta
Role: Data Scientist
Category: Analytics & Experimentation
Difficulty: hard
Interview Round: Onsite
Part A: Stories consumption. Data show higher story consumption on Facebook than Instagram.
1) Precisely define consumption (choose one primary, one secondary metric) and justify trade-offs.
2) List three falsifiable hypotheses (e.g., UI differences, ranking policy, notifications).
3) Design an A/B test on Instagram to test a UI change intended to raise consumption: target, unit of randomization, exposure rules, primary success metric, guardrails, sample-size/power inputs (MDE, baseline, variance), and runtime.
4) Preempt two common pitfalls (novelty, interference/cross-app contamination) and propose instrumentation to detect them.
5) If lift is observed but retention falls in a 14-day follow-up, show how you'd decide ship/no-ship using a decision framework (e.g., expected value with risk bounds) and pre-registered tie-breakers.
6) After overall results, go deep on user segmentation: choose one segment, define why it is behaviorally distinct, and specify how you avoid p-hacking when slicing.
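The sample-size and runtime inputs asked for in Part A, item 3 can be tied together with a short calculation. The following is a minimal sketch, assuming a two-sided, two-sample z-test on a per-user mean with equal allocation and user-level randomization. The function names, the 14-day minimum, and every number in the usage lines are hypothetical illustrations, not values from the question.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(baseline_mean, std_dev, relative_mde,
                        alpha=0.05, power=0.80):
    """Approximate users per arm for a two-sided two-sample z-test on means.

    Assumes equal arm sizes and the same standard deviation in both arms.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for target power
    delta = baseline_mean * relative_mde           # absolute MDE
    n = 2 * ((z_alpha + z_beta) * std_dev / delta) ** 2
    return math.ceil(n)


def runtime_days(n_per_arm, daily_new_users, traffic_fraction, min_days=14):
    """Days needed to enroll both arms, rounded up to whole weeks.

    daily_new_users: first-time eligible users per day (hypothetical input).
    traffic_fraction: share of eligible users enrolled in the experiment.
    min_days: floor on runtime so novelty effects have time to decay
              (the 14-day default is an assumption, not a rule).
    """
    days = math.ceil(2 * n_per_arm / (daily_new_users * traffic_fraction))
    weeks = math.ceil(max(days, min_days) / 7)  # cover full weekly cycles
    return weeks * 7


# Illustrative inputs only: 10 stories viewed per user per day,
# standard deviation 15, and a 1% relative MDE.
n = sample_size_per_arm(baseline_mean=10.0, std_dev=15.0, relative_mde=0.01)
days = runtime_days(n, daily_new_users=100_000, traffic_fraction=0.5)
print(n, days)
```

Because required sample size scales with the inverse square of the MDE, halving the MDE roughly quadruples the users needed per arm. That trade-off is usually what drives the runtime discussion in an interview answer.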
Part B: When A/B is infeasible. Parents joining seems to reduce teen usage, but you cannot randomize.
1) Propose an observational design (choose one: difference-in-differences, propensity scores + weighting, or synthetic control). State the identification assumptions and how you would test pre-trends/overlap.
2) Define treatment, outcome, time windows, and covariates you need (include engagement history and social graph features).
3) Outline the analysis steps end-to-end, including diagnostics (balance, event-study plots, placebo tests) and a sensitivity analysis (e.g., Rosenbaum bounds).
4) Describe how you'd communicate residual uncertainty and make a product decision under imperfect identification.
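If difference-in-differences is the design chosen in Part B, the core estimate and a simple placebo check on pre-trends can be sketched as below. This is a toy two-group, two-period version under the parallel-trends assumption. The data are synthetic, and the period labels and function name are hypothetical. A real analysis would add covariates, clustered standard errors, and a full event-study plot.

```python
from statistics import mean


def did_estimate(rows, pre_period, post_period):
    """Two-by-two difference-in-differences on group means.

    rows: iterable of (treated: bool, period: str, outcome: float).
    Returns (treated change) minus (control change) between the two periods.
    """
    def cell_mean(treated, period):
        return mean(y for t, p, y in rows if t == treated and p == period)

    treated_change = cell_mean(True, post_period) - cell_mean(True, pre_period)
    control_change = cell_mean(False, post_period) - cell_mean(False, pre_period)
    return treated_change - control_change


# Synthetic weekly usage minutes per teen (made-up numbers for illustration).
# treated=True means a parent joined between "pre1" and "post".
rows = [
    (True, "pre2", 60.0), (True, "pre2", 70.0),
    (True, "pre1", 62.0), (True, "pre1", 72.0),
    (True, "post", 50.0), (True, "post", 60.0),
    (False, "pre2", 40.0), (False, "pre2", 50.0),
    (False, "pre1", 42.0), (False, "pre1", 52.0),
    (False, "post", 41.0), (False, "post", 51.0),
]

effect = did_estimate(rows, pre_period="pre1", post_period="post")
# Placebo: pretend treatment happened one period earlier. A value near zero
# is consistent with (but does not prove) parallel pre-trends.
placebo = did_estimate(rows, pre_period="pre2", post_period="pre1")
print(effect, placebo)
```

A placebo estimate far from zero would suggest the groups were already diverging before parents joined, in which case the main estimate should not be read as causal without further adjustment.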
Quick Answer: This question evaluates causal inference, experimental design, metric definition and measurement, power analysis, segmentation, and observational study methods within product analytics.