Design a study to compare social vs game engagement
Company: Meta
Role: Data Scientist
Category: Analytics & Experimentation
Difficulty: Medium
Interview Round: Onsite
Hypothesis: Users who use the 'social' category are more regularly engaged than users who use the 'game' category. Using data from 2025-08-04 to 2025-09-01, design a rigorous analysis plan to evaluate this claim. Answer all parts:
1) Define 'regularly engaged' precisely (e.g., ≥10 active days in 28 days) and choose a primary metric and 2–3 guardrail metrics. Justify each choice and propose an analysis-ready metric definition that is robust to outliers and seasonality.
2) Recommend an ideal randomized experiment to test the hypothesis (e.g., onboarding nudge that shifts first-week exposure to social vs game). Describe randomization unit, stratification variables, primary analysis, guardrails, and how you will prevent contamination and novelty effects.
3) If randomization is infeasible and you must use observational data, propose a causal inference approach (e.g., propensity score weighting or matching) specifying covariates to control (tenure, device, country, acquisition channel, baseline activity, weekday mix, etc.), diagnostics to validate overlap/balance, and a sensitivity analysis for unobserved confounding.
4) Powering: Suppose in 'game_only' users the baseline share that is 'regularly engaged' is p0 = 0.35. You want to detect a +3 percentage point absolute lift (MDE = 0.03) with α = 0.05 (two-sided) and 1−β = 0.80. Compute the required sample size per group for a two-proportion Z-test and state all formulas/assumptions.
5) Estimation and inference: Describe the primary estimator and statistical test you will use (e.g., difference in proportions on user-level outcomes, with cluster-robust SEs if the analysis rows are finer than the randomization unit; or a logistic regression with covariates and robust SEs). Explain how you would handle multiple comparisons and interim looks.
6) Threats to validity: List at least five concrete risks (e.g., reverse causality—more engaged users choose social; misclassification of category; taxonomy drift; bots/multi-accounts; seasonality; geographic shocks) and how you would detect/mitigate each.
7) Decision-making: Define a clear decision rule using the primary metric, practical significance threshold, and guardrails. Include how you would communicate results to product and what follow-up you would run if the effect is heterogeneous across tenure cohorts.
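For part 1, one possible analysis-ready metric definition is sketched below. It assumes the date range is half-open, so 2025-08-04 (a Monday) through 2025-08-31 gives exactly four full weeks and every weekday appears four times. It also assumes activity arrives as (user_id, date) pairs and that `cohort` lists the users eligible at the start of the window. All function and variable names are illustrative, and the 10-day threshold is simply the example given in the prompt.

```python
from collections import defaultdict
from datetime import date, timedelta

WINDOW_START = date(2025, 8, 4)   # a Monday (assumed inclusive start)
WINDOW_DAYS = 28                  # four full weeks, so the weekday mix is balanced
MIN_ACTIVE_DAYS = 10              # example threshold from the prompt


def regularly_engaged(cohort, events, start=WINDOW_START,
                      days=WINDOW_DAYS, min_active_days=MIN_ACTIVE_DAYS):
    """Return {user_id: bool} for every user in `cohort`.

    `events` is an iterable of (user_id, activity_date) pairs. Each
    user-day counts at most once, so one extremely heavy day cannot
    inflate the metric; users with no activity count as not engaged.
    """
    end = start + timedelta(days=days)        # half-open window [start, end)
    active_days = defaultdict(set)
    for user_id, day in events:
        if start <= day < end:
            active_days[user_id].add(day)
    return {u: len(active_days[u]) >= min_active_days for u in cohort}


def engaged_share(flags):
    """Primary metric: share of cohort users flagged as regularly engaged."""
    return sum(flags.values()) / len(flags) if flags else float("nan")
```

Counting distinct active days rather than events or time spent is what makes this robust to outliers; the same function can be called per category cohort and the shares compared.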
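For part 2, a common way to randomize at the user level is a salted hash of the user ID, sketched below. The arm names and the salt are hypothetical placeholders. Note that a plain hash does not stratify on its own: balance across strata such as tenure or country would still need to be checked after assignment, or the strata included as covariates in the analysis.

```python
import hashlib

ARMS = ("social_nudge", "game_nudge")   # hypothetical arm names
SALT = "onboarding-nudge-v1"            # hypothetical experiment identifier


def assign_arm(user_id, salt=SALT, arms=ARMS):
    """Deterministically map a user to an arm via a salted SHA-256 hash.

    The same user always lands in the same arm on every device and
    session, which limits cross-arm contamination within a user.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    return arms[int(digest, 16) % len(arms)]
```

Changing the salt for each new experiment keeps assignments independent across experiments.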
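For part 3, two of the diagnostics can be sketched in a few lines: the (optionally weighted) standardized mean difference for covariate balance, and the E-value as a sensitivity measure for unobserved confounding. This is a minimal sketch, not a full propensity-score pipeline. The weights are assumed to come from a propensity model fitted elsewhere, the pooled-SD convention used here is one of several in use, and the function names are illustrative.

```python
import math


def weighted_mean_var(x, w):
    """Weighted mean and (population-style) weighted variance."""
    total = sum(w)
    mean = sum(wi * xi for xi, wi in zip(x, w)) / total
    var = sum(wi * (xi - mean) ** 2 for xi, wi in zip(x, w)) / total
    return mean, var


def standardized_mean_difference(x_treated, x_control,
                                 w_treated=None, w_control=None):
    """SMD for one covariate; pass propensity weights to check balance after weighting."""
    w_treated = w_treated or [1.0] * len(x_treated)
    w_control = w_control or [1.0] * len(x_control)
    m1, v1 = weighted_mean_var(x_treated, w_treated)
    m0, v0 = weighted_mean_var(x_control, w_control)
    pooled_sd = math.sqrt((v1 + v0) / 2)
    return (m1 - m0) / pooled_sd if pooled_sd > 0 else 0.0


def e_value(risk_ratio):
    """E-value for a risk ratio: RR + sqrt(RR * (RR - 1)), using 1/RR if RR < 1."""
    rr = risk_ratio if risk_ratio >= 1 else 1 / risk_ratio
    return rr + math.sqrt(rr * (rr - 1))
```

A frequently used rule of thumb treats an absolute SMD below 0.1 on every covariate as acceptable balance. The E-value gives the minimum strength of association, on the risk-ratio scale, that an unmeasured confounder would need with both category choice and engagement to fully explain away the observed effect.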
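For part 4, the calculation can be checked with a short script. It uses one standard form of the two-proportion sample-size formula, n = (z_{1-α/2}·sqrt(2·p̄·(1−p̄)) + z_{1−β}·sqrt(p0(1−p0) + p1(1−p1)))² / (p1 − p0)², where p1 = p0 + MDE and p̄ = (p0 + p1)/2. The assumptions are equal allocation, independent users, the normal approximation, and no continuity correction; the function name is illustrative.

```python
import math
from statistics import NormalDist


def n_per_group(p0, p1, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sided two-proportion z-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)            # about 0.84 for 80% power
    p_bar = (p0 + p1) / 2
    # Variance under H0 (pooled) and under H1 (unpooled)
    null_term = z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
    alt_term = z_beta * math.sqrt(p0 * (1 - p0) + p1 * (1 - p1))
    return math.ceil((null_term + alt_term) ** 2 / (p1 - p0) ** 2)


n = n_per_group(0.35, 0.38)
print(n)
```

With p0 = 0.35 and p1 = 0.38 this works out to roughly 4,040 users per group, so a little over 8,000 in total. A variant that uses only the unpooled variance gives nearly the same number.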
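For part 5, a minimal sketch of the primary test and one multiple-comparison correction follows. It assumes one binary outcome per user, so no clustering is needed. It uses the pooled standard error for the z-test, the unpooled one for the confidence interval, and Holm's step-down method for the secondary comparisons. Interim looks are not handled here; they would need a group-sequential or alpha-spending design on top of this. The function names are illustrative.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def two_proportion_ztest(x1, n1, x0, n0, alpha=0.05):
    """Return (difference, two-sided p-value, confidence interval).

    x1/n1 are engaged users / total users in the treated group,
    x0/n0 the same for control. Assumes 0 < pooled rate < 1.
    """
    p1, p0 = x1 / n1, x0 / n0
    diff = p1 - p0
    pooled = (x1 + x0) / (n1 + n0)
    se_test = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n0))
    z = diff / se_test
    p_value = 2 * (1 - _STD_NORMAL.cdf(abs(z)))
    se_ci = math.sqrt(p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0)
    half = _STD_NORMAL.inv_cdf(1 - alpha / 2) * se_ci
    return diff, p_value, (diff - half, diff + half)


def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[i])
        running_max = max(running_max, candidate)   # enforce monotonicity
        adjusted[i] = running_max
    return adjusted
```

One reasonable convention is to leave the single pre-registered primary metric unadjusted and apply the Holm correction only across guardrail and subgroup comparisons.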
Quick Answer: This question tests experimental design, causal inference, metric selection and robustness, power calculation, and statistical estimation, all core competencies in data science and product analytics.