Design and critique teen-parent impact experiment
Company: Meta
Role: Data Scientist
Category: Analytics & Experimentation
Difficulty: hard
Interview Round: Technical Screen
Meta plans to let parents register and link to their teen’s account. Leaders worry about negative impacts on teens (e.g., well-being, engagement quality). Design a study to estimate the causal effect of parental registration on teens.
Answer the following precisely:
1) Define success metrics and guardrails: list primary/secondary outcomes, the expected direction of each, and minimum detectable effect (MDE) assumptions. Include concrete examples (e.g., time on platform, harmful-content impressions, report/mute rates, session length volatility, churn).
2) Experimental design: choose the unit of randomization (teen, parent, household, or cluster by social graph) and justify it under potential interference/SUTVA violations (e.g., messages between linked family members, peer spillovers). Propose a practical assignment mechanism (e.g., encouragement design, stepped-wedge rollout) and the exposure definition.
3) Power and duration: outline sample sizing inputs, expected compliance/take-up, and how you’d handle staggered adoption and late joiners.
4) Measurement and attribution: specify intent-to-treat (ITT) vs. treatment-on-the-treated (TOT) estimands and the handling of partial compliance, mislinking, and attrition. Propose CUPED/covariate adjustment to improve precision.
5) Threats to validity and mitigations: identify at least five concrete risks (e.g., selection bias of families who opt in, network interference, policy-induced behavior changes, seasonality/back-to-school, measurement error in well-being proxies). For each, give a mitigation (e.g., household-level randomization, graph clustering, pre-exposure matching, difference-in-differences with teen fixed effects, synthetic controls, instrumental variables, exclusion windows).
6) If an RCT is infeasible, propose a credible quasi-experiment: specify the design (e.g., regression discontinuity at age thresholds, instrument via exogenous invite timing, diff-in-diff on the same teens before/after with matched controls), identification assumptions, diagnostics, and robustness checks.
7) Ethics and safety: define eligibility filters, safety holdouts, monitoring, and stop conditions for adverse outcomes. Describe how you’d communicate results and make a launch decision under uncertainty.
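The sample-sizing inputs asked for in part 3 can be sketched as a short calculation. This is a minimal illustration, not a prescribed method: it assumes an encouragement design with equal arms, a two-sided z-test on a difference in means, and one-sided noncompliance (control households cannot link), so the ITT effect is the effect on linked teens scaled by take-up. The function name `n_per_arm` and the parameters `sigma`, `mde_tot`, `takeup`, and `design_effect` are hypothetical names introduced here.

```python
import math
from statistics import NormalDist


def n_per_arm(sigma, mde_tot, takeup, alpha=0.05, power=0.8, design_effect=1.0):
    """Units per arm for a two-sided z-test on a difference in means.

    sigma         : outcome standard deviation (assumed equal in both arms)
    mde_tot       : smallest effect on linked teens we want to detect
    takeup        : share of encouraged households that actually link
    design_effect : variance inflation from cluster randomization (1.0 = none)
    """
    if not 0 < takeup <= 1:
        raise ValueError("takeup must be in (0, 1]")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # With no linking in control, the ITT effect is diluted by take-up.
    itt_effect = mde_tot * takeup
    n = 2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / itt_effect ** 2
    return math.ceil(n * design_effect)


full_compliance = n_per_arm(sigma=1.0, mde_tot=0.1, takeup=1.0)
quarter_takeup = n_per_arm(sigma=1.0, mde_tot=0.1, takeup=0.25)
```

Because the required sample scales with 1 / takeup squared, 25% take-up needs roughly 16 times the sample of full compliance for the same effect on linked teens, which is why expected take-up is a first-order input to duration.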
Quick Answer: This question evaluates a data scientist's competency in causal inference and experimental design for measuring the impact of parental registration and account linking on teen outcomes, including metric definition, power analysis, compliance, measurement, and ethical safeguards.
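As one way to make the measurement step in part 4 concrete, the sketch below applies a CUPED adjustment using a pre-period covariate and then computes the ITT estimate and a Wald-style TOT estimate. It is an illustration under stated assumptions: `cuped_adjust` and `itt_and_tot` are hypothetical helper names, the covariate is measured before assignment, and the ratio estimator equals the TOT only when control teens cannot be linked (otherwise it is a complier average effect).

```python
from statistics import fmean


def cuped_adjust(y, x):
    """Return y minus theta * (x - mean(x)), with theta = cov(x, y) / var(x).

    x must be a pre-assignment covariate (e.g., pre-period time on platform).
    """
    x_bar, y_bar = fmean(x), fmean(y)
    cov = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    var = sum((xi - x_bar) ** 2 for xi in x)
    theta = cov / var if var > 0 else 0.0
    return [yi - theta * (xi - x_bar) for xi, yi in zip(x, y)]


def itt_and_tot(y, assigned, linked):
    """ITT = difference in means by assignment; TOT = ITT / take-up gap."""
    treat = [yi for yi, a in zip(y, assigned) if a]
    ctrl = [yi for yi, a in zip(y, assigned) if not a]
    itt = fmean(treat) - fmean(ctrl)
    take_t = fmean([l for l, a in zip(linked, assigned) if a])
    take_c = fmean([l for l, a in zip(linked, assigned) if not a])
    first_stage = take_t - take_c
    tot = itt / first_stage if first_stage != 0 else float("nan")
    return itt, tot


# Toy data (made up for illustration): 4 encouraged teens, 4 controls.
pre = [1, 2, 3, 4, 1, 2, 3, 4]        # pre-period covariate
assigned = [1, 1, 1, 1, 0, 0, 0, 0]   # randomized encouragement
linked = [1, 1, 0, 0, 0, 0, 0, 0]     # parent actually linked
outcome = [3, 4, 3, 4, 1, 2, 3, 4]    # post-period outcome

itt, tot = itt_and_tot(cuped_adjust(outcome, pre), assigned, linked)
```

Analyzing by assignment keeps the comparison randomized even when only some encouraged parents link, and the CUPED step only removes variance explained by pre-period behavior, so it does not change what the ITT estimates in expectation.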