Design an experiment for Group Calls under network interference
Company: Meta
Role: Data Scientist
Category: Analytics & Experimentation
Difficulty: hard
Interview Round: Technical Screen
Your team is adding a Group Calls feature to a 1:1 calling app. Design a robust experiment to measure its impact under likely network interference. Address:
(a) Select exactly one primary metric and 3–5 guardrails; explain when "total call volume" is misleading (e.g., cannibalization, unequal exposure) and when it could be acceptable; propose user-normalized or rate-based alternatives.
(b) Choose a randomization unit (e.g., graph clusters, ego-clusters, geo switchbacks) and a rollout plan that mitigates interference; compare bias, variance, and engineering complexity trade-offs.
(c) If cluster-based randomization is not feasible, propose a quasi-experimental plan, e.g., invitation-gated treatment, time-based/switchback assignment, exposure-weighted estimators, IV using staggered access, or difference-in-differences with pre-periods; list the assumptions and the diagnostics you will run (placebo tests, balance checks, spillover detection).
(d) Define exposure, triggers, and exclusion criteria (e.g., creators vs joiners; first exposure vs ever-exposed); detail contamination controls (e.g., invite tokens only visible to treated egos).
(e) Pre-register the analysis, specify the minimal detectable effect with plausible baselines, state seasonality controls, and explain how you'll handle novelty and ramp-up effects.
Quick Answer: This question evaluates a data scientist's competency in experimental design and causal inference for networked features. It covers metric selection (one primary metric plus guardrails), interference mitigation through randomization or quasi-experimental approaches, exposure and contamination controls, rollout planning, and pre-registration with power analysis. Commonly asked in Analytics & Experimentation interviews for Data Scientist roles, it assesses reasoning about bias–variance trade-offs, engineering constraints, and validity diagnostics in interconnected user environments, and it requires both a conceptual understanding of identification and the practical ability to implement an experiment.
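Parts (b) and (e) interact: randomizing by cluster reduces interference bias but inflates variance, which raises the minimal detectable effect (MDE). The sketch below illustrates one common way to reason about this, assuming a two-arm test with equal allocation, a normal approximation, roughly equal cluster sizes, and the standard design effect 1 + (m - 1) * ICC. The function names and every numeric input (standard deviation, users per arm, cluster size, ICC) are hypothetical placeholders, not baselines from the question.

```python
from math import sqrt
from statistics import NormalDist


def design_effect(avg_cluster_size: float, icc: float) -> float:
    """Variance inflation from cluster randomization: 1 + (m - 1) * ICC.

    Assumes roughly equal cluster sizes; icc is the intra-cluster correlation
    of the metric (hypothetical input, to be estimated from pre-period data).
    """
    return 1.0 + (avg_cluster_size - 1.0) * icc


def mde_absolute(
    sd: float,
    users_per_arm: float,
    avg_cluster_size: float = 1.0,
    icc: float = 0.0,
    alpha: float = 0.05,
    power: float = 0.80,
) -> float:
    """Approximate absolute MDE for a two-sided, two-arm difference in means."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_power = NormalDist().inv_cdf(power)
    # Clustering shrinks the effective sample size by the design effect.
    effective_n = users_per_arm / design_effect(avg_cluster_size, icc)
    return (z_alpha + z_power) * sd * sqrt(2.0 / effective_n)


if __name__ == "__main__":
    # Hypothetical inputs: per-user metric SD of 1.0 and 10,000 users per arm.
    user_level = mde_absolute(sd=1.0, users_per_arm=10_000)
    clustered = mde_absolute(
        sd=1.0, users_per_arm=10_000, avg_cluster_size=51, icc=0.02
    )
    print(f"user-level MDE:    {user_level:.4f}")
    print(f"cluster-level MDE: {clustered:.4f}")
    print(f"inflation factor:  {clustered / user_level:.3f}")
```

Under these assumptions the MDE grows with the square root of the design effect, so even a small ICC can matter when clusters are large. In practice the ICC and metric variance would be estimated from pre-period data, and unequal cluster sizes would call for a more careful variance estimate than this approximation.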