Evaluate friend-interaction feature with network interference
Company: Roblox
Role: Data Scientist
Category: Analytics & Experimentation
Difficulty: hard
Interview Round: HR Screen
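You plan to ship a “friend interaction boost” that ranks feed content higher when friends interact with it. Because interactions propagate through the social graph, standard user-level randomization violates SUTVA (the stable unit treatment value assumption): a control user's feed and behavior can change because their treated friends interact differently.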
Design an experiment that credibly estimates causal lift:
1) Randomization unit: Choose and defend one of graph-cluster randomization (e.g., Louvain clusters), ego-network clustering, or geo/time switchbacks. How will you quantify and cap the cross-cluster edge-cut ratio and exposure contamination?
2) Metrics: Define primary (session time, meaningful interactions) and guardrail metrics (spam reports, creator revenue cannibalization). Specify exposure-weighted metrics for partially treated users.
3) Power: Given K clusters of average size m and intracluster correlation ρ, derive the effective sample size n_eff ≈ (K·m) / [1+(m−1)ρ]. Show how this changes the required duration relative to a user-level A/B test.
4) Analysis: Outline cluster-robust variance estimation, CUPED with pre-period outcomes, and intent-to-treat vs exposure-on-treated estimands. How do you handle creators whose audiences span treatment/control clusters?
5) Diagnostics & fallbacks: Pre-commit spillover checks, negative controls, and a holdout of high-degree nodes. If contamination is too high mid-test, propose a redesign that preserves inference while limiting blast radius.
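As a rough illustration of the power calculation in item 3, the sketch below computes the design effect 1+(m−1)ρ and the resulting effective sample size. It assumes roughly equal cluster sizes. The function names and the example values of K, m, and ρ are hypothetical and chosen only for illustration.

```python
import math


def design_effect(m: float, rho: float) -> float:
    """Variance inflation factor 1 + (m - 1) * rho for clusters of average size m."""
    return 1.0 + (m - 1.0) * rho


def effective_sample_size(K: int, m: float, rho: float) -> float:
    """n_eff ~= (K * m) / [1 + (m - 1) * rho], assuming roughly equal cluster sizes."""
    return (K * m) / design_effect(m, rho)


def users_needed_cluster_design(n_user_level: float, m: float, rho: float) -> int:
    """Raw users needed under cluster randomization to match the precision of a
    user-level test that would need n_user_level users."""
    return math.ceil(n_user_level * design_effect(m, rho))


# Hypothetical inputs, for illustration only.
K, m, rho = 2000, 500, 0.01
deff = design_effect(m, rho)
n_eff = effective_sample_size(K, m, rho)
print(f"design effect: {deff:.2f}")
print(f"raw users: {K * m}, effective sample size: {n_eff:.0f}")
```

Under these assumed values, even a small ρ inflates the variance several-fold because clusters are large. How that inflation translates into extra duration depends on how quickly additional clusters or independent observations accrue over time, so a longer test does not buy precision one-for-one when the same users return each day.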
Quick Answer: This question evaluates a data scientist's competency in causal inference and network-aware experiment design, covering randomization under interference, exposure-weighted metric specification, power estimation with intracluster correlation, and cluster-robust analysis.
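To make the contamination checks in item 1 and the exposure weighting in item 2 concrete, here is a minimal sketch on a toy graph. It assumes an undirected friendship edge list, a precomputed user-to-cluster mapping (for example from Louvain), and a set of treated clusters. All names and data are hypothetical.

```python
from collections import defaultdict


def edge_cut_ratio(edges, cluster_of):
    """Share of friendship edges whose endpoints fall in different clusters."""
    if not edges:
        return 0.0
    cut = sum(1 for u, v in edges if cluster_of[u] != cluster_of[v])
    return cut / len(edges)


def treated_friend_fraction(edges, cluster_of, treated_clusters):
    """For each user, the fraction of their friends assigned to a treated cluster."""
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    return {
        user: sum(cluster_of[f] in treated_clusters for f in friends) / len(friends)
        for user, friends in neighbors.items()
    }


# Toy graph: users a, b, c form cluster 0 (treated); d, e form cluster 1 (control).
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("a", "c")]
cluster_of = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1}
treated_clusters = {0}

ratio = edge_cut_ratio(edges, cluster_of)
exposure = treated_friend_fraction(edges, cluster_of, treated_clusters)
print(f"edge-cut ratio: {ratio:.2f}")
for user in sorted(exposure):
    print(user, round(exposure[user], 3))
```

A pre-registered cap on the edge-cut ratio can be checked before launch, and the per-user treated-friend fraction can serve as an exposure weight or as a filter (for example, flagging control users with a high share of treated friends as contaminated). The specific thresholds are a design decision and are not specified here.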