Measure network effects and spillovers via experiments
Company: Meta
Role: Data Scientist
Category: Analytics & Experimentation
Difficulty: hard
Interview Round: Technical Screen
A new feature (e.g., hashtag following or invite‑a‑friend prompts) may generate network spillovers: enabling it for one user can change outcomes for that user's connections. Design an experiment to measure direct and indirect effects under interference:
- Design: Propose a two‑stage randomized design (cluster‑level enablement + individual‑level encouragement) or graph clustering. Define exposure conditions (none, indirect, direct+indirect) and how units transition between them over time.
- Estimands: Precisely define and plan to estimate the Direct Effect, Indirect/Spillover Effect, and Total Effect. State your identifying assumptions and how you’ll check them.
- Power and clustering: How will you form clusters (modularity, size caps, overlap rules), handle cross‑cluster edges, and compute design effects for power?
- Analysis: Specify intent‑to‑treat vs. treatment‑on‑the‑treated estimates, how you’ll handle partial compliance, and which variance estimators are robust to clustering. Describe diagnostics for SUTVA violations and saturation.
- Guardrails and ethics: Define guardrails (abuse, feed health, notification fatigue) and fairness slices (locale, language, degree). Explain how you’ll decide to ship if indirect gains are positive but direct effects are neutral.
- Operational challenges: What if the social graph evolves mid‑experiment, or large influencers create asymmetric spillovers? Propose instrumentation and a reweighting or re‑cluster strategy to keep estimates unbiased.
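One common way to formalize the three estimands named in the Estimands bullet is sketched below. It assumes partial interference (spillovers stay within a cluster) and that a unit's outcome depends only on its own assignment $z$ and its cluster's treated share $s$. The notation $Y_i(z, s)$ is introduced here for illustration and is not part of the question.

```latex
% Assumed notation: Y_i(z, s) is unit i's potential outcome under own
% assignment z \in \{0, 1\} and cluster saturation s \in [0, 1].
% Exposure conditions: none = (0, 0), indirect = (0, s), direct+indirect = (1, s).
\begin{align}
  \mathrm{DE}(s) &= \mathbb{E}\left[ Y_i(1, s) - Y_i(0, s) \right] \\
  \mathrm{IE}(s) &= \mathbb{E}\left[ Y_i(0, s) - Y_i(0, 0) \right] \\
  \mathrm{TE}(s) &= \mathbb{E}\left[ Y_i(1, s) - Y_i(0, 0) \right]
                  = \mathrm{DE}(s) + \mathrm{IE}(s)
\end{align}
```

Under these definitions, a neutral direct effect with a positive indirect effect still yields a positive total effect, which is the ship-decision case raised in the guardrails bullet.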
Quick Answer: This question tests experimental design, causal inference, and network analysis, specifically the ability to define and identify direct, indirect, and total effects under interference using exposure mappings. It belongs to the Analytics & Experimentation domain and pairs a conceptual understanding of identification with the practical use of clustered or graph‑randomized designs. Interviewers use it to probe reasoning about estimands, power and clustering, compliance and variance estimation, diagnostics for SUTVA violations, and the operational and ethical constraints of running experiments on evolving social graphs.
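As a rough illustration of the two‑stage design and the power adjustment the question asks about, the sketch below assigns clusters first and individuals second, then applies the standard equal‑cluster‑size design effect, 1 + (m − 1) × ICC. The function names, parameters, and the `clusters` input format are hypothetical. The sketch also assumes clusters are already formed and that cross‑cluster edges are negligible, so the "indirect" label is nominal; a real analysis would derive exposure from the graph.

```python
import random


def two_stage_assign(clusters, p_cluster=0.5, saturation=0.5, seed=0):
    """Two-stage randomization (illustrative sketch, not a production design).

    clusters: dict mapping cluster id -> list of user ids (assumed input format).
    Stage 1: a fixed share p_cluster of clusters is enabled.
    Stage 2: inside enabled clusters, each user is encouraged with
             probability `saturation`.
    Returns a dict mapping user id -> nominal exposure condition.
    """
    rng = random.Random(seed)
    cluster_ids = sorted(clusters)
    n_enabled = round(p_cluster * len(cluster_ids))
    enabled = set(rng.sample(cluster_ids, n_enabled))

    exposure = {}
    for cid in cluster_ids:
        for uid in clusters[cid]:
            if cid not in enabled:
                exposure[uid] = "none"             # control cluster
            elif rng.random() < saturation:
                exposure[uid] = "direct+indirect"  # encouraged user
            else:
                exposure[uid] = "indirect"         # untreated user in enabled cluster
    return exposure


def design_effect(mean_cluster_size, icc):
    """Variance inflation from clustering, assuming equal cluster sizes."""
    return 1.0 + (mean_cluster_size - 1.0) * icc


def effective_sample_size(n_users, mean_cluster_size, icc):
    """Number of independent users the clustered sample is worth."""
    return n_users / design_effect(mean_cluster_size, icc)


if __name__ == "__main__":
    toy_clusters = {"a": [1, 2, 3], "b": [4, 5], "c": [6, 7, 8, 9]}
    print(two_stage_assign(toy_clusters, seed=1))
    print(design_effect(mean_cluster_size=50, icc=0.02))
```

With a mean cluster size of 50 and an intra‑cluster correlation of 0.02, the design effect is 1 + 49 × 0.02 = 1.98, so the experiment needs roughly twice as many users as an individually randomized test to reach the same power.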