Design a clustered notification experiment with guardrails
Company: Meta
Role: Data Scientist
Category: Analytics & Experimentation
Difficulty: Medium
Interview Round: Technical Screen
You work on a mobile travel app (similar to TripAdvisor) that will test a new push-notification policy recommending nearby attractions. Design a rigorous online experiment that accounts for network effects and includes strong guardrails.
Address the following:
1) Define the primary success metric and justify it over plausible alternatives (e.g., 7-day retained sessions per user vs booking rate vs content views). Specify the exact numerator/denominator and the attribution window.
2) Specify guardrail metrics that must not regress (e.g., uninstall rate, notification unsubscribe rate, spam-report rate). Propose alert thresholds and state whether each test is one-sided or two-sided; explain how you will do sequential monitoring (e.g., alpha-spending functions) to avoid inflating the Type I error rate.
3) Choose the cluster unit for randomization (e.g., city, language-locale, connected components in the follow graph). Explain spillover pathways and tradeoffs between larger vs smaller clusters, unequal cluster sizes, and cross-border travelers.
4) Show how you would power the test under cluster randomization. Define intracluster correlation ρ and average cluster size m, compute the design effect DE = 1 + (m−1)ρ, and illustrate with a numeric example how the required sample size changes vs individual randomization.
5) Describe your randomization and stratification plan (platform, notification eligibility, baseline activity), and how you would validate balance at both user and cluster levels.
6) Lay out the analysis plan: cluster-level difference-in-means vs user-level models with cluster-robust SEs vs mixed-effects with random intercepts; include covariate adjustment (e.g., pre-period outcomes, CUPED). Clarify how you will handle partial exposure and users moving across clusters.
7) Explain how you would detect and quantify spillovers (e.g., two-stage randomization, exposure variables like % of a user’s friends in treatment). How would this change your estimand and analysis?
8) Operational safeguards: staged rollout, early-stop criteria when guardrails breach, and what you monitor in the first 24–48 hours.
Make your answers precise and formula-driven where applicable.
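As an illustration of the kind of formula-driven answer part 4 asks for, the following is a minimal sketch of the design-effect arithmetic. The inputs (outcome standard deviation, minimum detectable lift, cluster size, and ρ) are hypothetical values chosen only for illustration, and the sample-size formula assumes a two-sided z-test on a difference in means with equal-sized clusters.

```python
import math
from statistics import NormalDist


def n_per_arm_individual(delta, sigma, alpha=0.05, power=0.80):
    """Users per arm under individual randomization (two-sided z-test on means)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return 2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2


def design_effect(m, rho):
    """DE = 1 + (m - 1) * rho, for average cluster size m and ICC rho."""
    return 1 + (m - 1) * rho


# Hypothetical inputs, not taken from any real experiment.
sigma = 3.0   # SD of 7-day sessions per user
delta = 0.1   # minimum detectable absolute lift in sessions per user
m = 500       # average users per cluster
rho = 0.01    # intracluster correlation

n_ind = n_per_arm_individual(delta, sigma)   # users per arm, individual randomization
de = design_effect(m, rho)                   # variance inflation from clustering
n_clu = n_ind * de                           # users per arm, cluster randomization
clusters_per_arm = math.ceil(n_clu / m)      # clusters needed per arm

print(f"individual: {n_ind:,.0f} users/arm")
print(f"design effect: {de:.2f}")
print(f"clustered: {n_clu:,.0f} users/arm, i.e. {clusters_per_arm} clusters/arm")
```

With these assumed inputs the design effect is 5.99, so even a small ρ of 0.01 multiplies the required users per arm roughly sixfold. Unequal cluster sizes would inflate the requirement further; a common adjustment replaces m with m(1 + CV²), where CV is the coefficient of variation of cluster size.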
Quick Answer: This question evaluates experimental-design and causal-inference skills for cluster-randomized experiments within the Analytics & Experimentation domain for a Data Scientist role. It covers metric selection, power and design-effect calculations, intracluster correlation, spillover detection, sequential monitoring, guardrails, and operational rollout. Interviewers commonly ask it to see how applicants balance statistical rigor against real-world operational constraints, such as cluster choice, monitoring thresholds, and analysis strategy, and whether they can apply the concepts in production experimentation.
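For the sequential-monitoring part of the question, one possible illustration is the Lan-DeMets O'Brien-Fleming-type alpha-spending function, α(t) = 2(1 − Φ(z_{α/2} / √t)), where t is the fraction of planned information observed so far. The sketch below assumes four equally spaced looks and a total α of 0.05 (both hypothetical choices). It computes only the cumulative and incremental error budget per look; turning that budget into exact stopping boundaries requires numerical integration over the joint distribution of the test statistics, which is not shown here.

```python
from statistics import NormalDist


def obf_alpha_spent(t, alpha=0.05):
    """Cumulative Type I error spent at information fraction t (0 < t <= 1)
    under the Lan-DeMets O'Brien-Fleming-type spending function."""
    if not 0 < t <= 1:
        raise ValueError("information fraction t must be in (0, 1]")
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return 2 * (1 - NormalDist().cdf(z / t ** 0.5))


# Hypothetical monitoring schedule: four equally spaced looks.
looks = [0.25, 0.50, 0.75, 1.00]
cumulative = [obf_alpha_spent(t) for t in looks]
# Alpha newly available at each look is the increase in cumulative spend.
incremental = [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]

for t, c, inc in zip(looks, cumulative, incremental):
    print(f"t={t:.2f}  cumulative alpha={c:.5f}  incremental alpha={inc:.5f}")
```

This function spends very little alpha at early looks and most of it at the final analysis, which suits a primary metric. For guardrails, where early detection of harm matters more than preserving power at the end, a less conservative spending function (for example, a Pocock-type one) may be a better fit.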