PracHub

Design a clustered A/B test with spillovers

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a data scientist's understanding of cluster-randomized experiments with spillovers, covering causal inference under interference, intracluster correlation and power/sample-size calculations, cluster formation and balance checks, staggered-adoption analysis, and metrics and multiple-testing control.


Company: Meta

Role: Data Scientist

Category: Analytics & Experimentation

Difficulty: hard

Interview Round: HR Screen

You need to test a social feature likely to cause network spillovers. You will randomize by geographic market clusters, not by user.

  1) Unit of randomization: Justify cluster-level randomization and specify the estimand (cluster-average treatment effect). Define a contamination scenario that would violate SUTVA if you randomized by user.
  2) Sample size with ICC: Baseline conversion = 10%, target absolute lift = +1 pp, α=0.05 (two-sided), power=0.80. You have 200 clusters per arm with average m=200 users observed per cluster and intracluster correlation ICC=0.06. Compute the design effect DEFF = 1 + (m−1)·ICC and the effective sample size N_eff per arm. Explain how DEFF changes if you halve m but double the number of clusters (holding total users fixed).
  3) Assignment: Describe a principled way to form clusters to minimize cross-cluster edges (e.g., graph partitioning) and how you’d check balance pre-experiment (standardized mean differences, cluster-level covariates).
  4) Gradual change: If adoption ramps gradually across treated clusters, propose an analysis plan (e.g., staggered adoption difference-in-differences with cluster and time fixed effects). State one assumption required for identification and one robustness check.
  5) Guardrails and metrics: Define primary, secondary, and guardrail metrics. Specify how you will handle multiple testing and early stopping.


Oct 13, 2025, 9:49 PM

Cluster-Randomized Experiment for a Social Feature with Spillovers

You are testing a social feature that likely produces network spillovers (peer effects). To limit interference, you will randomize at the geographic market cluster level (not by user).

  1. Unit of Randomization and Estimand
  • Justify cluster-level randomization given spillovers.
  • Specify the estimand as a cluster-average treatment effect (abbreviated CATE here, not to be confused with the conditional average treatment effect) and define it clearly.
  • Describe a realistic contamination scenario that would violate SUTVA if you randomized by user instead of by cluster.
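
One possible formalization of the estimand is sketched below. The notation is an assumption introduced for illustration and is not part of the original prompt: K clusters, n_k users in cluster k, and \bar{Y}_k(z) the mean outcome in cluster k when the whole cluster receives assignment z (1 = feature on, 0 = feature off).

```latex
\tau_{\text{cluster}} = \frac{1}{K}\sum_{k=1}^{K}\left[\bar{Y}_k(1) - \bar{Y}_k(0)\right],
\qquad
\tau_{\text{user}} = \frac{\sum_{k=1}^{K} n_k\left[\bar{Y}_k(1) - \bar{Y}_k(0)\right]}{\sum_{k=1}^{K} n_k}
```

The first version weights every market equally; the second weights markets by their user counts and corresponds to a per-user conversion rate. The two coincide when cluster sizes are equal or when the effect does not vary with cluster size, so an answer should say which one is being targeted.
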
  2. Sample Size with ICC
  • Inputs: baseline conversion p0 = 10%, target absolute lift = +1 pp (p1 = 11%), α = 0.05 (two-sided), power = 0.80.
  • You have 200 clusters per arm, average m = 200 users observed per cluster, and intracluster correlation ICC = 0.06.
  • Compute the design effect DEFF = 1 + (m − 1)·ICC and the effective sample size N_eff per arm.
  • Explain how DEFF changes if you halve m but double the number of clusters (holding total users fixed).
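
The arithmetic in this part can be checked with a short sketch. It uses only the inputs stated above and assumes roughly equal cluster sizes and an unpooled two-proportion normal approximation for the required sample size; the function names are hypothetical, chosen for illustration.

```python
from math import ceil
from statistics import NormalDist


def design_effect(m: float, icc: float) -> float:
    """DEFF = 1 + (m - 1) * ICC, assuming roughly equal cluster sizes."""
    return 1.0 + (m - 1.0) * icc


def effective_n(clusters: int, m: float, icc: float) -> float:
    """Users per arm divided by the design effect."""
    return clusters * m / design_effect(m, icc)


def required_n_per_arm(p0: float, p1: float,
                       alpha: float = 0.05, power: float = 0.80) -> int:
    """Independent users per arm for a two-sided two-proportion z-test
    (unpooled variance, normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    var_sum = p0 * (1 - p0) + p1 * (1 - p1)
    return ceil((z_alpha + z_beta) ** 2 * var_sum / (p1 - p0) ** 2)


# Design from the prompt: 200 clusters per arm, m = 200, ICC = 0.06.
deff_base = design_effect(200, 0.06)
n_eff_base = effective_n(200, 200, 0.06)

# Alternative: halve m, double clusters, same 40,000 users per arm.
deff_alt = design_effect(100, 0.06)
n_eff_alt = effective_n(400, 100, 0.06)

n_required = required_n_per_arm(0.10, 0.11)

print(f"DEFF={deff_base:.2f}, N_eff={n_eff_base:.0f}")
print(f"DEFF={deff_alt:.2f}, N_eff={n_eff_alt:.0f}")
print(f"required independent users per arm: {n_required}")
```

Under these assumptions the stated design gives DEFF = 12.94 and N_eff of about 3,091 per arm, while halving m and doubling the cluster count gives DEFF = 6.94 and N_eff of about 5,764. Both fall well short of the roughly 14,750 independent users per arm that the normal approximation asks for, so the design as given looks underpowered for a +1 pp lift; at m = 200 it would take on the order of 950 clusters per arm.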
  3. Assignment and Balance
  • Describe a principled way to form clusters to minimize cross-cluster edges (e.g., graph partitioning) and to restrict leakage.
  • Explain how you will check pre-experiment balance (e.g., standardized mean differences on cluster-level covariates) and what thresholds you will use.
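
Two quantities from this part are easy to compute once a candidate clustering exists: the share of social edges that cross cluster boundaries (a leakage proxy) and the standardized mean difference (SMD) of a cluster-level covariate between arms. The sketch below is a minimal illustration; the function names, the toy graph, and the baseline conversion rates are all hypothetical.

```python
from math import sqrt
from statistics import fmean, variance


def cross_cluster_edge_fraction(edges, cluster_of):
    """Share of edges whose two endpoints sit in different clusters."""
    cut = sum(1 for u, v in edges if cluster_of[u] != cluster_of[v])
    return cut / len(edges)


def standardized_mean_difference(treated, control):
    """(mean_T - mean_C) / sqrt((var_T + var_C) / 2), using sample variances
    of a cluster-level covariate."""
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2)
    return (fmean(treated) - fmean(control)) / pooled_sd


# Hypothetical friendship edges and a candidate two-cluster partition.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e")]
cluster_of = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1}
leakage = cross_cluster_edge_fraction(edges, cluster_of)  # only c-d crosses

# Hypothetical pre-period conversion rates, one value per cluster.
treated_baseline = [0.10, 0.12, 0.09, 0.11, 0.13]
control_baseline = [0.11, 0.10, 0.12, 0.09, 0.10]
smd = standardized_mean_difference(treated_baseline, control_baseline)

print(f"cross-cluster edge share: {leakage:.2f}")
print(f"SMD on baseline conversion: {smd:.3f}")
```

A common rule of thumb treats |SMD| below 0.1 as acceptable balance, though the threshold is a convention rather than a test. If a draw fails it, re-randomizing within strata of the key covariates is one option to consider before launch.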
  4. Gradual Change / Ramping Adoption
  • If adoption ramps gradually across treated clusters, propose an analysis plan (e.g., staggered adoption difference-in-differences with cluster and time fixed effects). Be explicit about the outcome level (user vs cluster), the fixed effects, and whether you use an intent-to-treat (ITT) specification, a treatment-intensity specification, or both.
  • State one identification assumption and one robustness check.
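
As a rough illustration of the two-way fixed effects idea, the sketch below estimates a single treatment coefficient on a balanced cluster-by-period panel by removing cluster and period means from both the outcome and the treatment indicator, then regressing one on the other. Everything here is an assumption made for illustration: the panel is synthetic and noise-free, the effect is constant, and the function names are hypothetical. A real analysis would use a regression library and cluster-robust standard errors, which this sketch does not compute.

```python
from statistics import fmean


def double_demean(x):
    """Subtract cluster (row) and period (column) means, add back the grand mean.
    Valid as a fixed-effects transform only for a balanced panel."""
    n, periods = len(x), len(x[0])
    row = [fmean(r) for r in x]
    col = [fmean(x[i][t] for i in range(n)) for t in range(periods)]
    grand = fmean(row)
    return [[x[i][t] - row[i] - col[t] + grand for t in range(periods)]
            for i in range(n)]


def twfe_effect(y, d):
    """Coefficient on d in y ~ d + cluster FE + period FE (balanced panel)."""
    y_dd, d_dd = double_demean(y), double_demean(d)
    num = sum(a * b for ry, rd in zip(y_dd, d_dd) for a, b in zip(ry, rd))
    den = sum(b * b for rd in d_dd for b in rd)
    return num / den


# Synthetic panel: 6 clusters, 5 periods. Adoption starts in period 2 or 3
# for treated clusters; None marks never-treated control clusters.
start = [2, 2, 3, None, None, None]
periods = 5
d = [[1.0 if s is not None and t >= s else 0.0 for t in range(periods)]
     for s in start]

true_effect = 0.01  # +1 pp, built into the synthetic outcome below
y = [[0.10 + 0.005 * i + 0.002 * t + true_effect * d[i][t]
      for t in range(periods)]
     for i in range(len(start))]

estimate = twfe_effect(y, d)
print(round(estimate, 6))
```

Replacing the 0/1 indicator with each cluster's adoption share in that period turns the same regression into a treatment-intensity specification. One known caveat: with staggered timing and effects that differ across cohorts or over time, a single two-way fixed effects coefficient can be a poorly weighted average, so an event-study version with leads (to inspect pre-trends) is a natural robustness check.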
  5. Metrics, Multiple Testing, and Early Stopping
  • Define primary, secondary, and guardrail metrics for this experiment.
  • Describe how you will control for multiple testing across metrics and how you will handle early stopping.
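
One way to make the multiple-testing and early-stopping plan concrete is sketched below: Holm step-down adjustment for a family of secondary metrics, and a Lan-DeMets O'Brien-Fleming-type spending function for interim looks. These are named substitutes chosen for illustration, not methods prescribed by the prompt, and the p-values are hypothetical.

```python
from statistics import NormalDist


def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1);
        # the running max keeps adjusted values monotone.
        running_max = max(running_max, (m - rank) * p_values[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted


def obf_alpha_spent(info_fraction, alpha=0.05):
    """Cumulative two-sided alpha spent at a given information fraction
    under the O'Brien-Fleming-type spending function 2 - 2*Phi(z / sqrt(t))."""
    z = NormalDist()
    z_half = z.inv_cdf(1 - alpha / 2)
    return 2.0 * (1.0 - z.cdf(z_half / info_fraction ** 0.5))


# Hypothetical p-values for four secondary metrics.
secondary_p = [0.01, 0.04, 0.03, 0.20]
adjusted = holm_adjust(secondary_p)
rejected = [p <= 0.05 for p in adjusted]

# Cumulative alpha available at 25%, 50%, 75%, and 100% of planned information.
looks = [0.25, 0.50, 0.75, 1.00]
spent = [obf_alpha_spent(t) for t in looks]

print(adjusted)
print(rejected)
print([round(s, 5) for s in spent])
```

A typical arrangement tests the single primary metric at the full α, applies the adjustment across secondary metrics, and evaluates guardrails as one-sided harm checks. The spending function releases very little α at early looks, which keeps most of the power for the final analysis. In a cluster-randomized test the information fraction should be measured in clusters or effective sample size rather than raw users.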
