Cluster-Randomized Experiment for a Social Feature with Spillovers
You are testing a social feature that likely produces network spillovers (peer effects). To limit interference, you will randomize at the geographic market cluster level (not by user).
-
Unit of Randomization and Estimand
-
Justify cluster-level randomization given spillovers.
-
Specify the estimand as a cluster-average treatment effect (abbreviated CATE here; not the conditional average treatment effect that the acronym usually denotes). Define it clearly.
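One possible way to write the estimand down is sketched here. The notation is an assumption and does not come from the prompt: C clusters indexed by c, n_c users in cluster c, and potential outcomes Y_{ic}(z) for user i when the whole cluster is assigned z ∈ {0, 1}.

```latex
\[
\tau_{\text{cluster}}
  = \frac{1}{C}\sum_{c=1}^{C}\bigl[\bar{Y}_c(1) - \bar{Y}_c(0)\bigr],
\qquad
\bar{Y}_c(z) = \frac{1}{n_c}\sum_{i=1}^{n_c} Y_{ic}(z)
\]
```

This version weights every cluster equally. Weighting each cluster by n_c instead gives a user-weighted effect; the two can differ when cluster sizes vary, so the answer should state which one is intended.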
-
Describe a realistic contamination scenario that would violate SUTVA if you randomized by user instead of by cluster.
-
Sample Size with ICC
-
Inputs: baseline conversion p0 = 10%, target absolute lift = +1 pp (p1 = 11%), α = 0.05 (two-sided), power = 0.80.
-
You have 200 clusters per arm, an average of m = 200 users observed per cluster, and an intracluster correlation of ICC = 0.06.
-
Compute the design effect DEFF = 1 + (m − 1)·ICC and the effective sample size per arm, N_eff = (clusters per arm × m) / DEFF.
-
Explain how DEFF changes if you halve m but double the number of clusters (holding total users fixed).
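The arithmetic for this section can be sketched as follows. The function names are illustrative, and the unadjusted sample-size formula (a normal-approximation two-proportion test with unpooled variance) is an assumption added for comparison; the prompt does not prescribe a particular formula.

```python
from statistics import NormalDist


def design_effect(m, icc):
    """DEFF = 1 + (m - 1) * ICC for clusters of average size m."""
    return 1.0 + (m - 1.0) * icc


def effective_n(clusters_per_arm, m, icc):
    """Users per arm divided by the design effect."""
    return clusters_per_arm * m / design_effect(m, icc)


def required_n_independent(p0, p1, alpha=0.05, power=0.80):
    """Per-arm n for a two-sided two-proportion z-test, ignoring clustering.

    Assumption: unpooled-variance normal approximation.
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    variance_sum = p0 * (1.0 - p0) + p1 * (1.0 - p1)
    return (z_alpha + z_beta) ** 2 * variance_sum / (p1 - p0) ** 2


# Baseline design: 200 clusters per arm, m = 200, ICC = 0.06
deff_base = design_effect(200, 0.06)        # 12.94
n_eff_base = effective_n(200, 200, 0.06)    # about 3,091 per arm

# Alternative: halve m, double clusters (40,000 users per arm in both cases)
deff_alt = design_effect(100, 0.06)         # 6.94
n_eff_alt = effective_n(400, 100, 0.06)     # about 5,764 per arm

# Per-arm n needed if users were independent
n_required = required_n_independent(0.10, 0.11)  # about 14,748

print(f"DEFF base={deff_base:.2f}, N_eff base={n_eff_base:.0f}")
print(f"DEFF alt={deff_alt:.2f}, N_eff alt={n_eff_alt:.0f}")
print(f"Required n per arm (no clustering)={n_required:.0f}")
```

Under these assumptions the baseline design's effective sample size falls well short of the unclustered requirement, and trading cluster size for cluster count roughly halves DEFF while total users stay fixed.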
-
Assignment and Balance
-
Describe a principled way to form clusters that minimizes cross-cluster edges (e.g., graph partitioning) and limits treatment leakage across cluster boundaries.
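Whatever partitioning method is chosen, one simple diagnostic is the share of social edges that cross cluster boundaries. The sketch below only scores a given partition; it is not a partitioning algorithm. The function name, the edge list, and the `cluster_of` mapping are hypothetical.

```python
def cross_cluster_edge_fraction(edges, cluster_of):
    """Fraction of edges whose endpoints sit in different clusters.

    edges: iterable of (user_a, user_b) pairs
    cluster_of: dict mapping user id -> cluster id
    """
    edges = list(edges)
    if not edges:
        return 0.0
    crossing = sum(1 for a, b in edges if cluster_of[a] != cluster_of[b])
    return crossing / len(edges)


# Hypothetical toy graph: a path of five users split into two markets
toy_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
toy_clusters = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1}

leakage = cross_cluster_edge_fraction(toy_edges, toy_clusters)
print(leakage)  # 0.25: only the (c, d) edge crosses a boundary
```

A lower fraction suggests less room for spillover between arms; comparing this number across candidate partitions is one way to choose among them.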
-
Explain how you will check pre-experiment balance (e.g., standardized mean differences on cluster-level covariates) and what thresholds you will use.
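A minimal sketch of the balance check, assuming one value per cluster for each covariate. The covariate names and numbers are made up for illustration, and the 0.1 cutoff is a commonly used convention rather than something the prompt specifies.

```python
from math import copysign, inf, sqrt
from statistics import mean, variance


def standardized_mean_difference(treated, control):
    """(mean_T - mean_C) / sqrt((var_T + var_C) / 2), using sample variances."""
    diff = mean(treated) - mean(control)
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2.0)
    if pooled_sd == 0.0:
        # No variation in either arm: balanced only if the means coincide
        return 0.0 if diff == 0 else copysign(inf, diff)
    return diff / pooled_sd


def flag_imbalance(treated_covs, control_covs, threshold=0.1):
    """Return {covariate: (smd, exceeds_threshold)} for each shared covariate."""
    report = {}
    for name in treated_covs:
        smd = standardized_mean_difference(treated_covs[name], control_covs[name])
        report[name] = (smd, abs(smd) > threshold)
    return report


# Hypothetical cluster-level covariates (one value per cluster)
treated = {"baseline_conversion": [0.10, 0.11, 0.09, 0.10],
           "log_active_users": [5.2, 5.4, 5.1, 5.3]}
control = {"baseline_conversion": [0.10, 0.10, 0.11, 0.09],
           "log_active_users": [5.9, 6.1, 5.8, 6.0]}

for name, (smd, flagged) in flag_imbalance(treated, control).items():
    print(f"{name}: SMD={smd:+.2f} flagged={flagged}")
```

Covariates flagged here would be candidates for re-randomization, stratification, or regression adjustment, depending on the pre-registered plan.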
-
Gradual Change / Ramping Adoption
-
If adoption ramps gradually across treated clusters, propose an analysis plan (e.g., staggered-adoption difference-in-differences with cluster and time fixed effects). Be explicit about the outcome level (user vs. cluster), the fixed effects, and whether you use ITT, a treatment-intensity measure, or both.
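As one illustration of what such a plan might specify, a cluster-by-period two-way fixed effects model could be written as below. The symbols are assumptions: Y_{ct} is the cluster-level outcome in period t, α_c and γ_t are cluster and time fixed effects, Z_c is the randomized assignment indicator, Post_{ct} marks periods after cluster c's launch, and A_{ct} is the share of users in cluster c who have adopted by period t.

```latex
\[
\text{ITT:}\quad
Y_{ct} = \alpha_c + \gamma_t + \tau\,(Z_c \cdot \mathrm{Post}_{ct}) + \varepsilon_{ct}
\]
\[
\text{Intensity:}\quad
Y_{ct} = \alpha_c + \gamma_t + \beta\,A_{ct} + \varepsilon_{ct}
\]
```

Standard errors would typically be clustered at the level of randomization. With staggered timing and effects that vary across cohorts or over time, the single coefficient τ from a two-way fixed effects regression can be a poorly weighted average, so an event-study or cohort-specific variant is worth considering. The intensity specification is not protected by randomization in the same way as ITT, because adoption A_{ct} is itself an outcome of treatment.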
-
State one identification assumption and one robustness check.
-
Metrics, Multiple Testing, and Early Stopping
-
Define primary, secondary, and guardrail metrics for this experiment.
-
Describe how you will control for multiple testing across metrics and how you will handle early stopping.
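One option for the multiple-testing part is a Holm step-down adjustment over the secondary metrics, sketched below. The choice of Holm and the p-values are illustrative assumptions; the sketch does not address early stopping, which needs a separate sequential-testing plan.

```python
def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # Multiply the k-th smallest p-value by (m - k + 1), capped at 1
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted p-values never decrease
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values for three secondary metrics
raw = [0.01, 0.04, 0.03]
adj = holm_adjust(raw)
print(adj)  # approximately [0.03, 0.06, 0.06]
```

A metric is declared significant at level α when its adjusted p-value is at most α, which controls the family-wise error rate across the metrics included in the family.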