Compute cluster-aware significance and sequential corrections
Company: TikTok
Role: Data Scientist
Category: Statistics & Math
Difficulty: medium
Interview Round: HR Screen
Consider a creator-level randomized experiment for the tipping UI. Each arm is assigned 10,000 creators, and each creator has on average m = 100 viewer sessions in the analysis window. The viewer-level purchase rate is 5.00% in control and 5.20% in treatment. The intra-cluster correlation of purchase within a creator is ρ = 0.02.
1) Compute the design effect DE = 1 + (m − 1)ρ and the effective viewer sample size per arm; then compute the z-statistic and two-sided p-value using the cluster-robust standard errors implied by DE.
2) If you run 4 interim looks plus a final analysis (5 looks in total), approximate O’Brien–Fleming-style spending of an overall α = 0.05 by giving a conservative per-look α, and contrast it with a naive Bonferroni correction; explain how these choices change power and required duration.
3) With four guardrail metrics, outline a Holm–Bonferroni adjustment and discuss when you would instead report Bayesian posterior intervals with a ROPE (region of practical equivalence) for practical significance.
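The arithmetic in part 1 can be sketched as follows. This is a minimal illustration, not a definitive analysis: it assumes every creator has exactly m = 100 sessions (the question only gives an average), treats each session as one Bernoulli observation, and uses an unpooled two-proportion standard error evaluated at the effective sample size. All variable names are illustrative.

```python
import math

# Inputs taken from the question statement.
creators_per_arm = 10_000
m = 100            # average viewer sessions per creator (assumed constant here)
rho = 0.02         # intra-cluster correlation
p_control = 0.050
p_treatment = 0.052

# Design effect and effective sample size per arm.
design_effect = 1 + (m - 1) * rho
n_raw = creators_per_arm * m
n_eff = n_raw / design_effect

# Unpooled standard error of the difference, using the effective n.
# This equals the naive SE inflated by sqrt(design_effect).
se = math.sqrt(
    p_control * (1 - p_control) / n_eff
    + p_treatment * (1 - p_treatment) / n_eff
)

z = (p_treatment - p_control) / se

# Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
p_value = math.erfc(abs(z) / math.sqrt(2))

print(f"DE = {design_effect:.2f}, n_eff = {n_eff:,.0f}")
print(f"SE = {se:.6f}, z = {z:.2f}, p = {p_value:.2e}")
```

Under these assumptions DE = 2.98, so the one million sessions per arm carry roughly the information of 335,570 independent observations, and the z-statistic lands near 3.72 rather than the naive value of about 6.43 that ignores clustering.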
Quick Answer: This question evaluates competency in clustered randomized experiment analysis, including calculation of design effect and effective sample size, cluster-robust inference for differences in proportions, sequential alpha spending (O’Brien–Fleming-style) and comparisons with Bonferroni, Holm–Bonferroni adjustments for multiple guardrail metrics, and Bayesian ROPE interpretation. It is in the Statistics & Math domain and is commonly asked to probe how candidates handle intra-cluster correlation, control Type I error across interim looks and multiple metrics, and demonstrate both conceptual understanding and practical application of power, duration, and multiplicity trade-offs.
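The multiplicity pieces named above can be sketched in a few lines. This is an illustration under stated assumptions: the 5 looks are taken to be equally spaced in information time, the O’Brien–Fleming-style curve uses the Lan-DeMets spending function α(t) = 2 − 2Φ(z_{α/2}/√t), and the increments of that curve are used as nominal per-look levels, which is conservative by the union bound rather than an exact group-sequential boundary. The guardrail p-values are hypothetical placeholders, not results from the question.

```python
from statistics import NormalDist

norm = NormalDist()
alpha = 0.05
looks = 5  # 4 interim looks + 1 final analysis

# Naive Bonferroni: the same nominal level at every look.
bonferroni_per_look = alpha / looks

def obf_cumulative_alpha(t, alpha=0.05):
    """Lan-DeMets O'Brien-Fleming-type cumulative alpha spent at information fraction t."""
    z_half = norm.inv_cdf(1 - alpha / 2)
    return 2 * (1 - norm.cdf(z_half / t ** 0.5))

# Assumption: looks are equally spaced in information time.
fractions = [k / looks for k in range(1, looks + 1)]
cumulative = [obf_cumulative_alpha(t, alpha) for t in fractions]

# Incremental alpha per look; using these as nominal levels is conservative,
# because the increments sum to the overall alpha.
increments = [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]

def holm_reject(p_values, alpha=0.05):
    """Holm-Bonferroni step-down: returns reject decisions in the input order."""
    k = len(p_values)
    order = sorted(range(k), key=lambda i: p_values[i])
    reject = [False] * k
    for rank, idx in enumerate(order):
        # Smallest p-value is tested at alpha/k, the next at alpha/(k-1), and so on.
        if p_values[idx] <= alpha / (k - rank):
            reject[idx] = True
        else:
            break  # stop at the first non-rejection
    return reject

# Hypothetical guardrail p-values, for illustration only.
guardrail_p = [0.004, 0.030, 0.015, 0.200]
holm_decisions = holm_reject(guardrail_p, alpha)

print(f"Bonferroni per look: {bonferroni_per_look:.4f}")
for t, inc in zip(fractions, increments):
    print(f"t = {t:.1f}: incremental alpha = {inc:.6f}")
print("Holm decisions:", holm_decisions)
```

The contrast is visible in the printed levels: Bonferroni spends 0.01 at every look, while the O’Brien–Fleming-style schedule spends almost nothing at the first look and most of the budget at the final one, so the final analysis loses less power relative to a single fixed-horizon test.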