PracHub

Compute cluster-aware significance and sequential corrections

Last updated: Mar 29, 2026

Quick Overview

This question tests analysis of a cluster-randomized experiment: computing the design effect and effective sample size, running cluster-robust inference on a difference in proportions, spending α across sequential looks (O’Brien–Fleming-style, compared with Bonferroni), applying Holm–Bonferroni adjustments to multiple guardrail metrics, and interpreting a Bayesian ROPE. It belongs to the Statistics & Math domain and probes how candidates handle intra-cluster correlation, control Type I error across interim looks and multiple metrics, and reason about the trade-offs among power, duration, and multiplicity.


Company: TikTok

Role: Data Scientist

Category: Statistics & Math

Difficulty: medium

Interview Round: HR Screen



Oct 13, 2025, 9:49 PM

Cluster-Randomized Tipping UI Experiment: Power, Sequential Testing, and Multiplicity

Context: A creator-level (cluster) randomized experiment evaluates a tipping UI. Creators are clusters; viewers are units within clusters. The outcome is a binary purchase at the viewer-session level.

Given:

  • Per arm: 10,000 creators (clusters)
  • Average viewer sessions per creator: m = 100
  • Viewer-level purchase rate: control p_c = 5.00%, treatment p_t = 5.20%
  • Intra-cluster (creator) correlation of purchase: ρ = 0.02

Tasks:

  1. Compute the design effect DE = 1 + (m − 1)ρ and the effective viewer-level sample size per arm. Then compute the z-statistic and two-sided p-value for the difference in proportions using cluster-robust standard errors implied by DE.
  2. Suppose you plan 4 interim looks plus a final analysis (5 looks total). Provide an approximate O’Brien–Fleming-style spending schedule for overall α = 0.05 by giving conservative per-look two-sided α thresholds (assume equally spaced looks). Contrast this with a naive Bonferroni correction. Explain how these choices affect power and required duration.
  3. With four guardrail metrics, outline a Holm–Bonferroni adjustment procedure. Discuss when you might instead report Bayesian posterior intervals with a ROPE (Region of Practical Equivalence) to focus on practical significance.
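
Task 1 can be sketched numerically as follows. This is a minimal sketch rather than a reference solution: it assumes every creator has exactly m sessions (equal cluster sizes), the same design effect in both arms, an unpooled variance for the difference in proportions, and a normal approximation. The function name `cluster_adjusted_ztest` is hypothetical.

```python
import math


def cluster_adjusted_ztest(p_c, p_t, clusters_per_arm, m, rho):
    """Two-proportion z-test with the variance inflated by the design effect.

    Assumes equal cluster sizes and the same design effect in both arms.
    """
    design_effect = 1 + (m - 1) * rho
    n_raw = clusters_per_arm * m       # viewer sessions per arm
    n_eff = n_raw / design_effect      # effective sample size per arm
    # Unpooled standard error of the difference, using n_eff instead of n_raw.
    se = math.sqrt(p_c * (1 - p_c) / n_eff + p_t * (1 - p_t) / n_eff)
    z = (p_t - p_c) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return design_effect, n_eff, z, p_value


de, n_eff, z, p_value = cluster_adjusted_ztest(
    p_c=0.05, p_t=0.052, clusters_per_arm=10_000, m=100, rho=0.02
)
print(f"DE={de:.2f}, n_eff={n_eff:,.0f}, z={z:.2f}, p={p_value:.1e}")
```

With the numbers given, DE = 2.98, so the effective sample size is about 335,570 sessions per arm, roughly a third of the raw 1,000,000. The z-statistic is about 3.7, compared with about 6.4 if clustering were ignored.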
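
For Task 2, one possible approximation is sketched below. The question does not say which approximation is intended, so the choices here are assumptions: the Lan-DeMets O’Brien–Fleming-type spending function α(t) = 2(1 − Φ(z_{α/2}/√t)) gives the cumulative α at information fraction t, and the α newly spent at each look is used directly as that look's nominal two-sided threshold. That is conservative (a union bound that ignores the correlation between looks); exact group-sequential boundaries need numerical integration and are not computed here. The function name `obf_style_increments` is hypothetical.

```python
import math
from statistics import NormalDist


def obf_style_increments(n_looks, alpha=0.05):
    """Cumulative and per-look alpha for equally spaced looks.

    Uses the O'Brien-Fleming-type spending function
    alpha(t) = 2 * (1 - Phi(z_{alpha/2} / sqrt(t))).
    """
    z_half = NormalDist().inv_cdf(1 - alpha / 2)
    cumulative = []
    for k in range(1, n_looks + 1):
        t = k / n_looks  # information fraction at look k
        # erfc(x / sqrt(2)) equals 2 * (1 - Phi(x)) and stays accurate in the tail.
        cumulative.append(math.erfc(z_half / math.sqrt(t) / math.sqrt(2)))
    # Alpha newly spent at each look; used here as a conservative nominal threshold.
    increments = [cumulative[0]] + [
        later - earlier for earlier, later in zip(cumulative, cumulative[1:])
    ]
    return cumulative, increments


cumulative, increments = obf_style_increments(n_looks=5, alpha=0.05)
bonferroni = 0.05 / 5  # naive Bonferroni: the same threshold at every look

for look, (cum, inc) in enumerate(zip(cumulative, increments), start=1):
    print(f"look {look}: cumulative={cum:.5f}  per-look={inc:.5f}  bonferroni={bonferroni:.5f}")
```

Under these assumptions the schedule spends almost nothing at the first look and keeps roughly 0.02 for the final analysis, whereas Bonferroni applies 0.01 at every look. Early stopping is therefore harder under the O’Brien–Fleming-style schedule, but the final analysis loses less power relative to a fixed-horizon test.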
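
The Holm–Bonferroni step-down procedure in Task 3 can be sketched as below. The four p-values are hypothetical placeholders, not results from the experiment, and the function name `holm_bonferroni` is illustrative.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return reject/keep decisions in the original order of p_values."""
    m = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is compared with alpha / (m - k + 1).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # step-down: stop at the first failure, keep all remaining nulls
    return reject


guardrail_p = [0.003, 0.04, 0.012, 0.20]  # hypothetical p-values for four guardrails
decisions = holm_bonferroni(guardrail_p, alpha=0.05)
print(decisions)  # [True, False, True, False]
```

In this hypothetical example, plain Bonferroni at 0.05 / 4 = 0.0125 would flag only the first metric, while Holm also flags the third (0.012 ≤ 0.05 / 3) and still controls the family-wise error rate at 0.05.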

