Test conversion difference and adjust for clustering

Q: Test conversion difference and adjust for clustering

This question evaluates proficiency in statistical inference for A/B testing: estimating and comparing conversion proportions, conducting two-sided hypothesis tests, adjusting for day-level clustering using ICC and design-effect corrections, and performing power and sample-size calculations. It belongs to the Statistics & Math domain for a Data Scientist role and combines conceptual understanding with practical application. It is commonly asked to assess an interviewee's ability to interpret conversion uplift under realistic experimental constraints, account for intra-cluster correlation when estimating effective sample sizes and uncertainty, and reason about experiment duration and robustness checks.

Question

Using the aggregated results for the 7‑day window 2025‑08‑26 to 2025‑09‑01, evaluate the statistical significance and power of the conversion uplift, accounting for day‑level clustering. Given totals: Control (C): visits n_C = 10,240, bookings x_C = 308; Treatment (T): visits n_T = 10,180, bookings x_T = 351.

Point estimates: compute p_C, p_T, absolute lift (p_T − p_C, in percentage points) and relative lift.
Significance: perform a two‑sided test for difference in proportions (unpooled standard error). Report z, p‑value, and a 95% CI for (p_T − p_C). State any continuity correction you apply.
Clustering: adjust for day‑level clustering with ICC=0.01 and 7 days per variant. Use design effect DE = 1 + (\bar{m} − 1)·ICC where \bar{m} = n_variant / 7. Recompute effective sample sizes n_eff = n / DE and provide an adjusted p‑value/CI. Explain assumptions and limitations of this correction.
Power and sample size: What total visits per variant are required to detect a 0.30 percentage‑point absolute lift from a 3.00% baseline at 80% power and alpha=0.05 using an unpooled z‑test? Show the formula and final n per variant. Then recompute with the design effect from ICC=0.01 to give a clustered n per variant and the implied experiment duration if each variant receives 2,000,000 visits/day.
Robustness: briefly describe how you would check day‑to‑day heterogeneity (e.g., Q‑test or interaction with weekday) and how that influences the decision to launch.
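
The point-estimate, significance, clustering, and sample-size parts above can be sketched in a few lines of Python. This is a minimal illustration, not a reference solution: the function names (diff_test, design_effect, n_per_variant) are made up for this example, no continuity correction is applied, and the clustering step simply swaps n for n_eff = n / DE in the unpooled standard error, as the question prescribes. Only the standard library is used.

```python
import math
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def diff_test(p_c, n_c, p_t, n_t, alpha=0.05):
    """Two-sided unpooled z-test for p_t - p_c (no continuity correction)."""
    diff = p_t - p_c
    se = math.sqrt(p_c * (1 - p_c) / n_c + p_t * (1 - p_t) / n_t)
    z = diff / se
    p_value = 2 * (1 - Z.cdf(abs(z)))
    half_width = Z.inv_cdf(1 - alpha / 2) * se
    return {"diff": diff, "z": z, "p_value": p_value,
            "ci": (diff - half_width, diff + half_width)}


def design_effect(n, clusters, icc):
    """DE = 1 + (m_bar - 1) * ICC, with m_bar = n / clusters."""
    m_bar = n / clusters
    return 1 + (m_bar - 1) * icc


def n_per_variant(p_base, abs_lift, alpha=0.05, power=0.80):
    """Unclustered n per variant for an unpooled two-sided z-test."""
    p_alt = p_base + abs_lift
    z_sum = Z.inv_cdf(1 - alpha / 2) + Z.inv_cdf(power)
    var_sum = p_base * (1 - p_base) + p_alt * (1 - p_alt)
    return math.ceil(z_sum ** 2 * var_sum / abs_lift ** 2)


# Totals from the question
n_c, x_c = 10_240, 308
n_t, x_t = 10_180, 351
p_c, p_t = x_c / n_c, x_t / n_t
rel_lift = (p_t - p_c) / p_c

# Naive test that treats every visit as independent
naive = diff_test(p_c, n_c, p_t, n_t)

# Clustered version: shrink each arm to its effective sample size
n_eff_c = n_c / design_effect(n_c, clusters=7, icc=0.01)
n_eff_t = n_t / design_effect(n_t, clusters=7, icc=0.01)
clustered = diff_test(p_c, n_eff_c, p_t, n_eff_t)

# Unclustered sample size: 3.00% baseline, 0.30 pp absolute lift
n_req = n_per_variant(0.03, 0.003)

print(f"p_C={p_c:.4%}  p_T={p_t:.4%}  relative lift={rel_lift:.1%}")
print("naive:", naive)
print(f"n_eff: C={n_eff_c:.1f}  T={n_eff_t:.1f}")
print("clustered:", clustered)
print("unclustered n per variant:", n_req)
```

Under these assumptions the naive test gives z of roughly 1.78 (two-sided p near 0.075), while the clustered version drops to z of roughly 0.45 because each arm shrinks to about 656 effective visits. The unclustered requirement comes out at about 53,208 visits per variant. The clustered sample size is left out of the sketch on purpose: the design effect depends on the cluster size, which in turn depends on how many days the experiment runs, so that step needs an explicit assumption about duration.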
