PracHub

Compute sample size and plan experiment

Last updated: Mar 29, 2026

Quick Overview

This question tests experimental design and applied statistics: sample size and power calculations, variance reduction (CUPED), clustering and design-effect adjustments, interim analysis with O'Brien–Fleming alpha spending, multiple-testing control (Benjamini–Hochberg), and causal estimands such as ITT versus CACE. Interviewers ask it to check that a practitioner can turn a business goal into a rigorous experiment plan that balances Type I and Type II error, multiplicity, noncompliance, and operational constraints. It belongs to the Statistics & Math domain and emphasizes practical application grounded in conceptual understanding.


Company: Disney

Role: Data Scientist

Category: Statistics & Math

Difficulty: hard

Interview Round: HR Screen

Oct 13, 2025, 9:49 PM

A/B Test Planning: Paywall Copy Change for New Signups

Context

You are planning an A/B test to evaluate a paywall copy change that targets new signups, with the objective of improving the next-day subscription start rate. The experiment will use equal allocation and two-sided testing.

Given:

  • Baseline next-day subscription start rate among new signups: 18%.
  • Minimum detectable effect (MDE): +7% relative, i.e., a lift from 18% to 19.26% (+1.26 percentage points absolute).
  • Two-sided α = 0.05, power = 0.80, equal allocation.
  • Optional CUPED variance reduction using pre-experiment engagement with R² = 0.25.
  • If randomizing by household: average household size m = 1.8 (signups per household), ICC = 0.06.
  • Up to 4 interim looks with O'Brien–Fleming (OBF) alpha-spending.
  • There are 12 secondary metrics; control FDR at 10% using Benjamini–Hochberg (BH).
  • Noncompliance: 10% of users assigned to treatment do not actually see the new copy; 3% of control users are exposed to it due to caching.

Tasks

  1. Compute the per-arm sample size ignoring variance reduction and clustering. Show formulas and approximations used.
  2. With CUPED (R² = 0.25), what is the effective sample size reduction? Recompute the required per-arm sample.
  3. Adjust for household clustering via the design effect DE = 1 + (m − 1) × ICC. Recompute the per-arm sample size under clustering (with and without CUPED).
  4. Describe how O'Brien–Fleming boundaries alter Type I error allocation and the practical implications for timeline/power.
  5. State how you would control FDR at 10% across 12 secondary metrics and interpret discoveries.
  6. Compute the ITT vs CACE effect given the noncompliance rates (assume monotonicity). How would you report both responsibly to product stakeholders?
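
For tasks 1 to 3, one common approach is the unpooled normal approximation for two proportions, n per arm = (z_{1−α/2} + z_{1−β})² × [p1(1−p1) + p2(1−p2)] / δ², followed by two multiplicative adjustments. The sketch below is a minimal illustration, not an official solution. It assumes that CUPED scales the required sample by (1 − R²), that the design effect multiplies it by DE, and that the two adjustments combine multiplicatively. The function and variable names are illustrative.

```python
from math import ceil
from statistics import NormalDist


def per_arm_n(p1, rel_lift, alpha=0.05, power=0.80):
    """Per-arm sample size for a two-sided, two-proportion z-test
    (unpooled normal approximation, equal allocation)."""
    p2 = p1 * (1 + rel_lift)          # target rate under the MDE
    delta = p2 - p1                   # absolute effect to detect
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    var_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return (z_alpha + z_beta) ** 2 * var_sum / delta ** 2


# Inputs taken from the "Given" list.
n_base = per_arm_n(p1=0.18, rel_lift=0.07)

# Assumption: CUPED removes a share R^2 of the outcome variance,
# so the required sample scales by (1 - R^2).
cuped_factor = 1 - 0.25

# Design effect for household randomization: DE = 1 + (m - 1) * ICC.
design_effect = 1 + (1.8 - 1) * 0.06

plan = {
    "no adjustment": n_base,
    "CUPED only": n_base * cuped_factor,
    "clustering only": n_base * design_effect,
    "CUPED + clustering": n_base * cuped_factor * design_effect,
}

for label, n in plan.items():
    print(f"{label}: {ceil(n):,} users per arm")
```

With these inputs the unadjusted requirement comes out at roughly 15,000 users per arm; CUPED cuts it by 25%, and household clustering inflates it by about 4.8%. Under household randomization the figures are still counts of users, so dividing by m = 1.8 gives an approximate number of households per arm.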

Solution
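
The page does not show a worked solution. As a hedged sketch for tasks 4 to 6, the block below illustrates three standard tools: the Lan-DeMets O'Brien–Fleming-type spending function α(t) = 2[1 − Φ(z_{1−α/2}/√t)], the Benjamini–Hochberg step-up rule, and the Wald (instrumental-variable) ratio CACE = ITT / (exposure rate in treatment − exposure rate in control). Several inputs are assumptions made only for illustration: four equally spaced analyses with the last one final, a made-up list of 12 p-values, and an ITT equal to the planned 1.26-point MDE. All function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def obf_alpha_spent(info_fraction, alpha=0.05):
    """Cumulative two-sided alpha spent at a given information fraction
    under the Lan-DeMets O'Brien-Fleming-type spending function."""
    z = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return 2 * (1 - STD_NORMAL.cdf(z / sqrt(info_fraction)))


def benjamini_hochberg(pvalues, q=0.10):
    """Return one reject flag per p-value using the BH step-up rule."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * q / m:
            k_max = rank              # largest rank meeting its threshold
    reject = [False] * m
    for i in order[:k_max]:
        reject[i] = True
    return reject


def cace_from_itt(itt, exposed_if_treated, exposed_if_control):
    """Wald estimator: scale the ITT effect by the exposure-rate gap.
    Relies on monotonicity and on assignment affecting the outcome
    only through exposure."""
    return itt / (exposed_if_treated - exposed_if_control)


# Assumption: 4 equally spaced analyses, the last being the final one.
looks = [0.25, 0.50, 0.75, 1.00]
cumulative_alpha = [obf_alpha_spent(t) for t in looks]

# Hypothetical p-values for the 12 secondary metrics (illustration only).
example_pvalues = [0.001, 0.004, 0.012, 0.020, 0.030, 0.045,
                   0.060, 0.110, 0.200, 0.350, 0.600, 0.900]
discoveries = benjamini_hochberg(example_pvalues, q=0.10)

# Hypothetical ITT equal to the planned MDE; exposure rates from the givens.
example_cace = cace_from_itt(0.0126, exposed_if_treated=0.90,
                             exposed_if_control=0.03)

for t, a in zip(looks, cumulative_alpha):
    print(f"information fraction {t:.2f}: cumulative alpha {a:.5f}")
print(f"BH discoveries: {sum(discoveries)} of {len(example_pvalues)}")
print(f"CACE for a 1.26-point ITT: {example_cace * 100:.2f} points")
```

The spending function leaves almost no alpha for the early looks, so stopping early requires a very large observed effect, while the final analysis keeps a critical value close to the fixed-sample one. BH discoveries are best read as a set in which about 10% are expected to be false, not as individually confirmed effects. The ITT describes what shipping the change would deliver under similar exposure rates; the CACE applies only to users whose exposure is changed by assignment.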

© 2026 PracHub. All rights reserved.