PracHub

Compute sample sizes and error control

Last updated: Mar 29, 2026

Quick Overview

This question tests experimental design and applied inferential statistics for a Data Scientist role in the Statistics & Math domain: sample size calculation for means and proportions, non-inferiority testing, multiple-comparison error control, cluster-randomized design effects, and sequential monitoring boundaries. Interviewers use it to see whether a candidate can apply the formulas and error-control principles under practical constraints, balancing power against type I error while accounting for clustering and interim looks.

Company: SIG (Susquehanna)

Role: Data Scientist

Category: Statistics & Math

Difficulty: Medium

Interview Round: Technical Screen

Oct 13, 2025, 9:49 PM

Using the Biker experiment context, compute required sample sizes and describe error control under practical constraints. Show formulas and numeric answers where possible.

Assumptions:

  • Two-armed test (control vs Biker exposure).
  • Two-sided alpha unless stated otherwise.
  1. Primary mean metric: Baseline mean delivery time = 42 min, SD = 15 min. Target relative improvement = −5%. Alpha = 0.05 (two-sided), power = 0.80. Compute per-arm sample size for a two-sample t-test on means.
  2. Guardrail proportion metric: Baseline cancellation rate = 6%. You require non-inferiority with margin +0.5 percentage points (i.e., Biker cancel rate ≤ 6.5%). One-sided alpha = 0.05, power = 0.80. Compute per-arm sample size for a non-inferiority test on proportions.
  3. Multiple metrics: You have 1 primary, 2 guardrails (cancellations, ETA accuracy), and 1 secondary (orders per courier-hour). Propose and justify an error-control approach (e.g., Bonferroni, Holm/Hochberg, gatekeeping/HMP). State the effective alpha for each family and how you’d report adjusted CIs.
  4. Cluster randomization: You randomize at the zone-day level with average m = 300 orders per cluster and ICC = 0.03 for the primary metric. Compute the design effect DE and the adjusted per-arm sample size. How many zone-days per arm are needed to reach that sample size?
  5. Sequential monitoring: You plan 4 equally spaced looks with O’Brien–Fleming spending. Explain qualitatively how early critical values differ from the final look and how this affects runtime/MDE. Provide the final-look alpha spending approximation and how you’d implement boundary checks in practice.
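
The numeric parts of the question can be sketched in a few lines of Python. This is a minimal sketch, not a reference solution, and it rests on several assumptions that are not stated in the question: normal (z) quantiles stand in for t quantiles in part 1; the non-inferiority calculation in part 2 assumes both arms truly share the 6% rate and uses the unpooled variance; part 4 assumes equal cluster sizes; and part 5 uses the Lan-DeMets O'Brien-Fleming-type spending function as an approximation to the exact boundaries. All function names are hypothetical, and the p-values passed to the Holm helper in any usage would be illustrative inputs, not results.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def n_per_arm_means(sd, delta, alpha=0.05, power=0.80):
    """Per-arm n for a two-sided two-sample test on means (normal approximation)."""
    z_a = Z.inv_cdf(1 - alpha / 2)
    z_b = Z.inv_cdf(power)
    return ceil(2 * (sd * (z_a + z_b) / delta) ** 2)


def n_per_arm_noninferiority(p, margin, alpha=0.05, power=0.80):
    """Per-arm n for a one-sided non-inferiority test on proportions.

    Assumes the true difference is zero (both arms at rate p) and an
    unpooled variance of 2 * p * (1 - p).
    """
    z_a = Z.inv_cdf(1 - alpha)
    z_b = Z.inv_cdf(power)
    return ceil((z_a + z_b) ** 2 * 2 * p * (1 - p) / margin ** 2)


def design_effect(m, icc):
    """Design effect for cluster randomization with equal cluster size m."""
    return 1 + (m - 1) * icc


def holm_reject(p_values, alpha=0.05):
    """Holm step-down: one reject/keep decision per hypothesis, in input order."""
    k = len(p_values)
    order = sorted(range(k), key=lambda i: p_values[i])
    reject = [False] * k
    for rank, i in enumerate(order):
        if p_values[i] <= alpha / (k - rank):
            reject[i] = True
        else:
            break  # once one test fails, all larger p-values are kept
    return reject


def obf_cumulative_alpha(t, alpha=0.05):
    """Lan-DeMets O'Brien-Fleming-type cumulative two-sided alpha at information fraction t."""
    return 2 * (1 - Z.cdf(Z.inv_cdf(1 - alpha / 2) / sqrt(t)))


# Part 1: 5% of a 42-minute baseline is a 2.1-minute difference, SD = 15.
n_primary = n_per_arm_means(sd=15, delta=0.05 * 42)

# Part 2: baseline 6% cancellation rate, margin of 0.5 percentage points.
n_guardrail = n_per_arm_noninferiority(p=0.06, margin=0.005)

# Part 4: inflate the primary-metric n by the design effect, then convert to clusters.
de = design_effect(m=300, icc=0.03)
n_clustered = ceil(n_primary * de)
zone_days = ceil(n_clustered / 300)

# Part 5: cumulative alpha spent at each of 4 equally spaced looks.
spend = [obf_cumulative_alpha(k / 4) for k in range(1, 5)]

print(n_primary, n_guardrail, de, n_clustered, zone_days)
print(spend)
```

Under these assumptions the sketch gives about 801 orders per arm for the primary metric, about 27,896 per arm for the non-inferiority guardrail, a design effect of 9.97, roughly 7,986 orders per arm after the clustering adjustment, and 27 zone-days per arm. The spending list shows the qualitative O'Brien-Fleming pattern: almost no alpha is spent at the first look, and the cumulative total reaches 0.05 only at the final look.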
