PracHub

Formulate hypotheses and compute AB test significance

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a data scientist's competency in experimental design and statistical inference for A/B testing, covering hypothesis formulation, difference-in-proportions testing and confidence intervals, guardrail analysis, multiple-testing correction, interim alpha-spending approaches, and variance-reduction techniques such as CUPED.

Company: Uber

Role: Data Scientist

Category: Statistics & Math

Difficulty: hard

Interview Round: Technical Screen


Oct 13, 2025, 9:49 PM

A/B Test Snapshot: Pickup ETA Card Experiment

You are analyzing a 7-day A/B test with equal allocation. Each request is an exposure; the primary outcome is completion per request. Two guardrails monitor safety/experience. Assume independent observations and large-sample approximations are acceptable.

Data (7-day snapshot):

  • Primary metric (trip completion rate per request):
    • Control A: nA = 50,000 requests, cA = 6,000 completions
    • Treatment B: nB = 50,000 requests, cB = 6,420 completions
  • Guardrail 1 (rider cancel rate per request):
    • Control A: cancelsA = 4,500 (out of nA = 50,000 requests)
    • Treatment B: cancelsB = 4,950 (out of nB = 50,000 requests)
  • Guardrail 2 (wait time, minutes per request):
    • A: meanA = 4.8, sdA = 3.2, nA = 50,000
    • B: meanB = 4.7, sdB = 3.4, nB = 50,000
  • There were 5 interim looks at equally spaced information times with no pre-registered alpha spending.
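
Before any testing, it can help to turn the snapshot into observed rates and raw differences. The following is a minimal Python sketch of that step. The variable names are illustrative and not part of the original question, and it assumes the cancel counts share the same 50,000-request denominators as the primary metric.

```python
# Counts copied from the data list above; names are illustrative assumptions.
n_a = n_b = 50_000  # assumed to be the denominator for both rate metrics

completions = {"A": 6_000, "B": 6_420}
cancels = {"A": 4_500, "B": 4_950}
wait_mean = {"A": 4.8, "B": 4.7}  # minutes


def rate(count: int, n: int) -> float:
    """Observed proportion of requests with the event."""
    return count / n


completion_rate = {"A": rate(completions["A"], n_a), "B": rate(completions["B"], n_b)}
cancel_rate = {"A": rate(cancels["A"], n_a), "B": rate(cancels["B"], n_b)}

# Differences are always treatment minus control (B - A).
abs_lift = completion_rate["B"] - completion_rate["A"]
rel_lift = abs_lift / completion_rate["A"]
cancel_diff = cancel_rate["B"] - cancel_rate["A"]
wait_diff = wait_mean["B"] - wait_mean["A"]

print(f"completion: {completion_rate['A']:.4f} -> {completion_rate['B']:.4f} "
      f"(abs {abs_lift:+.4f}, rel {rel_lift:+.1%})")
print(f"cancel:     {cancel_rate['A']:.4f} -> {cancel_rate['B']:.4f} (abs {cancel_diff:+.4f})")
print(f"wait time:  {wait_mean['A']:.1f} -> {wait_mean['B']:.1f} min (diff {wait_diff:+.1f})")
```

Under these assumptions the completion rate moves from 12.00% to 12.84%, the cancel rate from 9.0% to 9.9%, and the mean wait drops by 0.1 minutes. Whether any of these differences is distinguishable from noise is what the tasks ask you to establish.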

Tasks:

  1. State precise H0 and H1 for the primary metric; specify one- vs. two-sided and justify.
  2. Choose the appropriate test for the primary metric (difference in proportions) and compute: test statistic, p-value, and a 95% CI for the lift. Show formulas and numeric results.
  3. For Guardrail 2 (mean wait time), select the correct test (e.g., Welch’s t-test) and compute the 95% CI of the mean difference. State any distributional assumptions and why Welch vs. pooled.
  4. Perform a multiple-testing correction across the three outcomes (Primary, Guardrail 1, Guardrail 2) using Holm–Bonferroni at familywise α = 0.05. Identify which effects remain significant.
  5. Explain, in plain language, what the p-value you computed in (2) does and does not mean.
  6. Given the five unplanned interim looks, re-evaluate significance using a simple Pocock or O’Brien–Fleming alpha-spending approach (outline the approach and provide an approximate adjusted conclusion; exact boundaries are not required, but justify your decision).
  7. If pre-period completion rate per rider has correlation r = 0.40 with the in-experiment outcome, estimate the approximate variance reduction from CUPED and discuss how that would change required sample size or interpretation.
  8. Conclude: ship, iterate, or stop? Defend your decision considering the guardrails.
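
The numeric parts of tasks 2, 3, 4, 6, and 7 can be sketched with the Python standard library alone. This is one possible approach under stated assumptions, not an official solution: the function names are hypothetical, the z-test uses a pooled standard error for the statistic and an unpooled one for the confidence interval, the Welch reference distribution is approximated by the normal (reasonable here because the degrees of freedom are close to 100,000), and the interim-look check uses a Bonferroni bound over 5 looks as a conservative stand-in for Pocock or O’Brien–Fleming boundaries, which would normally come from a table or a group-sequential library. The CUPED line assumes the usual result that the variance shrinks by a factor of 1 − r².

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def two_prop_ztest(x_a, n_a, x_b, n_b, conf=0.95):
    """Two-sided z-test for p_b - p_a (pooled SE) plus an unpooled Wald CI."""
    p_a, p_b = x_a / n_a, x_b / n_b
    diff = p_b - p_a
    p_pool = (x_a + x_b) / (n_a + n_b)
    se_pool = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    se_unpooled = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z = diff / se_pool
    p_value = 2 * Z.cdf(-abs(z))
    half = Z.inv_cdf(0.5 + conf / 2) * se_unpooled
    return z, p_value, (diff - half, diff + half)


def welch_test(mean_a, sd_a, n_a, mean_b, sd_b, n_b, conf=0.95):
    """Welch statistic for mean_b - mean_a; normal approximation for p-value and CI."""
    v_a, v_b = sd_a**2 / n_a, sd_b**2 / n_b
    se = sqrt(v_a + v_b)
    diff = mean_b - mean_a
    # Welch-Satterthwaite degrees of freedom, reported to justify the normal approximation.
    df = (v_a + v_b) ** 2 / (v_a**2 / (n_a - 1) + v_b**2 / (n_b - 1))
    t = diff / se
    p_value = 2 * Z.cdf(-abs(t))
    half = Z.inv_cdf(0.5 + conf / 2) * se
    return t, df, p_value, (diff - half, diff + half)


def holm(p_values, alpha=0.05):
    """Holm step-down procedure: map each hypothesis name to reject (True/False)."""
    ordered = sorted(p_values.items(), key=lambda kv: kv[1])
    m = len(ordered)
    decisions, still_rejecting = {}, True
    for rank, (name, p) in enumerate(ordered):
        # Once one hypothesis fails its threshold, all larger p-values are retained.
        still_rejecting = still_rejecting and p <= alpha / (m - rank)
        decisions[name] = still_rejecting
    return decisions


# Tasks 2 and 3, plus the cancel-rate guardrail needed for task 4.
z_primary, p_primary, ci_primary = two_prop_ztest(6_000, 50_000, 6_420, 50_000)
z_cancel, p_cancel, ci_cancel = two_prop_ztest(4_500, 50_000, 4_950, 50_000)
t_wait, df_wait, p_wait, ci_wait = welch_test(4.8, 3.2, 50_000, 4.7, 3.4, 50_000)

# Task 4: Holm-Bonferroni across the three outcomes.
holm_result = holm({"primary": p_primary, "cancel": p_cancel, "wait": p_wait})

# Task 6 (conservative stand-in, NOT a Pocock/OBF boundary): Bonferroni over 5 looks.
looks = 5
z_crit_looks = Z.inv_cdf(1 - 0.05 / (2 * looks))
primary_survives_looks = abs(z_primary) > z_crit_looks

# Task 7: CUPED with correlation r removes roughly r^2 of the variance.
r = 0.40
variance_reduction = r**2
se_factor = sqrt(1 - r**2)

print(f"primary: z={z_primary:.2f}, p={p_primary:.2e}, CI={ci_primary}")
print(f"cancel:  z={z_cancel:.2f}, p={p_cancel:.2e}, CI={ci_cancel}")
print(f"wait:    t={t_wait:.2f}, df={df_wait:.0f}, p={p_wait:.2e}, CI={ci_wait}")
print("Holm:", holm_result)
print(f"look-adjusted critical z={z_crit_looks:.3f}, primary survives: {primary_survives_looks}")
print(f"CUPED variance reduction={variance_reduction:.0%}, SE factor={se_factor:.3f}")
```

With the snapshot numbers, this sketch gives a primary z of roughly 4.03 with a 95% CI for the absolute lift of about 0.43 to 1.25 percentage points, a cancel-rate z of roughly 4.86, and a wait-time CI of about −0.14 to −0.06 minutes. All three outcomes stay significant under Holm, and the primary result clears even the conservative per-look threshold. Note the direction of each effect when answering task 8: completion improves and wait time falls, but the cancel-rate guardrail moves in the harmful direction.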
