PracHub

Compute power and interpret guardrails

Last updated: Mar 29, 2026

Quick Overview

This question evaluates competency in experimental design and applied statistics for cluster-randomized A/B tests, covering cluster-robust inference, mean and proportion comparisons, power/MDE calculations with ICC and design effects, multiple-testing control, and sensitivity adjustments such as difference-in-differences and CUPED.

  • hard
  • DoorDash
  • Statistics & Math
  • Data Scientist


Company: DoorDash

Role: Data Scientist

Category: Statistics & Math

Difficulty: hard

Interview Round: Onsite



Oct 13, 2025, 9:49 PM

Context

You ran a 1-week A/B test of a new search ranking with clustered randomization at the DMA level: 100 DMAs total (50 control, 50 treatment). Outcomes are aggregated from per-order/per-session data. Unless stated, assume no Sample Ratio Mismatch (SRM).

Given summaries:

  • Orders: control = 1,000,000; treatment = 1,050,000
  • Mean delivery time (minutes): control = 32.4 (SD = 9.1); treatment = 31.9 (SD = 9.5)
  • Cancellation rate: control = 3.2%; treatment = 3.5%
  • Baseline conversion: 15% (per session), target MDE for conversion = +0.3 percentage points (pp)
  • Intra-cluster correlation (ICC) across stores within a DMA for conversion = 0.15
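To make the cancellation figures concrete, the sketch below computes the risk difference, the relative risk, and an unclustered two-proportion z-test from the summaries above, then repeats the calculation with the variance inflated by a design effect. It assumes orders are the unit of analysis and that DMAs are roughly equal in size. The function name `two_prop_summary` and the cancellation ICC `icc_cancel = 0.01` are hypothetical choices for illustration only; the source gives an ICC for conversion, not for cancellation.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_summary(p_c, n_c, p_t, n_t, deff=1.0, alpha=0.05):
    """Risk difference, relative risk, z-test and CI; deff inflates variances."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    rd = p_t - p_c                      # risk difference (treatment - control)
    rr = p_t / p_c                      # relative risk
    # Pooled SE for the test of H0: p_t == p_c
    p_pool = (p_c * n_c + p_t * n_t) / (n_c + n_t)
    se_test = sqrt(deff * p_pool * (1 - p_pool) * (1 / n_c + 1 / n_t))
    # Unpooled SE for the confidence interval on the risk difference
    se_ci = sqrt(deff * (p_c * (1 - p_c) / n_c + p_t * (1 - p_t) / n_t))
    z = rd / se_test
    p_value = 2 * (1 - norm.cdf(abs(z)))
    return {"rd": rd, "rr": rr, "z": z, "p_value": p_value,
            "ci": (rd - z_crit * se_ci, rd + z_crit * se_ci)}

n_c, n_t = 1_000_000, 1_050_000

# Naive version: treats every order as independent
naive = two_prop_summary(0.032, n_c, 0.035, n_t)

# Cluster-adjusted version: design effect 1 + (m - 1) * ICC,
# with m = average orders per DMA and a HYPOTHETICAL cancellation ICC
icc_cancel = 0.01
m_avg = (n_c + n_t) / 100
adjusted = two_prop_summary(0.032, n_c, 0.035, n_t,
                            deff=1 + (m_avg - 1) * icc_cancel)

for label, res in (("naive", naive), ("cluster-adjusted", adjusted)):
    lo, hi = res["ci"]
    print(f"{label}: RD={res['rd']:.4f}, RR={res['rr']:.4f}, "
          f"z={res['z']:.2f}, 95% CI=({lo:.4f}, {hi:.4f})")
```

Under the naive calculation the whole interval sits above 0.2 pp, while under the hypothetical design effect it spans both zero and values well above 0.2 pp. Whether that counts as a guardrail violation depends on which bound the pre-specified rule refers to, which the question leaves for the candidate to state.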

Assumptions used to fill in missing context:

  • Cluster-robust inference is at the DMA level. The DMA-level variance of cluster means is not provided, so we approximate it from the within-arm SDs and the average per-DMA sample size. This approximation ignores between-DMA heterogeneity, so it understates the standard error (and overstates significance) whenever DMA means genuinely differ.
  • For cancellation and delivery time, we treat orders as the unit of analysis; for power/MDE, sessions are the relevant unit for conversion.
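The approximation in the first assumption can be sketched as follows. `cluster_mean_diff` is one reasonable estimator when actual DMA-level means are available (they are not provided here): the difference of unweighted DMA means, with a standard error built from the between-DMA sample variance in each arm. `approx_from_summaries` is the stand-in that sets Var(DMA mean) ≈ SD² / (orders per DMA), which collapses to the unclustered standard error. Both function names are hypothetical, and a normal critical value is used instead of a t distribution with roughly 98 degrees of freedom.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def _summarize(diff, se, alpha=0.05):
    """Wald statistic, two-sided p-value and CI under a normal approximation."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    stat = diff / se
    p_value = 2 * (1 - norm.cdf(abs(stat)))
    return {"diff": diff, "se": se, "stat": stat, "p_value": p_value,
            "ci": (diff - z_crit * se, diff + z_crit * se)}

def cluster_mean_diff(treat_dma_means, ctrl_dma_means, alpha=0.05):
    """Difference of unweighted DMA means with a between-DMA standard error."""
    diff = mean(treat_dma_means) - mean(ctrl_dma_means)
    se = sqrt(variance(treat_dma_means) / len(treat_dma_means)
              + variance(ctrl_dma_means) / len(ctrl_dma_means))
    return _summarize(diff, se, alpha)

def approx_from_summaries(mean_t, sd_t, n_t, mean_c, sd_c, n_c,
                          k_t=50, k_c=50, alpha=0.05):
    """Stand-in when only arm-level summaries exist (likely optimistic)."""
    # Assumption: Var(DMA mean) ~ sd^2 / (orders per DMA), i.e. no
    # between-DMA spread beyond order-level sampling noise
    var_dma_t = sd_t ** 2 / (n_t / k_t)
    var_dma_c = sd_c ** 2 / (n_c / k_c)
    se = sqrt(var_dma_t / k_t + var_dma_c / k_c)
    return _summarize(mean_t - mean_c, se, alpha)

res = approx_from_summaries(31.9, 9.5, 1_050_000, 32.4, 9.1, 1_000_000)
print(f"diff={res['diff']:.3f} min, SE={res['se']:.4f}, "
      f"stat={res['stat']:.1f}, 95% CI=({res['ci'][0]:.3f}, {res['ci'][1]:.3f})")
```

With the given summaries this yields a difference of about -0.5 minutes and a standard error near 0.013, which is likely far too small if DMA means differ from one another. With real per-DMA means, `cluster_mean_diff` would usually give a noticeably wider interval.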

Tasks

  1. Difference in mean delivery time: compute the treatment–control difference and a 95% CI using a cluster-robust approach at the DMA level. State the estimator and SE formula you use, and report the test statistic and p-value.
  2. SRM check: run a chi-squared test on assignment counts using per-DMA exposure. What threshold flags SRM at α = 0.05? If flagged, how would you diagnose?
  3. Guardrail interpretation: despite faster delivery, cancellations rose by 0.3 pp. Conduct a two-proportion z-test and a cluster-adjusted variant. Quantify practical significance (risk difference and relative risk) and assess the guardrail “no increase > 0.2 pp (95% CI).”
  4. Power/MDE: With 50 DMAs per arm and ICC = 0.15 (for conversion), compute the design effect and the required per-DMA sample to detect a +0.3 pp lift at 80% power, α = 0.05. Show formulas and numeric results.
  5. Multiple metrics: You tracked 5 secondary metrics. Propose a Benjamini–Hochberg FDR = 10% correction and illustrate with hypothetical p-values. When would you instead prefer Holm–Bonferroni?
  6. Sensitivity: A mid-week outage hit 5 treatment DMAs. Explain a pre-registered difference-in-differences using last week as pre-period and weather/outage covariates, avoiding post-treatment bias. Provide the regression with DMA and day fixed effects.
  7. CUPED: Define a high-R² covariate (e.g., prior-week DMA mean delivery time) and write the CUPED-adjusted estimator for the treatment effect.
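For Task 4, a minimal sketch of the design-effect and power arithmetic is given below. It assumes equal cluster sizes, a two-sided normal-approximation test of two proportions, and that the ICC of 0.15 applies to sessions within a DMA (the source describes it as "across stores within a DMA", so this reading is an assumption). The function names are hypothetical. The design effect is 1 + (m - 1)·ICC for m sessions per DMA, and the required m solves k·m / (1 + (m - 1)·ICC) = n_eff for k DMAs per arm.

```python
from math import inf
from statistics import NormalDist

def required_effective_n(p0, delta, alpha=0.05, power=0.80):
    """Per-arm sample size for independent units (two-proportion z-test)."""
    norm = NormalDist()
    z_a = norm.inv_cdf(1 - alpha / 2)
    z_b = norm.inv_cdf(power)
    p1 = p0 + delta
    var_sum = p0 * (1 - p0) + p1 * (1 - p1)
    return (z_a + z_b) ** 2 * var_sum / delta ** 2

def design_effect(m, icc):
    """Variance inflation for clusters of size m with intra-cluster correlation icc."""
    return 1 + (m - 1) * icc

def required_cluster_size(n_eff, k, icc):
    """Solve k*m / (1 + (m-1)*icc) = n_eff for m; returns inf if unreachable."""
    denom = k - n_eff * icc
    if denom <= 0:
        return inf          # effective n can never exceed k / icc
    return n_eff * (1 - icc) / denom

k, icc = 50, 0.15
n_eff = required_effective_n(p0=0.15, delta=0.003)
m_needed = required_cluster_size(n_eff, k, icc)
ceiling = k / icc           # limit of k*m / design_effect(m, icc) as m grows

print(f"effective sessions needed per arm: {n_eff:,.0f}")
print(f"max effective sessions per arm with {k} DMAs: {ceiling:,.1f}")
print(f"required sessions per DMA: {m_needed}")
```

Under these assumptions the effective sample per arm is capped at k / ICC, roughly 333 sessions, while about 224,000 effective sessions per arm are needed. No per-DMA sample size reaches 80% power in that case, so a fuller answer would discuss more clusters, variance reduction that lowers the effective ICC, or a larger MDE.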

