Identify Key Metrics to Address Delivery Delays
Company: DoorDash
Role: Data Scientist
Category: Analytics & Experimentation
Difficulty: Medium
Interview Round: Technical Screen
##### Scenario
DoorDash, a food-delivery marketplace, is seeing growing customer complaints about orders arriving late. You are the data scientist asked to diagnose the problem and recommend a fix.
##### Question
Diagnose the root causes of delivery delays and design a validation experiment.
1. **Define the problem.** What does "late" mean here, and what is your primary KPI (e.g., on-time delivery rate) versus your secondary / guardrail metrics? Why?
2. **Metrics to examine first.** Which delivery-performance metrics would you look at first, and how do you instrument the order lifecycle so you can localize where the delay happens?
3. **Identify root causes.** How would you attribute excess delay to specific stages (assignment, courier travel, restaurant prep, pickup dwell, drop-off travel) and form hypotheses (courier supply vs. demand, prep-time underestimation, dispatch/batching logic, geography, time-of-day)?
4. **Segment the problem.** Which segments (region/zone, restaurant cohort, courier supply, order attributes, time/weather) would you cut by to localize the issue and prioritize?
5. **Design an experiment / product change.** Propose one solution and design an A/B or geo-holdout test to validate it: unit of randomization, primary outcome, guardrails, duration, power/sample size, and analysis plan. Address interference between nearby zones.
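For the power/sample-size part of item 5, one rough approach is to size a two-proportion comparison as if orders were independent, then inflate it by the design effect 1 + (m - 1) * ICC to account for randomizing whole zones. The sketch below is a minimal illustration under stated assumptions: the primary outcome is on-time rate, zones have equal order volume, and the function names, the 80% vs. 82% rates, and the ICC value are all hypothetical placeholders rather than DoorDash figures.

```python
import math
from statistics import NormalDist


def orders_per_arm(p_control, p_treatment, alpha=0.05, power=0.80):
    """Orders per arm for a two-sided two-proportion z-test, ignoring clustering."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    delta = p_treatment - p_control
    return math.ceil((z_alpha + z_beta) ** 2 * variance / delta ** 2)


def design_effect(orders_per_zone, icc):
    """Variance inflation from cluster randomization with equal-sized clusters."""
    return 1 + (orders_per_zone - 1) * icc


def zones_per_arm(p_control, p_treatment, orders_per_zone, icc,
                  alpha=0.05, power=0.80):
    """Zones per arm once the per-order sample is inflated by the design effect."""
    n = orders_per_arm(p_control, p_treatment, alpha, power)
    inflated = n * design_effect(orders_per_zone, icc)
    return math.ceil(inflated / orders_per_zone)


if __name__ == "__main__":
    # Hypothetical inputs: detect a lift from 80% to 82% on-time,
    # 2,000 orders per zone over the test window, intra-zone correlation 0.01.
    print(orders_per_arm(0.80, 0.82))
    print(zones_per_arm(0.80, 0.82, orders_per_zone=2000, icc=0.01))
```

Real zones differ widely in order volume, so an equal-size formula like this tends to understate the number of zones needed; it is a starting point for the duration discussion, not a final answer.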
##### Hints
Clarify the lateness definition (initial promise vs. latest ETA) to avoid ETA-padding gaming. Decompose end-to-end time into stages and compare each stage to an expected baseline. Segment by region/restaurant/courier/time. Because courier supply is shared across nearby areas, prefer cluster (geo-zone) randomization over per-order randomization, and design for spillovers.
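The stage decomposition in the hint can be sketched in a few lines. This is a minimal illustration, not a real schema: the event field names (`created_at`, `assigned_at`, `arrived_store_at`, `picked_up_at`, `delivered_at`, `initial_promised_at`), the sample order, and the baseline minutes are all assumed for the example. Lateness here is measured against the initial promise, which is one of the two definitions the hint asks you to choose between. Restaurant prep overlaps with courier travel, so in this sketch prep overruns surface as pickup dwell.

```python
from datetime import datetime

# Each stage is the gap between two consecutive lifecycle events (hypothetical names).
STAGES = [
    ("assignment", "created_at", "assigned_at"),
    ("travel_to_store", "assigned_at", "arrived_store_at"),
    ("pickup_dwell", "arrived_store_at", "picked_up_at"),
    ("dropoff_travel", "picked_up_at", "delivered_at"),
]


def stage_minutes(order):
    """Minutes spent in each stage of a single order."""
    return {
        name: (order[end] - order[start]).total_seconds() / 60
        for name, start, end in STAGES
    }


def excess_by_stage(order, baseline_minutes):
    """Actual minus expected minutes per stage; positive means slower than baseline."""
    actual = stage_minutes(order)
    return {name: actual[name] - baseline_minutes[name] for name in actual}


def is_late(order):
    """Late relative to the initial promise, not a later revised ETA."""
    return order["delivered_at"] > order["initial_promised_at"]


# Made-up order and baseline, for illustration only.
order = {
    "created_at": datetime(2024, 1, 5, 18, 0),
    "assigned_at": datetime(2024, 1, 5, 18, 3),
    "arrived_store_at": datetime(2024, 1, 5, 18, 12),
    "picked_up_at": datetime(2024, 1, 5, 18, 24),
    "delivered_at": datetime(2024, 1, 5, 18, 40),
    "initial_promised_at": datetime(2024, 1, 5, 18, 35),
}
baseline = {"assignment": 2, "travel_to_store": 10, "pickup_dwell": 4, "dropoff_travel": 15}

excess = excess_by_stage(order, baseline)
worst_stage = max(excess, key=excess.get)  # "pickup_dwell" for this sample order
```

In practice the baseline would come from a per-segment expectation (for example, median stage time by zone and hour), and the per-order excess would be aggregated across late orders to see which stage accounts for most of the delay.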
Quick Answer: This is a DoorDash data scientist technical screen in analytics and experimentation, focused on diagnosing delivery delays. It asks you to define on-time delivery, choose primary and guardrail metrics, decompose the order lifecycle into stages to localize root causes, segment by region/restaurant/courier/time, and design a cluster-randomized geo experiment to validate a fix.
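As a rough illustration of cluster (geo-zone) randomization, the sketch below assigns the experiment arm at the zone level with a salted hash, so every order in a zone inherits the same arm. The salt string, the `zone_id` field, and the function names are hypothetical. The sketch does not handle spillovers by itself; a real design would also need something like buffer zones or coarser clusters to limit interference between neighbors.

```python
import hashlib


def assign_zone(zone_id, experiment_salt="late-delivery-test-v1"):
    """Deterministically map a zone to an arm; the salt keeps experiments independent."""
    digest = hashlib.sha256(f"{experiment_salt}:{zone_id}".encode()).hexdigest()
    return "treatment" if int(digest, 16) % 2 == 0 else "control"


def assign_order(order):
    """An order never gets its own coin flip; it takes the arm of its zone."""
    return assign_zone(order["zone_id"])


if __name__ == "__main__":
    # Two orders in the same (made-up) zone always land in the same arm.
    first = assign_order({"order_id": 1, "zone_id": "sf-mission"})
    second = assign_order({"order_id": 2, "zone_id": "sf-mission"})
    print(first == second)
```

Hash-based assignment is convenient because it is reproducible without a lookup table. With a small number of zones, a stratified or matched-pair assignment may balance the arms better than a plain hash.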