Explain why DoorDash and why you are changing jobs
Company: DoorDash
Role: Data Scientist
Category: Behavioral & Leadership
Difficulty: hard
Interview Round: Onsite
Why DoorDash, and why now? Give a specific, recent example (within the last 12 months) where you proactively drove a cross-functional decision with product/ops/engineering under time pressure. Quantify your impact (e.g., conversion, cancellations, delivery time, cost) and explain the exact metric trade-offs you accepted. Why are you leaving your current role—what concrete limitations are you looking to overcome here? Describe a time you navigated conflict between improving customer wait time and maintaining dasher earnings; what was your decision rubric, how did you handle stakeholder pushback in the moment, and what would you change if you had to redo it?
Quick Answer: This question evaluates a data scientist's cross-functional leadership, marketplace and operations judgment, and ability to quantify trade-offs across customer experience, worker earnings, cost, and reliability within a behavioral & leadership interview for a Data Scientist role.
Solution
# How to approach and what good looks like
Use STAR (Situation, Task, Action, Result) to structure each story, and state a clear decision rubric for every trade-off. Define the primary metrics up front and set guardrails before describing results.
Key marketplace metrics to reference
- Customer: conversion rate, cancellations, P50/P90 delivery time, ETA accuracy, NPS/support contacts.
- Dasher: earnings per active hour (EPH = total pay ÷ active hours), acceptance rate, idle/store-wait minutes, retention.
- Business: cost per order (CPO), throughput (orders/hour), reliability (outage rate), GMV.
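The EPH definition and the tail percentiles used throughout this answer can be sketched in Python. This is a minimal illustration under stated assumptions: the function names and the `(total_pay_usd, active_hours)` input shape are hypothetical, and EPH is computed per dasher so that a distribution (P25, median) exists to take percentiles over.

```python
from statistics import quantiles


def earnings_per_active_hour(total_pay_usd: float, active_hours: float) -> float:
    """EPH = total pay / active hours, for a single dasher."""
    if active_hours <= 0:
        raise ValueError("active_hours must be positive")
    return total_pay_usd / active_hours


def eph_summary(dashers: list[tuple[float, float]]) -> dict[str, float]:
    """Summarize per-dasher EPH at the 25th percentile and the median.

    `dashers` holds (total_pay_usd, active_hours) pairs; this input
    shape is an assumption made for the sketch.
    """
    eph = [earnings_per_active_hour(pay, hours) for pay, hours in dashers]
    # n=4 returns the three quartile cut points: P25, P50, P75.
    p25, median, _p75 = quantiles(eph, n=4, method="inclusive")
    return {"p25": p25, "median": median}
```

Tracking P25 alongside the median is what lets a guardrail catch a policy that helps the typical dasher while hurting the lowest earners.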
Guardrails and trade-offs
- Clearly state which metrics are optimized vs. guarded. Example guardrails: EPH P25 not down >1%; P90 delivery time not worse by >0.5 min; cancellations not up; CPO not up >$0.05.
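- If needed, use a simple objective function: Utility = w1*(−P90 Wait) + w2*(−Cancellation) + w3*(EPH) + w4*(−CPO), where the weights w1 to w4 are non-negative and each term is normalized to a comparable scale (the raw units are minutes, percentages, and dollars). Keep guardrails on sensitive metrics as hard constraints rather than folding them into the weights.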
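The guardrail list and the weighted utility above can be sketched together in Python. This is an illustration, not a production scoring function: the `MetricDeltas` fields and the `scales` used to make the terms comparable are assumptions, the default weights are borrowed from the worked example later in this write-up, and the sketch treats P90 wait and P90 delivery time as the same measurement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricDeltas:
    """Treatment-minus-control changes (field names are illustrative)."""
    p90_wait_min: float    # change in P90 customer wait, minutes
    cancel_pct: float      # relative change in cancellations, percent
    eph_p25_pct: float     # relative change in 25th-percentile EPH, percent
    eph_median_pct: float  # relative change in median EPH, percent
    cpo_usd: float         # change in cost per order, dollars


def guardrail_breaches(d: MetricDeltas) -> list[str]:
    """Return the example guardrails that these deltas violate."""
    checks = [
        (d.eph_p25_pct < -1.0, "EPH P25 down more than 1%"),
        (d.p90_wait_min > 0.5, "P90 time worse by more than 0.5 min"),
        (d.cancel_pct > 0.0, "cancellations up"),
        (d.cpo_usd > 0.05, "CPO up more than $0.05"),
    ]
    return [message for breached, message in checks if breached]


def utility(
    d: MetricDeltas,
    weights: tuple[float, float, float, float] = (0.4, 0.3, 0.2, 0.1),
    scales: tuple[float, float, float, float] = (1.0, 1.0, 1.0, 0.05),
) -> float:
    """Weighted sum of scaled deltas; higher is better.

    `scales` divides each raw delta so the terms are comparable.
    The default values are assumptions chosen only for this sketch.
    """
    w_wait, w_cancel, w_eph, w_cpo = weights
    s_wait, s_cancel, s_eph, s_cpo = scales
    return (
        w_wait * (-d.p90_wait_min / s_wait)
        + w_cancel * (-d.cancel_pct / s_cancel)
        + w_eph * (d.eph_median_pct / s_eph)
        + w_cpo * (-d.cpo_usd / s_cpo)
    )
```

Keeping `guardrail_breaches` separate from `utility` mirrors the rubric: a policy is first screened against hard limits, and only then ranked by the weighted score.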
---
Example answers you can adapt
1) Why DoorDash, and why now?
- Why: I want to work on a high-frequency, two-sided marketplace where small modeling decisions materially change real-world outcomes. The space combines causal inference, online experimentation, and marketplace optimization (dispatching, batching, pricing, ETAs). Data Science here is directly tied to product levers and speed-to-impact.
- Why now: I’ve grown from model building to end-to-end ownership. I’m looking for higher scale, faster iteration cycles, and a culture where DS partners closely with product/ops/engineering to ship, measure, and iterate weekly rather than quarterly.
2) Cross-functional decision under time pressure (within last 12 months)
Situation (S): In Q1 this year, we saw rising late deliveries and cancellations in two large regions tied to a handful of high-volume merchants. Diagnostics showed our prep-time predictions were underestimating variance; dashers were arriving after food was ready, driving cold-food complaints and cancellations. We had 9 days before a major promotional weekend expected to spike volume.
Task (T): Reduce cancellations and P90 customer wait time ahead of the event without materially harming dasher earnings or overall cost per order.
Action (A):
- Data Science: Built a merchant-level dynamic pickup window v1 using historical prep-time distributions (median and P90) and order features (daypart, items, load). We translated this into a simple policy for v1: pull-forward dispatch for only those merchants where predicted incremental dasher store-wait ≤ 5 minutes (to protect EPH), and cap the pull-forward at 7 minutes.
- Engineering: Implemented gating logic and a feature flag with a kill switch; instrumented pickup arrival vs. ready-time deltas; created a 10% holdout to serve as the control group and to check that live results matched the backtest.
- Product: Set success metrics and guardrails: primary = reduce P90 customer wait and cancellations; guardrails = EPH P25 change ≥ −1%, CPO change ≤ +$0.05.
- Operations: Briefed top 50 merchants on expected pickup timing changes; ensured staff scheduling aligned with the new pickup window; set up a war-room to monitor live metrics hourly over the weekend.
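The v1 gating policy described in the Data Science and Engineering bullets can be sketched in Python. This is a minimal sketch under assumptions: the function and constant names are hypothetical, and both inputs (a candidate pull-forward and a predicted incremental store-wait) are assumed to come from an upstream prep-time model that is not shown here.

```python
# Thresholds taken from the v1 policy described above.
MAX_INCREMENTAL_STORE_WAIT_MIN = 5.0  # protects dasher EPH
MAX_PULL_FORWARD_MIN = 7.0            # cap on how early dispatch may move


def pull_forward_minutes(
    candidate_pull_forward_min: float,
    predicted_incremental_store_wait_min: float,
    flag_enabled: bool = True,
) -> float:
    """Return how many minutes earlier to dispatch; 0.0 means no change.

    Both inputs are hypothetical outputs of an upstream model.
    """
    if not flag_enabled:
        # Kill switch: fall back to the existing dispatch timing.
        return 0.0
    if predicted_incremental_store_wait_min > MAX_INCREMENTAL_STORE_WAIT_MIN:
        # Merchant is not eligible: the dasher would wait too long at the store.
        return 0.0
    # Eligible: clamp the candidate to the range [0, cap].
    return max(0.0, min(candidate_pull_forward_min, MAX_PULL_FORWARD_MIN))
```

A heuristic this small is easy to review, flag, and revert, which is the main reason to prefer it over a full model when the deadline is days away.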
Result (R) after a 1-week ramp in 2 regions (A/B vs. holdout):
- P90 customer wait: −3.1 minutes
- Cancellations: −9.6% (relative)
- On-time delivery rate: +4.8 pp
- Dasher EPH (median): −0.4%; EPH P25: −0.7% (within −1% guardrail)
- Acceptance rate: −0.2 pp (slight drop due to modestly higher store-wait variability)
- Cost per order: +$0.06 (slightly above the +$0.05 guardrail; we accepted this temporarily for the high-visibility weekend, with a follow-up to offset it through better order stacking in off-peak hours)
Exact trade-offs accepted and why:
- Accepted a small EPH decrease (−0.4% median; P25 −0.7%) and +$0.06 CPO to secure large reductions in P90 wait (−3.1 min) and cancellations (−9.6%), which improved customer retention and reduced support contacts (−6.3%). Economic modeling showed net positive order-level margin when accounting for fewer refunds/reships and higher repeat purchase probability.
How I drove it cross-functionally under time pressure:
- I pre-aligned stakeholders on guardrails to avoid last-minute disagreements, and built a simple v1 heuristic instead of a complex model so we could ship in 9 days. I also set up an hourly live-monitoring dashboard and a revert criterion: if the EPH P25 change fell below −1.5% for 2 consecutive hours, the policy would auto-disable.
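The revert criterion above amounts to a consecutive-breach check over hourly readings. The Python sketch below is illustrative only: the function name and the input shape (a chronological list of hourly EPH P25 changes, in percent) are assumptions, not a description of any real monitoring system.

```python
def should_auto_disable(
    hourly_eph_p25_delta_pct: list[float],
    threshold_pct: float = -1.5,
    consecutive_hours: int = 2,
) -> bool:
    """Return True once the EPH P25 change has stayed below the
    threshold for the required number of consecutive hourly readings.

    Readings are assumed to be in chronological order.
    """
    streak = 0
    for delta in hourly_eph_p25_delta_pct:
        # A reading at or above the threshold resets the streak.
        streak = streak + 1 if delta < threshold_pct else 0
        if streak >= consecutive_hours:
            return True
    return False
```

Requiring two consecutive breaches rather than one trades a slower reaction for fewer false alarms from a single noisy hour.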
3) Why I’m leaving my current role — concrete limitations I want to overcome
- Limitations: Long deployment cycles (quarterly launches), limited online experimentation tooling, and fragmented ownership where DS is a service function rather than accountable for outcomes. This constrained my ability to iterate fast, run robust causal tests, and tie DS work to product levers.
- What I seek here: End-to-end ownership from problem framing to post-launch iteration; deeper work on marketplace levers (dispatch timing, batching, pricing, supply incentives); mature experimentation and data platforms; and a culture that embraces fast, data-driven iteration.
4) Navigating customer wait time vs. dasher earnings
Situation (S): In late spring, early-dispatch policies reduced customer wait time but increased dasher store-wait, depressing EPH and spiking negative feedback from dashers. Product wanted to move even earlier on dispatch to further cut customer P90 wait; Ops flagged rising dasher dissatisfaction.
Task (T): Establish a decision rubric that balances customer wait time with dasher EPH, decide whether to expand early dispatch, and manage stakeholder pushback.
Action (A):
- Decision rubric: Optimized for −P90 customer wait and −cancellations; guardrails on dasher EPH P25 (≥ −1%) and acceptance rate (≥ −0.5 pp). Objective function (for communication): Utility = 0.4*(−P90 Wait) + 0.3*(−Cancellation Rate) + 0.2*(EPH) + 0.1*(−CPO), subject to guardrails.
- Policy: Apply early dispatch only where predicted incremental store-wait ≤ 4 minutes, and dynamically relax to 6 minutes during peak if supply is abundant. For merchants with high prep-time variance, switch to just-in-time dispatch to protect EPH.
- Experimentation: Region-level A/B with stratification by merchant and daypart; pre-registered analysis plan to avoid metric fishing. Monitored P25/P50/P90 for EPH and wait time to catch tail risks.
- Stakeholder pushback: Product pushed for more aggressive early-dispatch thresholds. I presented counterfactuals: in backtests, moving the store-wait threshold from 4 to 8 minutes worsened the EPH P25 change from −0.6% to −2.1% while adding little further P90 wait improvement (diminishing returns). Ops worried about merchant congestion, so we added a cap on concurrent early-dispatch pickups per merchant.
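The policy and the congestion cap from the bullets above can be combined into one decision function. The Python sketch below is a hedged illustration: the names are hypothetical, the boolean inputs (`is_peak`, `supply_abundant`, `high_prep_variance`) are assumed to be decided upstream, and the per-merchant cap is left as a required argument because the write-up does not state its value.

```python
from enum import Enum


class DispatchMode(Enum):
    EARLY = "early"
    JUST_IN_TIME = "just_in_time"


BASE_STORE_WAIT_LIMIT_MIN = 4.0  # default threshold from the policy
PEAK_STORE_WAIT_LIMIT_MIN = 6.0  # relaxed threshold: peak with abundant supply


def choose_dispatch_mode(
    predicted_incremental_store_wait_min: float,
    *,
    is_peak: bool,
    supply_abundant: bool,
    high_prep_variance: bool,
    active_early_pickups: int,
    max_concurrent_early_pickups: int,
) -> DispatchMode:
    """Pick a dispatch mode for one order at one merchant."""
    if high_prep_variance:
        # Unpredictable kitchens get just-in-time dispatch to protect EPH.
        return DispatchMode.JUST_IN_TIME
    if active_early_pickups >= max_concurrent_early_pickups:
        # Congestion cap requested by Ops.
        return DispatchMode.JUST_IN_TIME
    relaxed = is_peak and supply_abundant
    limit = PEAK_STORE_WAIT_LIMIT_MIN if relaxed else BASE_STORE_WAIT_LIMIT_MIN
    if predicted_incremental_store_wait_min <= limit:
        return DispatchMode.EARLY
    return DispatchMode.JUST_IN_TIME
```

Ordering the checks from most protective to least makes the rubric easy to explain to stakeholders: each early return corresponds to one objection that was raised and addressed.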
Result (R):
- Customer P90 wait: −1.7 minutes
- Cancellations: −3.8%
- Dasher EPH (median): +3.2%; EPH P25: −0.3% (within guardrail)
- Acceptance rate: +0.9 pp
- CPO: −$0.09 (less paid idle time; fewer refunds)
What I’d change if I did it again:
- Build the simulation earlier to tune thresholds by zone-level supply elasticity; we discovered post hoc that suburban zones tolerated larger early-dispatch windows than dense urban cores.
- Add a dasher-facing comms update and reward structure (e.g., temporary wait-time bonuses funded by refund savings) to proactively manage perception during the first week of rollout.
---
Teaching notes and pitfalls
- Always define the metric you are optimizing and the guardrails you will not break. Tail metrics (P90/P95) often matter more than averages in delivery markets.
- Validate with counterfactual analysis and pre-registered A/B plans to avoid p-hacking. For fragile launches, add kill switches and revert criteria.
- Align incentives: if a policy risks lowering EPH, budget a temporary incentive or narrow the policy by merchant/daypart to protect fairness while you iterate.
- Communicate in trade-off language: what you gained, what you knowingly gave up, and why the net was positive for customers, dashers, and the business.