
This question evaluates machine learning competencies for ETA prediction, including feature engineering, leakage detection, time-based validation, point and quantile modeling, evaluation and calibration metrics, cost-sensitive decisioning, explainability and fairness auditing, and production deployment considerations.


What difficulty level is this interview question?

This is a hard Machine Learning question, commonly asked during onsite rounds at DoorDash.

What role is this question designed for?

This question is commonly asked for Data Scientist candidates at DoorDash during technical interviews.

Build ETA prediction and simulate impact | DoorDash Interview Question

Predicting Delivery ETA (Minutes)

Context

You are given a take-home dataset with order-, store-, and dasher-level features. The goal is to predict delivery ETA defined as minutes from order created_at to delivered_at. Assume you must generate predictions using only information available at the prediction timestamp t0 (e.g., at order creation or at dispatch assignment).

Deliverables

A) Problem framing

Define the target precisely (unit, timestamp of prediction, censoring/exclusions).
Propose at least 10 features spanning demand, supply, and network (e.g., historical prep time by merchant-hour, driver density within 3 km in the last 10 minutes, rain indicator, queue depth at store, distance via road graph, time-of-day, promo active, cuisine, orders-in-batch).
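
One way to pin down the target is to write it as a function of the two timestamps. The sketch below is illustrative only: it assumes created_at and delivered_at arrive as ISO-8601 strings, treats a missing delivered_at as a censored or cancelled order, and uses a hypothetical 180-minute cap (MAX_MINUTES) to exclude implausible durations. None of these rules come from the dataset itself.

```python
from datetime import datetime
from typing import Optional

MAX_MINUTES = 180.0  # assumed outlier cap, not a value given in the prompt


def eta_minutes(created_at: str, delivered_at: Optional[str]) -> Optional[float]:
    """Return minutes from created_at to delivered_at, or None if excluded."""
    if not delivered_at:
        # Cancelled or still in flight: the duration is censored, so skip it.
        return None
    start = datetime.fromisoformat(created_at)
    end = datetime.fromisoformat(delivered_at)
    minutes = (end - start).total_seconds() / 60.0
    if minutes <= 0 or minutes > MAX_MINUTES:
        # Non-positive or extreme durations are treated as data errors here.
        return None
    return minutes
```

Whatever exclusion rule is chosen should be applied identically to train, validation, and test data so the reported error describes the same population.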

B) Leakage and splitting

Identify likely leakage sources (e.g., features derived after pickup or after t0) and how to prevent them.
Propose a time-based cross-validation scheme (e.g., rolling-origin) with an example split: train=[Aug 1–24, 2025], valid=[Aug 25–31], test=[Sep 1–7].
Justify any domain adaptation if training on other cities.
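
A rolling-origin scheme can be sketched as an expanding training window followed by a fixed-length validation window. In the example below, the function name rolling_origin_folds and the initial training end date of Aug 10 are assumptions chosen so that the last fold reproduces the example split. Sep 1–7 stays untouched as the final test set.

```python
from datetime import date, timedelta
from typing import Iterator, Tuple

DateRange = Tuple[date, date]  # inclusive (start, end)


def rolling_origin_folds(
    start: date,
    first_train_end: date,
    last_valid_end: date,
    valid_days: int = 7,
    step_days: int = 7,
) -> Iterator[Tuple[DateRange, DateRange]]:
    """Yield (train_range, valid_range) pairs with an expanding train window."""
    train_end = first_train_end
    while True:
        valid_start = train_end + timedelta(days=1)
        valid_end = valid_start + timedelta(days=valid_days - 1)
        if valid_end > last_valid_end:
            break  # never let validation run into the held-out test week
        yield (start, train_end), (valid_start, valid_end)
        train_end += timedelta(days=step_days)


folds = list(
    rolling_origin_folds(date(2025, 8, 1), date(2025, 8, 10), date(2025, 8, 31))
)
for train, valid in folds:
    print("train", train[0], "to", train[1], "| valid", valid[0], "to", valid[1])
```

Every validation window starts strictly after its training window ends, so no fold trains on the future. Rolling aggregates used as features would need the same cutoff discipline.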

C) Modeling

Compare gradient-boosted trees (e.g., XGBoost/LightGBM) for point prediction vs gradient-boosted quantile models for P50/P90.
Justify loss choices (MAE, Huber, pinball). List key feature interactions and regularization to tune.
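
The pinball (quantile) loss is what turns a gradient-boosted model into a P50 or P90 predictor. A minimal pure-Python version is sketched below for intuition. The function name is illustrative, and in practice a library's built-in quantile objective (LightGBM offers one, for example) would be used rather than a hand-rolled loss.

```python
from typing import Sequence


def pinball_loss(y_true: Sequence[float], y_pred: Sequence[float], q: float) -> float:
    """Mean pinball loss at quantile q (0 < q < 1)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-prediction (diff >= 0) is weighted by q,
        # over-prediction by (1 - q).
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)
```

At q = 0.5 the loss equals half the MAE, which is why MAE training targets the conditional median. At q = 0.9 an under-prediction costs nine times as much as an over-prediction of the same size, which pushes the fitted value toward the upper tail.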

D) Evaluation

Report MAE, median absolute error, P90 absolute error, coverage of 80% prediction intervals, and calibration plots.
Describe how to compute calibration error and reliability curves.
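
The metrics above can be computed with a few small helpers. This is a sketch under stated assumptions: the P90 absolute error uses a nearest-rank percentile, and calibration error is defined here as the mean absolute gap between each nominal quantile and the observed share of actuals at or below that quantile's prediction. Other definitions are equally defensible, and all function names are illustrative.

```python
import math
import statistics
from typing import Dict, Sequence, Tuple


def nearest_rank_percentile(values: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile, pct in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def point_metrics(y_true: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """MAE, median absolute error, and P90 absolute error."""
    abs_err = [abs(y - p) for y, p in zip(y_true, y_pred)]
    return {
        "mae": sum(abs_err) / len(abs_err),
        "median_ae": statistics.median(abs_err),
        "p90_ae": nearest_rank_percentile(abs_err, 90),
    }


def interval_coverage(
    y_true: Sequence[float], lower: Sequence[float], upper: Sequence[float]
) -> float:
    """Share of actuals inside [lower, upper]; about 0.80 for a P10-P90 band."""
    hits = sum(lo <= y <= hi for y, lo, hi in zip(y_true, lower, upper))
    return hits / len(y_true)


def quantile_calibration(
    y_true: Sequence[float], quantile_preds: Dict[float, Sequence[float]]
) -> Tuple[float, Dict[float, float]]:
    """Return (mean |observed - nominal|, reliability curve as {nominal: observed})."""
    curve = {
        q: sum(y <= p for y, p in zip(y_true, preds)) / len(y_true)
        for q, preds in quantile_preds.items()
    }
    error = sum(abs(obs - q) for q, obs in curve.items()) / len(curve)
    return error, curve
```

Plotting the curve's nominal quantiles against the observed shares gives the reliability curve. A well-calibrated model lies close to the diagonal.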

E) Decisioning

Explain how ETA error impacts dispatch decisions (late-delivery penalties vs courier idle cost).
Propose a cost-sensitive objective or post-hoc thresholding that minimizes expected cost under asymmetric penalties.
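
Under a simple linear cost model, the asymmetric trade-off has a closed form. If each minute of under-prediction (a late delivery) costs late_cost and each minute of over-prediction (courier idle time) costs idle_cost, expected cost is minimized by quoting the quantile late_cost / (late_cost + idle_cost). The sketch below assumes those linear per-minute costs, which the prompt does not specify, and also shows post-hoc thresholding as a constant offset chosen on validation data. All names are illustrative.

```python
from typing import Iterable, Sequence


def expected_cost(
    y_true: Sequence[float], y_pred: Sequence[float], late_cost: float, idle_cost: float
) -> float:
    """Average cost with linear per-minute penalties (an assumed cost model)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        total += late_cost * max(y - p, 0) + idle_cost * max(p - y, 0)
    return total / len(y_true)


def critical_quantile(late_cost: float, idle_cost: float) -> float:
    """Quantile of the ETA distribution that minimizes expected linear cost."""
    return late_cost / (late_cost + idle_cost)


def best_offset(
    y_true: Sequence[float],
    y_pred: Sequence[float],
    late_cost: float,
    idle_cost: float,
    candidates: Iterable[float],
) -> float:
    """Post-hoc thresholding: choose the additive shift with the lowest cost."""
    return min(
        candidates,
        key=lambda d: expected_cost(
            y_true, [p + d for p in y_pred], late_cost, idle_cost
        ),
    )
```

With late_cost = 3 and idle_cost = 1 the target quantile is 0.75, so training a quantile model with pinball loss at q = 0.75 and searching for an offset on validation data are two routes to the same objective.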

F) Explainability and fairness

Use SHAP or permutation importance to audit features.
Outline checks for bias across neighborhoods or vehicle types and how to mitigate (e.g., monotonic constraints, group calibration).
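
A basic bias check compares the mean signed error (actual minus predicted) across groups such as neighborhoods or vehicle types. A simple form of group calibration then adds a per-group offset. The sketch below is illustrative: the function names are hypothetical, and the offsets would need to be fit on validation data and re-checked on the test set to avoid overfitting small groups.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def signed_error_by_group(
    y_true: Sequence[float], y_pred: Sequence[float], groups: Sequence[Hashable]
) -> Dict[Hashable, float]:
    """Mean of (actual - predicted) per group; positive means ETAs run short."""
    sums: Dict[Hashable, float] = defaultdict(float)
    counts: Dict[Hashable, int] = defaultdict(int)
    for y, p, g in zip(y_true, y_pred, groups):
        sums[g] += y - p
        counts[g] += 1
    return {g: sums[g] / counts[g] for g in sums}


def apply_group_offsets(
    y_pred: Sequence[float], groups: Sequence[Hashable], offsets: Dict[Hashable, float]
) -> List[float]:
    """Shift each prediction by its group's offset (0 for unseen groups)."""
    return [p + offsets.get(g, 0.0) for p, g in zip(y_pred, groups)]
```

For example, if bike deliveries show a mean signed error of +2 minutes, quoted ETAs for that group are systematically optimistic. Adding the offset removes the average bias but not differences in error spread, which is why per-group interval coverage should be audited as well.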

G) Production

Outline a feature store, streaming inference latency budget, model retraining cadence, drift detection (PSI/KS on key features), and an online A/B plan to validate offline gains while monitoring guardrails.
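
For drift detection, PSI compares the binned distribution of a feature in a baseline window against a recent window. The sketch below is one possible implementation, assuming equal-width bins over the baseline range and a small floor (eps) to avoid taking the log of zero. Quantile bins are also common, and the function name and defaults are illustrative.

```python
import math
from typing import List, Sequence


def psi(
    expected: Sequence[float],
    actual: Sequence[float],
    n_bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """Population Stability Index of `actual` against the `expected` baseline."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins
    if width == 0:
        width = 1.0  # degenerate baseline: everything falls into the first bin

    def bin_shares(values: Sequence[float]) -> List[float]:
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clamp out-of-range values
            counts[idx] += 1
        # Floor each share at eps so empty bins do not produce log(0).
        return [max(c / len(values), eps) for c in counts]

    e_shares = bin_shares(expected)
    a_shares = bin_shares(actual)
    return sum((a - e) * math.log(a / e) for e, a in zip(e_shares, a_shares))
```

Identical samples give a PSI of exactly 0, and the value grows as mass moves between bins. Alert thresholds are a judgment call and should be set per feature from historical week-over-week variation.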
