You’re given an anonymized DoorDash dataset at order-creation time and asked to predict late delivery risk (late = actual_dropoff_time > quoted_dropoff_time).
1) Define the target precisely and propose a time-based train/validation/test split that avoids leakage from future information (include exact cut dates).
2) Enumerate at least 10 high-signal, production-safe features available at order creation (e.g., store historical on-time rate by hour-of-week, dasher supply-demand index in zone, restaurant prep-time quantiles, distance and traffic, weather, surge/boost, customer lateness tolerance proxy).
3) Identify at least 5 leakage hazards and how you’ll eliminate them (e.g., features derived from post-pickup events, future average wait, features computed with non-causal windows).
4) Choose evaluation metrics for ranking and calibration (e.g., AUROC, AUPRC, Brier, calibration slope), justify thresholds for operational actions, and quantify business impact using a cost matrix.
5) Describe an online ramp: shadow mode -> treatment gating -> A/B test with guardrails; how you’ll monitor drift and recalibrate (e.g., Platt/Isotonic, periodic time-split retraining, population shift alerts).
6) Explain how you’ll handle cold-start restaurants/cities and seasonality (hierarchical pooling, entity embeddings, time-of-week effects).

This question evaluates a Data Scientist's competencies in production-ready predictive modeling: target definition, feature engineering, temporal train/validation splitting to avoid leakage, model evaluation and calibration, monitoring and ramp strategies, and translating probabilistic outputs into decisions with business impact.

What difficulty level is this interview question?

This is a hard Machine Learning question, commonly asked during onsite rounds at DoorDash.

What role is this question designed for?

This question is commonly asked for Data Scientist candidates at DoorDash during technical interviews.

Build a late-delivery risk model | DoorDash Interview Question

Predict Late Delivery Risk at Order Creation

Context

You are given an anonymized dataset of marketplace orders with timestamps, store/customer/market attributes, estimated quoted drop-off times (ETA shown at order creation), and realized actual drop-off times. The task is to build a production-ready model that predicts the probability an order will be delivered late when the order is created.

Late is defined by: actual_dropoff_time > quoted_dropoff_time.

Tasks

Target and Time Split

Precisely define the binary target label, anchoring it to the quote that was available at order creation (e.g., decide whether to use the initial quote or updated quotes).
Propose a time-based train/validation/test split that avoids leakage from future information, and include exact cut dates.
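
One way to make the label and split concrete is sketched below. It assumes each order is a dict with `created_at`, `quoted_dropoff_time` (taken here to be the initial quote), and `actual_dropoff_time`; these field names and the helper names are illustrative. The cut dates are hypothetical placeholders, since the dataset's date range is not given, and real ones would be chosen from the data.

```python
from datetime import datetime

# Hypothetical cut dates; pick real ones from the dataset's actual date range.
TRAIN_END = datetime(2024, 3, 1)  # train: created_at < TRAIN_END
VALID_END = datetime(2024, 4, 1)  # validation: TRAIN_END <= created_at < VALID_END


def is_late(order):
    """Label: 1 if drop-off happened after the quote shown at order creation."""
    return int(order["actual_dropoff_time"] > order["quoted_dropoff_time"])


def time_split(orders, train_end=TRAIN_END, valid_end=VALID_END):
    """Split by order-creation time so no fold contains orders from its own future."""
    train, valid, test = [], [], []
    for order in orders:
        created = order["created_at"]
        if created < train_end:
            train.append(order)
        elif created < valid_end:
            valid.append(order)
        else:
            test.append(order)
    return train, valid, test
```

Splitting on `created_at` rather than at random keeps every validation and test order strictly later than the training data, which mirrors how the model would be used in production.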

Features

Enumerate at least 10 high-signal, production-safe features that are available at order creation (examples: store historical on-time rate by hour-of-week, dasher supply-demand index, prep-time quantiles, distance/traffic, weather, surge/boost, customer tolerance proxy).
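
Two of the simpler features can be computed from fields that exist the moment an order is placed. The sketch below assumes store and customer coordinates in degrees and a `created_at` datetime; the function names are illustrative, and great-circle distance is only a stand-in for a routed road distance.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def hour_of_week(created_at):
    """Index from 0 (Monday 00:00) to 167 (Sunday 23:00) for time-of-week effects."""
    return created_at.weekday() * 24 + created_at.hour
```

The hour-of-week index can serve both as a direct feature and as the grouping key for aggregates such as a store's historical on-time rate by hour-of-week.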

Leakage Hazards

Identify at least 5 ways leakage could occur and how to eliminate each (e.g., post-pickup events, future averages, non-causal windows).
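
A common hazard is a historical aggregate that quietly includes orders whose outcome was not yet known when the current order was created. A minimal point-in-time sketch follows, assuming history rows are dicts with `store_id`, `created_at`, `quoted_dropoff_time`, and `actual_dropoff_time`; the function name and the 28-day window are illustrative choices.

```python
from datetime import datetime, timedelta


def store_late_rate_as_of(history, store_id, as_of, window=timedelta(days=28)):
    """Late rate for a store using only orders whose outcome was known before `as_of`.

    An order counts only if it was dropped off before `as_of` (so its label
    was observable) and was created inside the trailing window.
    Returns None when there is no eligible history.
    """
    eligible = [
        o for o in history
        if o["store_id"] == store_id
        and o["actual_dropoff_time"] < as_of
        and o["created_at"] >= as_of - window
    ]
    if not eligible:
        return None
    late = sum(o["actual_dropoff_time"] > o["quoted_dropoff_time"] for o in eligible)
    return late / len(eligible)
```

Filtering on `actual_dropoff_time < as_of`, rather than on creation time alone, excludes in-flight orders whose labels would only be available in the future.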

Evaluation and Business Impact

Choose evaluation metrics for both ranking and calibration (e.g., AUROC, AUPRC, Brier score, calibration slope/intercept).
Propose decision thresholds for operational actions and quantify expected business impact using a cost matrix.
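
The link between calibration and thresholds can be sketched as follows. The cost values are hypothetical inputs: `c_fp` is the cost of intervening on an order that would have been on time, `c_fn` the cost of a missed late order, and `c_tp`/`c_tn` default to zero. The threshold formula assumes calibrated probabilities, `c_fp > c_tn`, and `c_fn > c_tp`.

```python
def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def cost_optimal_threshold(c_fp, c_fn, c_tp=0.0, c_tn=0.0):
    """Probability above which intervening has the lower expected cost.

    Intervene when p*c_tp + (1-p)*c_fp < p*c_fn + (1-p)*c_tn, which
    rearranges to p > (c_fp - c_tn) / ((c_fp - c_tn) + (c_fn - c_tp)).
    """
    num = c_fp - c_tn
    return num / (num + (c_fn - c_tp))


def total_cost(y_true, y_prob, threshold, c_fp, c_fn, c_tp=0.0, c_tn=0.0):
    """Realized cost of acting on every order whose score exceeds `threshold`."""
    cost = 0.0
    for y, p in zip(y_true, y_prob):
        if p > threshold:
            cost += c_tp if y == 1 else c_fp
        else:
            cost += c_fn if y == 1 else c_tn
    return cost
```

For example, if a missed late order is assumed to cost four times as much as an unnecessary intervention, `cost_optimal_threshold(1.0, 4.0)` gives 0.2. Comparing `total_cost` on the test window with and without the model's policy is one way to quantify business impact.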

Online Ramp and Monitoring

Describe an online ramp from shadow mode to controlled rollout, including guardrails.
Explain how you will monitor drift and recalibrate (e.g., Platt/Isotonic, periodic time-split retraining, population shift alerts).
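
A population shift alert can be built on the population stability index (PSI), computed per feature or on the model score. The sketch below assumes a reference sample from training time, a recent production sample, and shared bin edges; the function name, the `eps` floor, and any alert threshold are choices to tune rather than fixed rules.

```python
import math


def psi(expected, actual, edges, eps=1e-6):
    """Population stability index between two samples over shared bin edges.

    `edges` are ascending interior cut points, giving len(edges) + 1 bins.
    `eps` floors empty-bin fractions to avoid log(0) and division by zero.
    """
    def bin_fractions(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            idx = sum(v >= e for e in edges)  # number of cut points at or below v
            counts[idx] += 1
        return [max(c / len(values), eps) for c in counts]

    e_frac = bin_fractions(expected)
    a_frac = bin_fractions(actual)
    return sum((a - e) * math.log(a / e) for e, a in zip(e_frac, a_frac))
```

Every term in the sum is non-negative, so PSI is zero only when the binned distributions match and grows as they diverge.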

Cold Start and Seasonality

Explain how you will handle new restaurants/cities and seasonality (e.g., hierarchical pooling, entity embeddings, time-of-week effects).
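
Hierarchical pooling can be approximated with a simple shrinkage estimate, in which a store's late rate is pulled toward its parent group's rate (for example, the market or cuisine average) in proportion to how little data the store has. The function name and the `prior_strength` default below are illustrative assumptions, not values from the question.

```python
def pooled_late_rate(late_count, order_count, parent_rate, prior_strength=50.0):
    """Shrink a store's observed late rate toward its parent group's rate.

    Equivalent to adding `prior_strength` pseudo-orders at the parent rate,
    so a store with no history falls back to the parent rate.
    """
    return (late_count + prior_strength * parent_rate) / (order_count + prior_strength)
```

A brand-new store with zero orders gets the parent rate, while a store with thousands of orders is dominated by its own history. The same pattern can be applied per hour-of-week bucket to capture time-of-week effects for sparse stores.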
