PracHub
QuestionsPremiumLearningGuidesCheatsheetNEWCoaches

Quick Overview

This question evaluates proficiency with analytical SQL and data engineering concepts, including CTE design, window functions (LAG/QUALIFY), timezone-aware timestamp handling, aggregation and rate calculation, per-order interval computation, and distributional comparison (median).

  • Medium
  • DoorDash
  • Data Manipulation (SQL/Python)
  • Data Scientist

Write SQL for cold-complaint diagnostics with LAG/QUALIFY

Company: DoorDash

Role: Data Scientist

Category: Data Manipulation (SQL/Python)

Difficulty: Medium

Interview Round: Technical Screen

Using BigQuery/Snowflake-style SQL (CTEs required; use LAG and QUALIFY), answer the tasks below. Assume 'today' is 2025-09-01. Schema and small samples: Tables: - deliveries(delivery_id STRING, order_id STRING, courier_id STRING, restaurant_id STRING, city STRING, pickup_ts TIMESTAMP, dropoff_ts TIMESTAMP, distance_km FLOAT, status STRING) - complaints(complaint_id STRING, order_id STRING, complaint_ts TIMESTAMP, type STRING) - order_events(order_id STRING, event_ts TIMESTAMP, status STRING) -- status in ('ready','picked_up','delivered') - couriers(courier_id STRING, city STRING, has_insulated_bag BOOLEAN, activation_date DATE) Samples (ASCII): - deliveries delivery_id | order_id | courier_id | restaurant_id | city | pickup_ts | dropoff_ts | distance_km | status D1 | 1001 | C1 | R1 | SF | 2025-09-01 12:05:00 | 2025-09-01 12:25:00 | 3.2 | completed D2 | 1002 | C2 | R2 | SF | 2025-09-01 12:10:00 | 2025-09-01 12:55:00 | 7.8 | completed D3 | 1003 | C1 | R3 | NYC | 2025-08-31 19:40:00 | 2025-08-31 20:20:00 | 5.1 | completed D4 | 1004 | C3 | R4 | NYC | 2025-08-31 19:50:00 | 2025-08-31 20:10:00 | 2.0 | completed D5 | 1005 | C2 | R2 | SF | 2025-08-31 12:00:00 | 2025-08-31 12:40:00 | 6.3 | completed - complaints complaint_id | order_id | complaint_ts | type A1 | 1002 | 2025-09-01 13:10:00 | cold A2 | 1003 | 2025-08-31 21:00:00 | cold A3 | 1004 | 2025-08-31 20:45:00 | late - order_events order_id | event_ts | status 1001 | 2025-09-01 12:00:00 | ready 1001 | 2025-09-01 12:05:00 | picked_up 1001 | 2025-09-01 12:25:00 | delivered 1002 | 2025-09-01 12:35:00 | ready 1002 | 2025-09-01 12:40:00 | picked_up 1002 | 2025-09-01 12:55:00 | delivered 1003 | 2025-08-31 19:20:00 | ready 1003 | 2025-08-31 19:40:00 | picked_up 1003 | 2025-08-31 20:20:00 | delivered 1004 | 2025-08-31 19:30:00 | ready 1004 | 2025-08-31 19:50:00 | picked_up 1004 | 2025-08-31 20:10:00 | delivered - couriers courier_id | city | has_insulated_bag | activation_date C1 | SF | TRUE | 2025-06-01 C2 | SF | FALSE | 2025-07-15 C3 | NYC | TRUE | 2025-03-10 Assumptions: - Consider only deliveries.status = 'completed'. Count at most one 'cold' complaint per order, within 24h of dropoff. Tasks (write one SQL script with CTEs): A) Compute city-day cold complaint rate for the last 30 days ending 2025-09-01 (complaints per completed delivery). Return city, date, deliveries, cold_complaints, complaint_rate. Ensure time zones are handled by casting dropoff_ts to city-local date; state any assumption you make for timezone. B) Using order_events and LAG over (PARTITION BY order_id ORDER BY event_ts), compute per order: ready_to_pickup_min and pickup_to_dropoff_min. Then, for 2025-08-01 to 2025-09-01, compare median pickup_to_dropoff_min between orders with a cold complaint vs. without, per city. Keep only city rows where the difference > 12 minutes using QUALIFY on a window over cities. C) For each courier, consider their last 100 completed deliveries up to 2025-09-01 23:59:59 in their city. Flag couriers with cold complaint rate > 2× the city median in the same period and at least 100 deliveries. Use QUALIFY to (i) pick the last 100 by dropoff_ts per courier and (ii) filter the outliers. Return courier_id, city, deliveries, cold_complaints, rate, city_median_rate. D) Compute, per city, the P90 of pickup_to_dropoff_min. Then, for each courier, compute the share of their deliveries exceeding their city’s P90, and QUALIFY the top 5 couriers per city by that share (ties broken by higher deliveries). Explain any tie-breaking window you use.

Quick Answer: This question evaluates proficiency with analytical SQL and data engineering concepts, including CTE design, window functions (LAG/QUALIFY), timezone-aware timestamp handling, aggregation and rate calculation, per-order interval computation, and distributional comparison (median).

Last updated: Mar 29, 2026

Loading coding console...

PracHub

Master your tech interviews with 7,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Calculate Order Request Metrics - DoorDash (hard)
  • Analyze Restaurant Customer Metrics - DoorDash (medium)
  • Write SQL for monthly spend and ratios - DoorDash (medium)
  • Write SQL for late-delivery metrics by window - DoorDash (Medium)
  • Compute rolling cold-delivery rates with windows - DoorDash (Medium)