Calculate Late Delivery Percentage and Top Customers
Company: DoorDash
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: Medium
Interview Round: Technical Screen
Orders
+-----------+-------------+------------------------+------------------------+
| order_id | customer_id | expected_delivery_date | actual_delivery_date |
+-----------+-------------+------------------------+------------------------+
| 101 | 7 | 2023-05-01 | 2023-05-03 |
| 102 | 12 | 2023-05-02 | 2023-05-02 |
| 103 | 8 | 2023-05-03 | 2023-05-05 |
| 104 | 15 | 2023-05-04 | 2023-05-04 |
| 105 | 7 | 2023-05-05 | 2023-05-08 |
+-----------+-------------+------------------------+------------------------+
##### Scenario
An e-commerce platform wants to monitor delivery performance over the past month.
##### Question
Given a table Orders with expected_delivery_date and actual_delivery_date, write a SQL query that returns the percentage of orders delivered late in the last 30 days.
Follow-up: List the top three customers with the highest number of late deliveries in that period.
##### Hints
Filter by DATE_DIFF or > CURRENT_DATE-30, count late vs total, then GROUP BY customer_id and ORDER BY late_count DESC.
Quick Answer: This question evaluates data manipulation and aggregation skills for a Data Scientist role, focusing on date handling, computing the percentage of late orders, and ranking customers using SQL or Python.