Calculate Order Request Metrics
Company: DoorDash
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: hard
Interview Round: Onsite
You are working with DoorDash order and delivery-request data. Write SQL to answer the questions below.
Tables:
1. `orders`
- `order_id` BIGINT, primary key
- `customer_id` BIGINT
- `merchant_id` BIGINT
- `dasher_id` BIGINT, nullable until assigned
- `created_at_utc` TIMESTAMP, when the order was placed
- `delivered_at_utc` TIMESTAMP, when the order was delivered; null if not delivered
- `order_status` VARCHAR, values include `completed`, `cancelled`, and `refunded`
2. `delivery_requests`
- `request_id` BIGINT, primary key
- `order_id` BIGINT, foreign key to `orders.order_id`
- `dasher_id` BIGINT, Dasher who received the offer
- `requested_at_utc` TIMESTAMP, when the offer/request was sent
- `offer_amount_cents` INTEGER, guaranteed offer amount shown to the Dasher
- `request_status` VARCHAR, values include `accepted`, `declined`, `expired`, and `cancelled`
Relationship and assumptions:
- One order can have zero or more delivery requests.
- Requests for the same order should be ordered by `requested_at_utc`, breaking ties by `request_id`.
- Analyze completed orders whose `created_at_utc` falls in a parameterized window `[start_ts_utc, end_ts_utc)`.
- All timestamps are stored in UTC. If a business-day filter is needed, convert to the `America/Los_Angeles` timezone before filtering.
- Parameters: `late_threshold_minutes` and `request_threshold`.
Questions:
1. Calculate the percentage of completed orders that were late, where a late order is defined as an order whose delivery time is more than `late_threshold_minutes` after `created_at_utc`.
- Required output columns: `total_completed_orders`, `late_orders`, `pct_late_orders`.
2. Calculate the percentage of completed orders that had at least `request_threshold` delivery requests. Also calculate the average number of delivery requests per completed order.
- Include completed orders with zero requests in the denominator.
- Required output columns: `total_completed_orders`, `orders_with_at_least_threshold_requests`, `pct_orders_with_at_least_threshold_requests`, `avg_requests_per_order`.
3. Calculate the percentage of completed orders that had at least one offer increase. An offer increase occurs when a request's `offer_amount_cents` is greater than the immediately previous request's `offer_amount_cents` for the same order.
- Also calculate the average percentage offer increase across positive increase events, defined as `(current_offer_amount_cents - previous_offer_amount_cents) / previous_offer_amount_cents`.
- Exclude transitions where the previous offer amount is null or zero from the percentage-increase calculation.
- Required output columns: `total_completed_orders`, `orders_with_offer_increase`, `pct_orders_with_offer_increase`, `avg_pct_offer_increase`.
Return one query or a set of queries that produces these outputs.
Quick Answer: This question evaluates SQL and data-manipulation competencies for computing order-level metrics, including aggregations, joins, windowed event sequencing, timestamp/timezone handling, and NULL-aware comparisons.