Analyze Restaurant Customer Metrics
Company: DoorDash
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: medium
Interview Round: Technical Screen
You are given two tables for a food delivery platform.
`orders`(
- `order_id` BIGINT
- `customer_id` BIGINT
- `restaurant_id` BIGINT
- `order_ts` TIMESTAMP
- `order_amount` DECIMAL(10,2)
- `status` VARCHAR
)
`restaurants`(
- `restaurant_id` BIGINT
- `restaurant_name` VARCHAR
- `city` VARCHAR
)
Assume `orders.restaurant_id` references `restaurants.restaurant_id`. Use only rows where `status = 'completed'`. Treat each month as a calendar month in UTC.
Write SQL for the following tasks:
1. **Monthly percentage of high-frequency customers**
A high-frequency customer is a customer who places more than 30 completed orders in the same month. Return:
- `month`
- `total_customers`
- `high_frequency_customers`
- `pct_high_frequency`
2. **Top customer per month, excluding high-frequency users**
For each month, consider only customers with at most 30 completed orders in that month. Return the customer with the highest monthly order count. Break ties by the smaller `customer_id`. Return:
- `month`
- `customer_id`
- `order_count`
3. **Month-over-month sales growth for 2021**
Compute monthly gross sales as the sum of `order_amount` from completed orders in 2021. Return:
- `month`
- `monthly_sales`
- `previous_month_sales`
- `mom_growth_pct`
Define month-over-month growth as:
`(monthly_sales - previous_month_sales) / previous_month_sales`
If the previous month's sales are missing or equal to 0, return `NULL` for growth.
4. **Customer reach of bottom-quartile restaurants**
For each month, rank restaurants by that month's completed sales and split them into quartiles within the month, where the lowest-sales quartile is the bottom quartile. Compute how many distinct customers ordered from at least one bottom-quartile restaurant, and what share they represent among all monthly customers. Return:
- `month`
- `customers_reached`
- `total_customers`
- `pct_customer_reach`
Quick Answer: This question evaluates SQL and data-manipulation skills for a data scientist role, focusing on aggregations, joins, window functions, time-series grouping, ranking, and percentile/quartile segmentation to compute customer and sales metrics.