You are analyzing a food-delivery marketplace.
Assume the following schema (you may add minor helper CTEs as needed):
ordersorder_id
(BIGINT, PK)
customer_id
(BIGINT)
restaurant_id
(BIGINT)
order_ts
(TIMESTAMP) — time the order was placed
order_amount
(NUMERIC) — total GMV for the order
customerscustomer_id
(BIGINT, PK)
restaurantsrestaurant_id
(BIGINT, PK)
order_ts
(assume UTC unless otherwise stated).
For each month, compute the percentage of distinct customers who are high-frequency customers.
Output:
month
(DATE or TIMESTAMP truncated to month)
pct_high_frequency_customers
For each month, find the customer(s) with the highest monthly order count among customers with monthly order count ≤ 30.
Output:
month
customer_id
monthly_order_count
Follow-up: Identify the single customer (or customers, if tied) with the most total orders across all months (no exclusion).
Given a specific restaurant_id, compute monthly total sales in 2021 and the month-over-month (MoM) change, excluding the first month in the series (i.e., only months where a prior month exists).
Output:
year_month
restaurant_id
monthly_sales
mom_sales_change
(current month sales − previous month sales)
Follow-up: How would you change the query to return the MoM change for all restaurants?
For each month, rank restaurants by that month’s total sales and label restaurants into quartiles (4 buckets) within the month.
Define “bottom quartile restaurants” as the lowest 25% by monthly sales for that month.
Compute, for each month, the percentage of distinct customers who placed at least one order from a bottom-quartile restaurant.
Output:
month
pct_customers_ordered_bottom_quartile
Notes:
NTILE(4)
over monthly restaurant sales).