Using the schema and sample data below, write a single ANSI SQL query (CTEs allowed; no temp tables) that returns, for each customer, their top 2 product categories by total spend. Output columns: customer_id, category, total_spend, earliest_order_date, rank_in_customer.
Requirements:
-
total_spend = SUM(oi.qty * oi.unit_price) per (customer_id, category).
-
earliest_order_date = MIN(o.order_date) within that (customer_id, category).
-
Use RANK() OVER (PARTITION BY customer_id ORDER BY total_spend DESC, earliest_order_date ASC) as rank_in_customer.
-
Return rows where rank_in_customer <= 2, including all ties at rank 2.
-
Include customers with no orders as a single row: category = NULL, total_spend = 0, earliest_order_date = NULL, rank_in_customer = NULL (do not assign rank for zero-spend customers). Hint: careful LEFT JOINs and conditional ranking.
-
Avoid vendor-specific extensions beyond standard window functions.
Follow-ups (closely related; keep as part of the same query via additional columns/CTEs):
A) Add percent_of_total = total_spend / SUM(total_spend) OVER (PARTITION BY customer_id), and ensure it is 0 for customers with no orders.
B) Return only customer_ids whose top category’s percent_of_total < 0.5 (i.e., their spend is not dominated by a single category).
Schema:
customers(id INT PK, name TEXT)
orders(id INT PK, customer_id INT FK -> customers.id, order_date DATE)
order_items(id INT PK, order_id INT FK -> orders.id, product_id INT FK -> products.id, qty INT, unit_price DECIMAL(10,2))
products(id INT PK, category TEXT, name TEXT)
Sample data:
customers
id | name
1 | Alice
2 | Bob
3 | Cara
4 | Dana
orders
id | customer_id | order_date
101 | 1 | 2025-07-02
102 | 1 | 2025-07-15
103 | 2 | 2025-07-20
104 | 2 | 2025-07-21
105 | 3 | 2025-07-22
order_items
id | order_id | product_id | qty | unit_price
1001 | 101 | 201 | 2 | 10.00
1002 | 101 | 202 | 1 | 20.00
1003 | 102 | 201 | 1 | 10.00
1004 | 102 | 203 | 5 | 5.00
1005 | 103 | 202 | 2 | 20.00
1006 | 104 | 203 | 10 | 5.00
1007 | 105 | 204 | 1 | 100.00
products
id | category | name
201 | CatA | A
202 | CatA | B
203 | CatB | C
204 | CatC | D