This question evaluates proficiency with SQL aggregation and JOINs, analytic/window functions (RANK()), tie and NULL/edge-case handling, and percent-of-total calculations for per-customer spend analysis.
Using the schema and sample data below, write a single ANSI SQL query (CTEs allowed; no temp tables) that returns, for each customer, their top 2 product categories by total spend. Output columns: customer_id, category, total_spend, earliest_order_date, rank_in_customer. Requirements:
Schema: customers(id INT PK, name TEXT) orders(id INT PK, customer_id INT FK -> customers.id, order_date DATE) order_items(id INT PK, order_id INT FK -> orders.id, product_id INT FK -> products.id, qty INT, unit_price DECIMAL(10,2)) products(id INT PK, category TEXT, name TEXT)
Sample data: customers id | name 1 | Alice 2 | Bob 3 | Cara 4 | Dana
orders id | customer_id | order_date 101 | 1 | 2025-07-02 102 | 1 | 2025-07-15 103 | 2 | 2025-07-20 104 | 2 | 2025-07-21 105 | 3 | 2025-07-22
order_items id | order_id | product_id | qty | unit_price 1001 | 101 | 201 | 2 | 10.00 1002 | 101 | 202 | 1 | 20.00 1003 | 102 | 201 | 1 | 10.00 1004 | 102 | 203 | 5 | 5.00 1005 | 103 | 202 | 2 | 20.00 1006 | 104 | 203 | 10 | 5.00 1007 | 105 | 204 | 1 | 100.00
products id | category | name 201 | CatA | A 202 | CatA | B 203 | CatB | C 204 | CatC | D