
You’re given two tables.
users(id INT, country STRING, created_at DATE) rows: 1 | US | 2025-08-28 2 | US | 2025-08-30 3 | IN | 2025-08-29 4 | BR | 2025-08-31
orders(order_id INT, user_id INT, amount DECIMAL, created_at DATE) rows: 101 | 1 | 20.00 | 2025-08-31 102 | 1 | 10.00 | 2025-09-01 103 | 2 | NULL | 2025-09-01 104 | 3 | 15.00 | 2025-09-01
events(user_id INT, event_name STRING, event_time TIMESTAMP) rows: 1 | video_play | 2025-09-01 10:01:00 1 | like | 2025-09-01 10:02:00 2 | video_play | 2025-09-01 11:00:00 3 | video_play | 2025-09-01 12:00:00 3 | video_play | 2025-09-01 12:05:00 4 | video_play | 2025-09-01 13:00:00
Answer all parts concisely and provide the exact SQL: (a) Explain and demonstrate, using a single grouped query by users.country, the differences among COUNT(*), COUNT(amount), and COUNT(DISTINCT user_id) on orders joined to users. Why do NULLs matter? Show the three counts side-by-side and ensure the join does not double-count users with multiple orders. (b) Compute, per country, the top 2 spenders by total order amount using ROW_NUMBER(). Then repeat using RANK() so that ties are included even if more than 2 users are returned. For country = 'US', list which user_ids appear under each method and why. (c) Show a pitfall where filtering in WHERE vs HAVING changes results: return users with at least 2 video_play events on 2025-09-01 who have zero non-NULL orders. First, incorrectly filter orders.amount IS NULL in WHERE before aggregation; then correct it using HAVING. Explain the difference in row counts. (d) Compute average spend per active user (active = at least one event of any type on 2025-09-01) by country, guarding against divide-by-zero and NULL amounts. Use COALESCE and NULLIF appropriately and justify your choices.