You are given user and transaction data. Part A (SQL): Use a join and window functions to answer the prompts below using the following schema and sample data.
Schema:
Users(user_id INT PRIMARY KEY, name TEXT, country TEXT)
Transactions(transaction_id INT PRIMARY KEY, from_user_id INT, to_user_id INT, amount INT, created_at TIMESTAMP)
Sample tables:
Users
+---------+-------+---------+
| user_id | name | country |
+---------+-------+---------+
| 1 | Alice | US |
| 2 | Bob | US |
| 3 | Chen | CN |
| 4 | Divya | IN |
| 5 | Emma | US |
+---------+-------+---------+
Transactions
+----------------+--------------+------------+--------+---------------------+
| transaction_id | from_user_id | to_user_id | amount | created_at |
+----------------+--------------+------------+--------+---------------------+
| 1001 | 1 | 2 | 50 | 2025-08-30 10:00:00 |
| 1002 | 2 | 1 | -20 | 2025-08-31 09:00:00 |
| 1003 | 3 | 1 | 10 | 2025-09-01 11:00:00 |
| 1004 | 1 | 3 | -5 | 2025-09-02 12:00:00 |
| 1005 | 4 | 5 | 15 | 2025-09-03 08:00:00 |
+----------------+--------------+------------+--------+---------------------+
Interpretation rule: If amount < 0, the actual sender is to_user_id and the actual receiver is from_user_id, and the normalized amount is ABS(amount).
Tasks:
-
Write a single SQL query that returns, for every row in Transactions, the columns: sender_user_id, receiver_user_id, normalized_amount, created_at, sender_country, receiver_country. Use a join to Users and a CASE expression (or equivalent) to normalize direction based on the sign of amount.
-
Using window functions, produce a result set with columns (country, user_id, sent_txn_count, country_rank_by_sent) where sent_txn_count is the number of transactions the user sent (after direction normalization) and country_rank_by_sent is the dense rank of the user within their country ordered by sent_txn_count desc, then user_id asc for tie-breaks. Return only the top 3 users per country.
Part B (Python): You receive an iterable of records formatted as (transaction_id: int, pair: str "from_user_id:to_user_id", amount: int). Apply the same direction rule as above (amount < 0 means the transaction direction is reversed). Implement an efficient function that returns two lists: top 5 sender user_ids and top 5 receiver user_ids ranked by their respective transaction counts (descending count, then ascending user_id as a tie-breaker). Aim for O(n) time and O(u) additional space where u is the number of unique users; do not materialize per-transaction expansions beyond what is necessary.