Analyze Transaction Flow and User Engagement Efficiently
Company: Robinhood
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: Medium
Interview Round: Technical Screen
transactions
+---------------+-------------+------------+--------+
| transaction_id| from_user_id| to_user_id | amount |
+---------------+-------------+------------+--------+
| 1 | 101 | 202 | 50 |
| 2 | 303 | 101 | -20 |
| 3 | 101 | 404 | 30 |
| 4 | 202 | 303 | 100 |
| 5 | 404 | 101 | -60 |
+---------------+-------------+------------+--------+
users
+---------+----------+
| user_id | name |
+---------+----------+
| 101 | Alice |
| 202 | Bob |
| 303 | Carol |
| 404 | Dave |
+---------+----------+
##### Scenario
A peer-to-peer payments start-up wants quick insights from its transaction log and user directory to understand money flow and user engagement.
##### Question
SQL – Using tables users(user_id, name) and transactions(transaction_id, from_user_id, to_user_id, amount), write a query that joins the two tables to return each user’s name, total_amount_sent and total_amount_received.
SQL – Add a window function to the previous query to show, for every transaction, the running cumulative amount_sent per from_user_id ordered by transaction_id.
Python – Given the same transactions dataframe, output two Series: the top 5 user_ids by number of sent transactions and the top 5 user_ids by number of received transactions (treat negative amounts as money flowing in the opposite direction).
##### Hints
For SQL, think LEFT JOINs/Aggregate and SUM with PARTITION BY. For Python, normalise sender/receiver per row, then value_counts.
Quick Answer: This question evaluates proficiency in SQL joins, aggregations and window functions alongside Python data-manipulation skills to compute per-user totals, running cumulative amounts, and leaderboards of senders and receivers.