Write SQL to flag Venmo ATO
Company: PayPal
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: Medium
Interview Round: Technical Screen
SQL case: You are a Decision Scientist on Venmo’s Fraud (ATO) team. Using the schema and sample data below, write a single Standard SQL query that returns candidate ATO events in the last 7 days (treat "today" as 2025-09-01 00:00:00 UTC; last 7 days window is [2025-08-25, 2025-09-01)). A candidate ATO event is defined as: (a) the user’s first successful login from a NEW device_id OR a NEW ip in the prior 30 days, and (b) within 2 hours after that login, the user sends a transfer > $500 to a FIRST-TIME recipient for that sender. Exclude users whose account age < 7 days at login time. If multiple transfers match for the same login, keep only the earliest transfer. Output columns: user_id, login_ts, device_id, ip, reason_code ("NEW_DEVICE", "NEW_IP", or "BOTH"), tx_id, tx_ts, amount_usd, first_time_recipient_flag. Also: 1) ensure no duplicate rows; 2) be explicit about time boundaries; 3) at the end, list the exact indexes you’d recommend to support this query on a PostgreSQL-like warehouse.
Schema:
- users(user_id INT PRIMARY KEY, created_at TIMESTAMP)
- logins(user_id INT, ts TIMESTAMP, ip STRING, device_id STRING, success BOOLEAN)
- transfers(tx_id BIGINT PRIMARY KEY, sender_id INT, recipient_id INT, ts TIMESTAMP, amount_usd DECIMAL(10,2), status STRING) -- status in ('completed','reversed')
Sample data (UTC):
users
+---------+---------------------+
| user_id | created_at |
+---------+---------------------+
| 1 | 2025-07-01 10:00:00 |
| 2 | 2025-08-28 09:00:00 |
| 3 | 2025-05-15 12:00:00 |
| 4 | 2025-08-01 08:30:00 |
+---------+---------------------+
logins
+---------+---------------------+---------------+-----------+---------+
| user_id | ts | ip | device_id | success |
+---------+---------------------+---------------+-----------+---------+
| 1 | 2025-08-20 09:00:00 | 1.1.1.1 | A | TRUE |
| 1 | 2025-08-25 10:10:00 | 2.2.2.2 | B | TRUE |
| 1 | 2025-08-25 10:40:00 | 2.2.2.2 | B | TRUE |
| 2 | 2025-08-29 21:00:00 | 3.3.3.3 | C | TRUE |
| 2 | 2025-08-30 01:00:00 | 4.4.4.4 | C | TRUE |
| 3 | 2025-08-31 15:00:00 | 5.5.5.5 | D | TRUE |
| 3 | 2025-08-31 16:30:00 | 5.5.5.5 | E | TRUE |
| 4 | 2025-09-01 00:10:00 | 6.6.6.6 | F | FALSE |
+---------+---------------------+---------------+-----------+---------+
transfers
+--------+-----------+--------------+---------------------+------------+-----------+
| tx_id | sender_id | recipient_id | ts | amount_usd | status |
+--------+-----------+--------------+---------------------+------------+-----------+
| 101 | 1 | 9 | 2025-08-25 11:15:00 | 650.00 | completed |
| 102 | 1 | 9 | 2025-08-26 09:00:00 | 50.00 | completed |
| 103 | 2 | 8 | 2025-08-30 01:45:00 | 800.00 | completed |
| 104 | 2 | 7 | 2025-08-27 10:00:00 | 20.00 | completed |
| 105 | 3 | 6 | 2025-08-31 16:45:00 | 700.00 | completed |
| 106 | 3 | 6 | 2025-09-01 01:00:00 | 10.00 | reversed |
| 107 | 3 | 5 | 2025-08-10 08:00:00 | 5.00 | completed |
| 108 | 4 | 5 | 2025-09-01 00:20:00 | 900.00 | completed |
+--------+-----------+--------------+---------------------+------------+-----------+
Definitions:
- NEW device/ip: not seen for that user in the 30 days BEFORE the login ts (lookback window is (login_ts - 30 days, login_ts)).
- FIRST-TIME recipient: no prior transfer from sender_id to recipient_id before tx_ts.
Deliverables:
A) The SQL producing the specified output; B) Brief justification for your index choices.
Quick Answer: This question evaluates a candidate's ability to construct complex SQL to detect account-takeover (ATO) signals in a payments platform, assessing skills in event-time windowing, deduplication, joins across user/login/transaction tables, and categorical flagging of new devices or IPs.