Compute ARPDAU/ARPPU by country
Company: Roblox
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: Medium
Interview Round: Technical Screen
Using only the two tables below, write a single SQL query that returns, for each day and country in the last 7 days inclusive (2025-08-26 through 2025-09-01), the following columns: dt, country, dau (distinct users with at least one session that day), payers (distinct users with ≥1 payment that joins to a session that day), revenue_usd (sum of payment amounts joined to sessions that day), arpdau (revenue_usd/dau), and arppu (revenue_usd/payers). Join payments to sessions on the multiple-key (user_id, session_id). Treat dt as DATE(session_start_ts). Assume all timestamps are UTC. Prevent double counting when (a) a session row is duplicated in app_sessions, and (b) a session has multiple payment rows in payments. Only include payments that successfully join to a session; ignore unmatched payments. Provide the query and, in one or two sentences, explain how you avoided double counting. Schema and small sample data: app_sessions(user_id INT, session_id STRING, session_start_ts TIMESTAMP, country STRING)
app_sessions (sample)
+---------+------------+---------------------+---------+
| user_id | session_id | session_start_ts | country |
+---------+------------+---------------------+---------+
| 1 | s1 | 2025-08-31 23:55:00 | US |
| 1 | s2 | 2025-09-01 00:10:00 | US |
| 2 | s3 | 2025-09-01 13:05:00 | CA |
| 2 | s3 | 2025-09-01 13:05:00 | CA | <-- duplicate session row
| 3 | s4 | 2025-08-27 09:00:00 | US |
+---------+------------+---------------------+---------+
payments(user_id INT, session_id STRING, amount_usd DECIMAL(10,2), purchased_at TIMESTAMP)
payments (sample)
+---------+------------+------------+---------------------+
| user_id | session_id | amount_usd | purchased_at |
+---------+------------+------------+---------------------+
| 1 | s2 | 4.99 | 2025-09-01 00:12:00 |
| 1 | s2 | 1.99 | 2025-09-01 00:13:00 | <-- multiple payments same session
| 2 | s3 | 2.99 | 2025-09-01 13:10:00 |
| 3 | s5 | 9.99 | 2025-08-27 09:05:00 | <-- no matching session
+---------+------------+------------+---------------------+
Quick Answer: This question evaluates data manipulation and analytics skills for computing user- and revenue-based metrics (ARPDAU, ARPPU), focusing on aggregation, multi-key joins, deduplication, time-window filtering, and correct revenue attribution.