This question evaluates proficiency in aggregation, cohort analysis, date/time arithmetic, and retention metric calculation within the Data Manipulation (SQL/Python) domain at an intermediate level.
You are interviewing for a Product/Risk Data Scientist role at a crypto exchange.
Assume all timestamps are in UTC.
trades(
user_id BIGINT,
trade_time TIMESTAMP,
volume NUMERIC,
asset VARCHAR
)
users(
user_id BIGINT,
signup_date DATE
)
Define the last 30 days as [CURRENT_DATE - INTERVAL '30 day', CURRENT_DATE).
Compute the percentage of active traders in the last 30 days who made at least 3 trades in that same 30-day window.
Return one row with:
active_traders_30d
power_traders_30d
pct_power_traders_30d
(as a decimal between 0 and 1)
Define 7-day retention as: a signed-up user makes at least one trade on the 7th day after signup, i.e. on the calendar date signup_date + 7 (UTC date).
For each signup_date cohort, compute:
cohort_size
retained_users_d7
d7_retention_rate
Notes:
signup_date + 7
.
signup_date
and ordered by
signup_date
ascending.