Write SQL to localize trading drop contributors
Company: Robinhood
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: Medium
Interview Round: Technical Screen
Use the schema and samples below. Unless stated, treat a user as active on a day if they attempted any order (any status). Dates are inclusive.
Schema (ANSI SQL-friendly):
- users(user_id, signup_date DATE, geo STRING, acquisition_channel STRING)
- accounts(account_id, user_id, kyc_status STRING, funding_method STRING, first_fund_date DATE)
- orders(order_id, account_id, ts TIMESTAMP, asset_class STRING, order_type STRING, side STRING, quantity DECIMAL, status STRING, reject_code STRING, platform STRING)
- releases(release_id, ts TIMESTAMP, feature_name STRING, platform STRING)
- market_index(date DATE, sp500_return DECIMAL, vix_close DECIMAL, is_holiday BOOLEAN)
Small ASCII samples:
users
user_id | signup_date | geo | acquisition_channel
1 | 2025-05-20 | US-CA | ads
2 | 2025-06-03 | US-NY | referral
3 | 2025-06-15 | US-TX | organic
4 | 2025-07-05 | US-CA | ads
5 | 2025-07-12 | US-FL | organic
accounts
account_id | user_id | kyc_status | funding_method | first_fund_date
10 | 1 | approved | ach | 2025-05-22
11 | 2 | approved | wire | 2025-06-04
12 | 3 | pending | ach | null
13 | 4 | approved | ach | 2025-07-07
14 | 5 | approved | card | 2025-07-13
orders
order_id | account_id | ts | asset_class | order_type | side | quantity | status | reject_code | platform
100 | 10 | 2025-06-10 15:20:00 | equity | market | buy | 10 | executed | null | ios
101 | 11 | 2025-06-12 10:05:00 | equity | limit | sell | 5 | executed | null | web
102 | 11 | 2025-07-11 09:33:00 | equity | market | buy | 8 | rejected | R201 | web
103 | 13 | 2025-07-12 09:35:00 | crypto | market | buy | 0.2 | executed | null | android
104 | 13 | 2025-07-24 13:02:00 | equity | limit | buy | 3 | canceled | null | android
105 | 14 | 2025-07-25 10:01:00 | equity | market | buy | 2 | rejected | R305 | ios
106 | 10 | 2025-07-26 11:20:00 | option | market | buy | 1 | executed | null | ios
107 | 11 | 2025-08-01 14:07:00 | equity | limit | buy | 4 | executed | null | web
108 | 11 | 2025-08-05 09:31:00 | equity | market | sell | 2 | rejected | R201 | web
109 | 10 | 2025-08-10 15:45:00 | equity | market | buy | 6 | executed | null | ios
110 | 14 | 2025-08-15 12:12:00 | equity | limit | buy | 1 | executed | null | ios
111 | 13 | 2025-08-21 10:00:00 | crypto | market | sell | 0.1 | executed | null | android
releases
release_id | ts | feature_name | platform
1 | 2025-07-10 08:00:00 | onboarding_v2 | ios
2 | 2025-07-10 08:00:00 | onboarding_v2 | android
3 | 2025-07-10 08:00:00 | onboarding_v2 | web
market_index
date | sp500_return | vix_close | is_holiday
2025-06-10 | 0.3 | 15.2 | 0
2025-06-12 | -0.4 | 18.1 | 0
2025-07-11 | -1.2 | 22.5 | 0
2025-07-24 | 0.1 | 17.0 | 0
2025-07-25 | 0.0 | 16.8 | 0
2025-08-01 | -0.2 | 19.3 | 0
2025-08-05 | -0.6 | 21.0 | 0
2025-08-10 | 0.5 | 15.0 | 0
2025-08-15 | 0.2 | 14.7 | 0
2025-08-21 | -0.1 | 14.9 | 0
Tasks (write ANSI SQL; use CTEs if helpful):
A) Daily executed_trades_per_active_user by platform and signup cohort (signup_month) for 2025-06-01–2025-08-21. Definitions: executed_trades = count of orders.status='executed'; active users = distinct users with any order that day (any status). Exclude market_index.is_holiday = true by left-joining on date.
B) Attribution: Using baseline window 2025-06-01–2025-06-28 and impact window 2025-07-24–2025-08-21, compute per-segment change by [platform × asset_class × order_type]. For each segment, report: baseline mean executed_trades/day, impact mean, absolute delta, and share of total drop. Return top 5 segments by absolute contribution.
C) Quality: Post 2025-07-10, list the top 10 reject_code values whose rejection rate increased the most, controlling for asset_class and platform (i.e., compare within each asset_class×platform cell). Return code, cell, baseline rate (2025-06-01–2025-07-09), post rate (2025-07-10–2025-08-21), and delta.
D) Optional: Produce a user-day dataset for ML uplift modeling with columns [date, user_id, platform, acquisition_channel, tenure_days, funded_flag, executed_trades, any_order_attempt, rejection_rate_day, vix_close, sp500_return, is_holiday] and target executed_trades>0.
Quick Answer: This question evaluates proficiency in SQL-based data manipulation and analytics, including cohorting by signup month, daily aggregation of executed trades per active user, segmentation by platform and geolocation, and competency with joins, timestamps, and operational flags.