Write SQL to monitor weekly chargeback spikes
Company: Stripe
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: Medium
Interview Round: Onsite
Write a single SQL query (PostgreSQL) to detect weekly chargeback spikes by country and industry. Week starts Monday (use date_trunc('week', ts) with ISO weeks). Schema and sample data:
Tables
- merchants(merchant_id INT, country_code TEXT, industry TEXT)
- payments(payment_id INT, merchant_id INT, created_at TIMESTAMP, amount_usd NUMERIC, status TEXT CHECK (status IN ('succeeded','failed')))
- chargebacks(chargeback_id INT, payment_id INT, created_at TIMESTAMP)
Sample rows
merchants
+-------------+--------------+----------+
| merchant_id | country_code | industry |
+-------------+--------------+----------+
| 1 | US | SaaS |
| 2 | US | Retail |
| 3 | DE | Retail |
| 4 | US | SaaS |
+-------------+--------------+----------+
payments
+------------+-------------+---------------------+------------+-----------+
| payment_id | merchant_id | created_at | amount_usd | status |
+------------+-------------+---------------------+------------+-----------+
| 101 | 1 | 2025-08-18 10:00:00 | 120.00 | succeeded |
| 102 | 1 | 2025-08-19 11:00:00 | 130.00 | succeeded |
| 103 | 2 | 2025-08-26 09:00:00 | 80.00 | failed |
| 104 | 2 | 2025-08-26 12:00:00 | 200.00 | succeeded |
| 105 | 3 | 2025-08-27 13:00:00 | 75.00 | succeeded |
| 106 | 4 | 2025-09-02 10:00:00 | 500.00 | succeeded |
| 107 | 1 | 2025-09-02 12:00:00 | 150.00 | succeeded |
+------------+-------------+---------------------+------------+-----------+
chargebacks
+----------------+------------+---------------------+
| chargeback_id | payment_id | created_at |
+----------------+------------+---------------------+
| 9001 | 101 | 2025-09-01 08:00:00 |
| 9002 | 104 | 2025-09-08 09:00:00 |
+----------------+------------+---------------------+
Task
Return, for each (week_start, country_code, industry):
- gmv_usd: sum(amount_usd) of succeeded payments created in that week
- active_merchants: distinct merchants with ≥1 succeeded payment that week
- succeeded_cnt: count of succeeded payments that week
- cb_cnt_lagwin: chargebacks created during that week whose underlying payment was created in the lag window [week_start-14 days, week_start-7 days]
- cb_rate_lagwin: cb_cnt_lagwin / succeeded_cnt for payments in that lag window
- wow_cb_rate_pct_increase: week-over-week percentage change of cb_rate_lagwin
- top_merchant_share_gmv: in-week max(merchant_gmv)/gmv_usd
Filter to rows where wow_cb_rate_pct_increase > 50% AND top_merchant_share_gmv < 0.40. Handle late-arriving chargebacks by joining chargebacks to payments via payment_id. Assume UTC throughout. Use CTEs; no temp tables. Also, based on the sample data, state which (week_start, country_code, industry) rows should be returned and why (briefly).
Quick Answer: This question evaluates proficiency in PostgreSQL time-series analytics, specifically aggregations, joins across payments and chargebacks, handling late-arriving events, and calculating per-week metrics such as GMV, active merchants, and chargeback rates.