Write SQL window functions for streaks
Company: Flatiron Health
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: Medium
Interview Round: Technical Screen
Assume today is 2025-09-01. Using standard SQL (e.g., PostgreSQL), answer the following using window functions only (no procedural loops, no correlated subqueries). Schema and small samples:
Schema
- users(user_id INT, signup_date DATE)
- events(user_id INT, event_date DATE, event_name TEXT) -- one row per user per day per event
- orders(order_id INT, user_id INT, order_date DATE, amount DECIMAL(8,2))
Sample rows
users
user_id | signup_date
1 | 2025-08-20
2 | 2025-08-22
3 | 2025-08-25
events
user_id | event_date | event_name
1 | 2025-08-29 | login
1 | 2025-08-30 | login
1 | 2025-09-01 | login
2 | 2025-08-26 | login
2 | 2025-08-28 | login
2 | 2025-08-29 | login
2 | 2025-09-01 | login
3 | 2025-08-25 | login
3 | 2025-08-26 | login
3 | 2025-08-27 | login
3 | 2025-09-01 | login
orders
order_id | user_id | order_date | amount
10 | 1 | 2025-08-30 | 20.00
11 | 2 | 2025-08-27 | 15.00
12 | 2 | 2025-09-01 | 30.00
13 | 3 | 2025-08-26 | 12.00
Tasks
A) Return, for each user, the first 3-day consecutive activity streak whose streak-end falls within 2025-08-26 to 2025-09-01 inclusive. Activity is defined as having at least one events row (any event_name) on a date. Output: user_id, streak_start_date, streak_end_date. Use only window functions and date arithmetic (no self-joins on dates tables).
B) For each calendar day d in 2025-08-26..2025-09-01, compute: (i) DAU = distinct users with any event on d, (ii) revenue_d = sum of orders.amount with order_date = d, and (iii) rolling_7d_DAU_avg = 7-day trailing average of DAU ending on d. Also compute revenue_per_active_user = revenue_d / DAU with 4-decimal precision, treating division-by-zero as NULL. Return day, DAU, revenue_d, rolling_7d_DAU_avg, revenue_per_active_user.
C) Flag users who placed any order on date t and then had a strict 10-day inactivity gap (no events) immediately after t, followed by any event on a later date u. Return user_id, last_order_date = t, gap_days, first_post_gap_event_date = u. Only the earliest such gap per user.
Quick Answer: This question evaluates proficiency with SQL window functions, date arithmetic, gap-and-island (streak) detection, rolling aggregates, and event-order sequencing to derive DAU, revenue metrics, and inactivity gaps.