Write SQL for repeat churn
Company: OpenAI
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: hard
Interview Round: Technical Screen
Write a SQL query to measure the performance of a free-month promotion experiment.
Assume `experiment_users` already contains only users who were eligible for the experiment. Do not re-implement eligibility logic.
Tables:
1. `experiment_users`
- `user_id` BIGINT -- primary key
- `variant` VARCHAR -- values: `control`, `free_month`
- `assigned_at` TIMESTAMP -- experiment assignment time in UTC
2. `subscription_events`
- `user_id` BIGINT -- foreign key to `experiment_users.user_id`
- `event_ts` TIMESTAMP -- event time in UTC
- `event_type` VARCHAR -- one of `signup`, `free_trial_start`, `paid_start`, `cancel`, `reactivate`
Business definitions:
- `eligible_users`: distinct users in `experiment_users`.
- `signups_30d`: users whose first `signup` after `assigned_at` occurs within 30 days of assignment.
- `paid_converted_60d`: users whose first `paid_start` after `assigned_at` occurs within 60 days of assignment.
- `churned_90d_after_first_paid`: among users with a first `paid_start`, count users who have at least one `cancel` event within 90 days after that first `paid_start`.
- A user may cancel, reactivate, and cancel again. Count the user at most once, using the first `cancel` after the first `paid_start`. Ignore `cancel` events that happen before the first `paid_start`.
- Use only each user's first qualifying `signup` and first qualifying `paid_start` after assignment.
Return one row per `variant` with these columns:
- `variant`
- `eligible_users`
- `signups_30d`
- `paid_converted_60d`
- `churned_90d_after_first_paid`
- `churn_rate_90d_after_first_paid` where the denominator is `paid_converted_60d`.
Quick Answer: This question evaluates proficiency in event-time analytics and cohort churn metric computation using SQL, focusing on correct identification of first qualifying events and aggregation by experimental variant.