Design Schema for Accurate Subscription State Tracking
Company: OpenAI
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: Medium
Interview Round: Take-home Project
subscription_events
+----------+---------------------+------------+------------+
| user_id  | event_ts            | event_type | plan_type  |
+----------+---------------------+------------+------------+
| 1001     | 2024-04-01 08:20:00 | signup     | free       |
| 1001     | 2024-04-15 10:05:00 | cancel     | free       |
| 1001     | 2024-04-20 11:30:00 | signup     | paid_trial |
| 1002     | 2024-04-02 09:00:00 | signup     | free       |
| 1003     | 2024-04-03 12:10:00 | signup     | paid_trial |
+----------+---------------------+------------+------------+
##### Scenario
Design the raw event table that feeds the experiment metrics, handling edge cases where a user can sign up, cancel, and then sign up again.
##### Question
Propose an event-level schema that supports accurate daily subscription status reconstruction. Write SQL or Python logic that derives each user’s subscription state for any date, correctly accounting for multiple signup-cancel cycles.
##### Hints
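Treat the table as an append-only event log (event sourcing) and reconstruct state with a window function or aggregation: the last event at or before the snapshot time determines the user's state.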
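The "last event before snapshot" rule can be sketched in plain Python as follows. This is a minimal illustration, not a reference solution. It assumes a `signup` event makes the user `active`, a `cancel` event makes them `cancelled`, a user with no events yet is `never_subscribed`, and a daily snapshot is taken at the end of the calendar day. The helper names `build_timelines` and `state_on` and the status labels are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict
from datetime import date, datetime, time

# Sample rows copied from the subscription_events table:
# (user_id, event_ts, event_type, plan_type)
EVENTS = [
    (1001, datetime(2024, 4, 1, 8, 20), "signup", "free"),
    (1001, datetime(2024, 4, 15, 10, 5), "cancel", "free"),
    (1001, datetime(2024, 4, 20, 11, 30), "signup", "paid_trial"),
    (1002, datetime(2024, 4, 2, 9, 0), "signup", "free"),
    (1003, datetime(2024, 4, 3, 12, 10), "signup", "paid_trial"),
]


def build_timelines(events):
    """Group events by user, each user's list sorted by timestamp."""
    timelines = defaultdict(list)
    for user_id, ts, event_type, plan_type in sorted(
        events, key=lambda e: (e[0], e[1])
    ):
        timelines[user_id].append((ts, event_type, plan_type))
    return timelines


def state_on(timelines, user_id, snapshot_date):
    """Return (status, plan_type) as of the end of snapshot_date.

    Assumed mapping: last event 'signup' -> 'active',
    last event 'cancel' -> 'cancelled', no event yet -> 'never_subscribed'.
    """
    cutoff = datetime.combine(snapshot_date, time.max)
    timeline = timelines.get(user_id, [])
    timestamps = [ts for ts, _, _ in timeline]
    # Index of the first event strictly after the cutoff.
    idx = bisect_right(timestamps, cutoff)
    if idx == 0:
        return ("never_subscribed", None)
    _, event_type, plan_type = timeline[idx - 1]
    status = "active" if event_type == "signup" else "cancelled"
    return (status, plan_type)


TIMELINES = build_timelines(EVENTS)
```

For user 1001, `state_on(TIMELINES, 1001, date(2024, 4, 17))` falls between the cancel and the second signup, so only the cancel event is consulted. Earlier cycles never need to be replayed, which is why repeated signup-cancel cycles do not complicate the lookup.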
Quick Answer: This question evaluates proficiency in temporal schema design and event-level data modeling, focusing on reconstructing time-varying subscription state using SQL and Python in the Data Manipulation (SQL/Python) domain.
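One possible shape for the schema and the snapshot query is sketched below, using Python's built-in `sqlite3` module so it runs without a database server. Several details are assumptions rather than part of the question: the surrogate `event_id` column (used to break timestamp ties), storing `event_ts` as ISO-formatted text in a single time zone, the `CHECK` constraint on `event_type`, and the `active` / `cancelled` labels. The window function requires SQLite 3.25 or newer.

```python
import sqlite3

# Assumed schema: append-only event log, one row per signup or cancel.
DDL = """
CREATE TABLE subscription_events (
    event_id   INTEGER PRIMARY KEY,  -- assumed surrogate key; breaks timestamp ties
    user_id    INTEGER NOT NULL,
    event_ts   TEXT    NOT NULL,     -- 'YYYY-MM-DD HH:MM:SS', assumed single time zone
    event_type TEXT    NOT NULL CHECK (event_type IN ('signup', 'cancel')),
    plan_type  TEXT    NOT NULL
);
CREATE INDEX idx_events_user_ts ON subscription_events (user_id, event_ts);
"""

# Rank each user's events up to the end of the snapshot day, newest first,
# and keep only the newest one.
SNAPSHOT_SQL = """
WITH ranked AS (
    SELECT user_id, event_type, plan_type,
           ROW_NUMBER() OVER (
               PARTITION BY user_id
               ORDER BY event_ts DESC, event_id DESC
           ) AS rn
    FROM subscription_events
    WHERE event_ts < date(:snapshot_date, '+1 day')
)
SELECT user_id,
       CASE event_type WHEN 'signup' THEN 'active' ELSE 'cancelled' END AS status,
       plan_type
FROM ranked
WHERE rn = 1
ORDER BY user_id
"""

# Sample rows copied from the subscription_events table.
ROWS = [
    (1001, "2024-04-01 08:20:00", "signup", "free"),
    (1001, "2024-04-15 10:05:00", "cancel", "free"),
    (1001, "2024-04-20 11:30:00", "signup", "paid_trial"),
    (1002, "2024-04-02 09:00:00", "signup", "free"),
    (1003, "2024-04-03 12:10:00", "signup", "paid_trial"),
]


def demo_connection():
    """Create an in-memory database loaded with the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DDL)
    conn.executemany(
        "INSERT INTO subscription_events (user_id, event_ts, event_type, plan_type) "
        "VALUES (?, ?, ?, ?)",
        ROWS,
    )
    return conn


def snapshot(conn, snapshot_date):
    """Return [(user_id, status, plan_type)] as of the end of snapshot_date."""
    return conn.execute(SNAPSHOT_SQL, {"snapshot_date": snapshot_date}).fetchall()
```

Users with no event on or before the snapshot date simply do not appear in the result. Joining against a user dimension table would be one way to report them explicitly.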