Explore Subscription Patterns and Status Transitions with SQL/Pandas
Company: Amazon
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: Medium
Interview Round: Technical Screen
subscriptions
+-----------------+----------+-------------+
| subscription_id | status   | status_date |
+-----------------+----------+-------------+
| 101             | active   | 2023-01-05  |
| 101             | inactive | 2023-03-10  |
| 102             | inactive | 2023-02-12  |
| 102             | active   | 2023-04-01  |
+-----------------+----------+-------------+
##### Scenario
Subscription analytics: the product team wants to understand when customers become active and when they churn.
##### Question
Write a SQL query that explores column values and row patterns to confirm or refute assumptions about the structure of the subscriptions table (e.g., uniqueness of subscription_id + status_date, allowed status transitions). Then, in Python (pandas), build a DataFrame that returns, for every subscription_id, the first date on which its status was active and the last date on which its status was inactive.
##### Hints
For SQL, think window functions (e.g., LAG to compare each row's status with the previous one for the same subscription_id); in pandas, use groupby with idxmin / idxmax or boolean masks.
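One possible way to approach the SQL exploration is sketched below. It loads the four sample rows into an in-memory SQLite database so the queries can be run end to end; the choice of SQLite, the variable names (`status_counts`, `duplicate_keys`, `transition_counts`), and the three specific checks are assumptions made for illustration, not a prescribed answer. Window functions require SQLite 3.25 or newer, and the same queries should carry over to other engines with little or no change.

```python
import sqlite3

# Sample rows copied from the subscriptions table in the prompt.
rows = [
    (101, "active", "2023-01-05"),
    (101, "inactive", "2023-03-10"),
    (102, "inactive", "2023-02-12"),
    (102, "active", "2023-04-01"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE subscriptions ("
    "subscription_id INTEGER, status TEXT, status_date TEXT)"
)
conn.executemany("INSERT INTO subscriptions VALUES (?, ?, ?)", rows)

# Check 1: which status values actually occur, and how often?
status_counts = conn.execute(
    """
    SELECT status, COUNT(*) AS n
    FROM subscriptions
    GROUP BY status
    ORDER BY status
    """
).fetchall()

# Check 2: is (subscription_id, status_date) unique?
# Any row returned here is a duplicated key.
duplicate_keys = conn.execute(
    """
    SELECT subscription_id, status_date, COUNT(*) AS n
    FROM subscriptions
    GROUP BY subscription_id, status_date
    HAVING COUNT(*) > 1
    """
).fetchall()

# Check 3: which status transitions occur? LAG pairs each row with the
# previous status of the same subscription, ordered by date.
transition_counts = conn.execute(
    """
    WITH ordered AS (
        SELECT subscription_id,
               status,
               LAG(status) OVER (
                   PARTITION BY subscription_id
                   ORDER BY status_date
               ) AS prev_status
        FROM subscriptions
    )
    SELECT prev_status, status, COUNT(*) AS n
    FROM ordered
    WHERE prev_status IS NOT NULL
    GROUP BY prev_status, status
    ORDER BY prev_status, status
    """
).fetchall()

print(status_counts)
print(duplicate_keys)
print(transition_counts)
```

On real data, a self-transition such as active to active in the third result would suggest repeated status rows, which is worth raising with the interviewer before writing the pandas part.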
Quick Answer: This question evaluates proficiency in time-series and event-sequence data manipulation, temporal aggregation, and data quality validation using SQL and pandas, focusing on identifying status transitions and date-based uniqueness.
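A minimal pandas sketch of the second task follows, using the boolean-mask approach from the hints. The output column names `first_active_date` and `last_inactive_date` are assumptions chosen for illustration, and the sample rows are rebuilt by hand rather than read from a database. Subscriptions that were never active (or never inactive) are kept in the result with a missing date, which is one reasonable reading of "for every subscription_id".

```python
import pandas as pd

# Sample rows copied from the subscriptions table in the prompt.
df = pd.DataFrame(
    {
        "subscription_id": [101, 101, 102, 102],
        "status": ["active", "inactive", "inactive", "active"],
        "status_date": pd.to_datetime(
            ["2023-01-05", "2023-03-10", "2023-02-12", "2023-04-01"]
        ),
    }
)

# Earliest date on which each subscription had status "active".
first_active = (
    df[df["status"] == "active"]
    .groupby("subscription_id")["status_date"]
    .min()
    .rename("first_active_date")
)

# Latest date on which each subscription had status "inactive".
last_inactive = (
    df[df["status"] == "inactive"]
    .groupby("subscription_id")["status_date"]
    .max()
    .rename("last_inactive_date")
)

# Reindex on every subscription_id so that ids missing one of the two
# statuses still appear, with NaT in the corresponding column.
all_ids = pd.Index(df["subscription_id"].unique(), name="subscription_id")
result = (
    pd.concat([first_active, last_inactive], axis=1)
    .reindex(all_ids)
    .reset_index()
)

print(result)
```

Filtering before grouping keeps each aggregation simple; an alternative is a single `groupby` with `idxmin` / `idxmax` on masked dates, which also lets you pull other columns from the matching rows.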