This question evaluates a candidate's ability to perform data manipulation and time-series aggregation to compute monthly signups, subscription conversion rates, and year-over-year growth from user-level tables using SQL or Python (Data Manipulation (SQL/Python)).
You work at a subscription company and are given a user-level table.
company_users
id
(INT, PK) — user/customer id
signup_date
(DATE) — date the user signed up (created an account)
subscription_date
(DATE, NULL) — date the user first started a subscription (NULL if never subscribed)
termination_date
(DATE, NULL) — date the subscription ended (NULL if still active)
subscription_type
(VARCHAR) —
'free'
or
'paid'
(assume this reflects the first subscription started)
Assumptions:
subscription_date IS NOT NULL
.
signup_date
month), return:
month
(e.g.,
2017-01-01
for Jan 2017)
signups
(count of users who signed up that month)
subscribers
(count of those signups who ever subscribed)
conversion_rate
=
subscribers / signups
yoy_subscription_rate_growth
=
subscription_type='free'
). Recompute (2) but
only for paid subscriptions
(i.e., treat a user as “subscribed” only if
subscription_type='paid'
).
A monthly table with columns:
month
,
signups
,
subscribers
,
conversion_rate
,
yoy_subscription_rate_growth
(And a paid-only version for task #3.)