Task: Monthly Cohorted Retention (30-day, then 30/60/90-day curve)
You are given a single table, company, with columns:
-
company_id (INT)
-
signup_date (DATE)
-
subscription_date (DATE, nullable)
-
termination_date (DATE, nullable)
Compute monthly cohorted retention based on signup_date. Define cohort_month = DATE_TRUNC('month', signup_date).
Assume data_cutoff_date is a parameter representing the latest reliable observation date. If not provided, default it to the maximum observed date across signup_date, termination_date, subscription_date:
-
data_cutoff_date = GREATEST(MAX(signup_date), MAX(termination_date), MAX(subscription_date))
Right-censoring rule: include only cohorts for which the retention window is fully observable at the cohort level. Specifically, exclude any cohort where cohort_month + INTERVAL '30 days' > data_cutoff_date.
A company is considered retained at day D if:
-
termination_date IS NULL, OR
-
termination_date > signup_date + INTERVAL 'D days'
Edge cases to handle:
-
termination_date exactly equals signup_date + 30 days is NOT retained.
-
NULL termination_date implies retained at any threshold (subject to censoring rule).
-
Companies without subscription_date are still included in cohorts and evaluated for retention.
-
Present rates as retained_xxd / cohort_size with safe division and 0.0 when cohort_size = 0.
Return for the 30-day retention:
-
cohort_month (DATE)
-
cohort_size (INT)
-
retained_30d (INT)
-
retention_30d_rate (DECIMAL, 4 decimals)
Then generalize to a monthly retention curve by adding 60 and 90 days using a generate_series of thresholds (30, 60, 90) and pivoting/widening the results into:
-
retained_30d, retained_60d, retained_90d (INT)
-
retention_30d_rate, retention_60d_rate, retention_90d_rate (DECIMAL, 4 decimals)
Table (sample):
company
+------------+-------------+-------------------+------------------+
| company_id | signup_date | subscription_date | termination_date |
+------------+-------------+-------------------+------------------+
| 1 | 2019-05-20 | 2019-06-02 | 2020-01-10 |
| 2 | 2019-06-15 | 2019-06-20 | NULL |
| 3 | 2019-07-01 | 2019-07-05 | 2019-12-31 |
| 4 | 2020-06-10 | 2020-06-11 | NULL |
| 5 | 2020-06-30 | 2020-07-02 | NULL |
| 6 | 2020-07-15 | NULL | NULL |
+------------+-------------+-------------------+------------------+