Monthly 30-Day Retention Cohort (PostgreSQL)
Context
You are given a table of companies with signup and (optional) termination dates. Define monthly cohorts by the signup month and compute 30-day retention per cohort. Use a fixed as_of_date to handle right-censoring.
Retention rule: a company is retained at day 30 if terminationdate is NULL or terminationdate ≥ signupdate + 30 days. Churn strictly before day 30 is not retained; churn on day 30 counts as retained.
Right-censoring: exclude any signup whose 30-day window has not fully elapsed by as_of_date (i.e., where signupdate + 30 days > as_of_date).
Do not exclude Free_Subs from the base metric; if you want paid-only retention, add an additional column that excludes any companyid present in Free_Subs.
Schema and Sample Data
Table: company
-
companyid (INT)
-
signupdate (DATE)
-
subscriptiondate (DATE)
-
terminationdate (DATE)
Sample rows:
-
(101, 2019-06-05, 2019-06-06, 2019-06-25) — churn < 30d (not retained)
-
(102, 2019-06-10, 2019-06-15, NULL) — retained
-
(103, 2019-06-20, NULL, NULL) — retained (no termination)
-
(104, 2019-07-01, 2019-07-02, 2019-07-31) — churn on day 30 (retained)
-
(105, 2019-07-15, 2019-07-16, 2019-07-10) — termination before signup (data quality)
-
(106, 2025-08-20, 2025-08-21, NULL) — censored as of 2025-09-01; exclude from 30d calculation
-
(107, 2020-01-01, 2020-01-02, 2020-02-15) — retained
-
(108, 2020-01-10, NULL, 2020-01-25) — churn < 30d (not retained)
Optional table for paid-only metric:
Task
Produce a monthly cohort table of 30-day retention based on signupdate.
-
cohort_month = date_trunc('month', signupdate)
-
cohort_size = number of companies in the cohort (after right-censoring and any data-quality handling)
-
d30_retained = number of companies retained at day 30
-
d30_retention_rate = d30_retained / cohort_size (3 decimals)
Parameters and rules:
-
A company is retained at day 30 if terminationdate is NULL or terminationdate ≥ signupdate + INTERVAL '30 days'.
-
Handle right-censoring: exclude signups where signupdate + INTERVAL '30 days' > as_of_date. Use as_of_date = '2025-09-01'.
-
Return columns:
-
cohort_month (DATE, first day of month)
-
cohort_size (INT)
-
d30_retained (INT)
-
d30_retention_rate (NUMERIC, 3 decimals)
-
Do not exclude Free_Subs in the base metric. Optionally add a paid-only column paid_d30_retention_rate that excludes companyids present in Free_Subs.
-
In comments, explain how to handle data-quality anomalies (e.g., terminationdate < signupdate).