Retention and Cohort Metrics
Asked of: Data Scientist
Last updated

-
What it is
Retention measures the share of users who return and perform a key action after first using a product. Cohort metrics group users by a shared “start” (e.g., signup month) and track each group's retention at equal time offsets from that start, which reveals patterns that aggregate numbers hide.
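To make the cohort idea concrete, here is a minimal sketch that builds a monthly cohort retention table using only the standard library. The event log, the user IDs, and the helper names (`month_index`, `cohort_table`) are hypothetical. It assumes the cohort start is the month of a user's first observed event and that every logged event counts as "active".

```python
from collections import defaultdict
from datetime import date

# Hypothetical event log: (user_id, date of a qualifying "active" event).
events = [
    ("u1", date(2024, 1, 5)), ("u1", date(2024, 2, 10)), ("u1", date(2024, 3, 2)),
    ("u2", date(2024, 1, 20)), ("u2", date(2024, 3, 15)),
    ("u3", date(2024, 2, 3)), ("u3", date(2024, 3, 9)),
    ("u4", date(2024, 2, 14)),
]

def month_index(d):
    """Map a date to a running month count so differences are month offsets."""
    return d.year * 12 + d.month - 1

def cohort_table(events):
    """Return {cohort_month_index: {month_offset: share_of_cohort_active}}."""
    # Cohort start = month of each user's first observed event (an assumption).
    first_month = {}
    for user, day in events:
        m = month_index(day)
        first_month[user] = min(first_month.get(user, m), m)

    # Distinct users active at each month offset, per cohort.
    active = defaultdict(lambda: defaultdict(set))
    for user, day in events:
        cohort = first_month[user]
        active[cohort][month_index(day) - cohort].add(user)

    table = {}
    for cohort, by_offset in active.items():
        size = len(by_offset[0])  # every cohort member is active at offset 0
        table[cohort] = {
            offset: len(users) / size
            for offset, users in sorted(by_offset.items())
        }
    return table

table = cohort_table(events)
for cohort in sorted(table):
    label = f"{cohort // 12}-{cohort % 12 + 1:02d}"
    print(label, {k: round(v, 2) for k, v in table[cohort].items()})
```

Each row is one cohort and each key is a month offset from that cohort's start, so rows can be compared at the same age rather than the same calendar month. Younger cohorts have fewer offsets because later months have not been observed yet.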
-
Why interviewers ask about it
Data Scientists at Meta-style companies use retention and cohorts to diagnose onboarding, ranking, and feature changes, and to separate product health from top-line growth. These metrics power experiment readouts, LTV forecasts, and prioritization by showing where and for whom engagement sticks.
-
Core ideas to know
- A cohort groups users by a shared start event (e.g., first session, first purchase) to compare behavior over time.
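- Retention variants: Day-N (active on exactly day N after the start), rolling or unbounded (active on day N or any later day), and windowed/bracketed (active at least once within a range of days, such as days 8 to 14). Always define the “active” event precisely.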
- Cohort tables and retention curves show decay and steady-state flattening; improving early weeks often lifts the whole curve.
- Control confounders: acquisition channel mix, seasonality, and product/version changes; segment cohorts accordingly.
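- Survival analysis (Kaplan–Meier, hazard rates) handles right-censoring: users who have not churned by the end of the observation window still contribute while they are observed, instead of being dropped or treated as churned, which gives less biased time-to-churn estimates.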
- Connect to business: cohort retention drives lifetime value (LTV), acquisition payback, and net revenue retention (NRR) or logo retention; improvements compound across months.
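The survival-analysis idea above can be illustrated with a small Kaplan-Meier estimator. This is a sketch under stated assumptions, not a production implementation: the function name `kaplan_meier` and the sample data are hypothetical, and it presumes a churn date has already been defined for each user (for example via an inactivity threshold). Users still active at the end of the data are marked as right-censored.

```python
def kaplan_meier(durations, churned):
    """Kaplan-Meier survival estimate.

    durations: days each user was observed (until churn or end of data).
    churned:   True if churn was observed, False if right-censored.
    Returns a list of (time, survival_probability), one per observed churn time.
    """
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        churn_count = sum(1 for d, c in zip(durations, churned) if d == t and c)
        leaving = sum(1 for d in durations if d == t)
        if churn_count:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - churn_count / at_risk
            curve.append((t, survival))
        # Churned and censored users both leave the risk set after time t.
        at_risk -= leaving
    return curve

# Hypothetical sample: 8 users, 3 of them still active when the data ends.
durations = [5, 8, 8, 12, 15, 20, 20, 30]
churned = [True, True, False, True, False, True, True, False]

for t, s in kaplan_meier(durations, churned):
    print(f"day {t}: S(t) = {s:.3f}")
```

Censored users shrink the risk set without counting as churn events, which is what separates this from a naive "share churned by day t" calculation. That naive version either discards those users or silently treats them as retained forever.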
-
A common pitfall
Candidates quote a single retention rate without crisp definitions. Interviewers expect clarity on the cohort start event, the active event, and the time window (calendar days vs 24‑hour windows). Mixing acquisition sources or product versions within a cohort, or counting reactivations as retained, skews curves and leads to wrong decisions. Good answers state assumptions, show segmented cohorts, and mention checks for instrumentation gaps and right-censoring.
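The calendar-day versus 24-hour-window distinction is easy to see in a small example. The sketch below uses two hypothetical helpers, `day_n_calendar` and `day_n_24h_window`, and assumes all timestamps are naive datetimes in a single time zone.

```python
from datetime import datetime, timedelta

def day_n_calendar(signup, events, n):
    """Active on the calendar date that falls n days after the signup date."""
    target = signup.date() + timedelta(days=n)
    return any(e.date() == target for e in events)

def day_n_24h_window(signup, events, n):
    """Active in the window [signup + n*24h, signup + (n+1)*24h)."""
    start = signup + timedelta(days=n)
    end = start + timedelta(days=1)
    return any(start <= e < end for e in events)

# A user signs up late in the evening and returns the next morning.
signup = datetime(2024, 3, 1, 23, 30)
events = [datetime(2024, 3, 2, 9, 0)]

print(day_n_calendar(signup, events, 1))    # True: the event is on the next calendar date
print(day_n_24h_window(signup, events, 1))  # False: the event is only 9.5 hours after signup
```

The same user counts as Day-1 retained under one definition and not under the other, so two dashboards can report different Day-1 rates from identical data. Stating which convention is in use, and in which time zone, removes that ambiguity.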
-
Further reading
- Amplitude — What Is Cohort Retention Analysis: Essential Metrics Guide (clear definitions, cohort tables, and common pitfalls in product analytics) https://amplitude.com/explore/analytics/cohort-retention-analysis
- Mixpanel — Ultimate guide to cohort analysis (practical walkthrough of cohort construction, retention curves, and interpretation) https://mixpanel.com/blog/what-is-cohort-analytics/
- Journal of Marketing Analytics — Predictability and explainability of survival analysis in churn prediction (academic grounding for using survival methods on retention/churn) https://link.springer.com/article/10.1057/s41270-025-00450-2