Task: Estimate ATT on 7-Day Retention Using Propensity Score Matching (PSM)
Context
You are given observational, user-level product data where users self-select into receiving a new feature. The goal is to estimate the Average Treatment Effect on the Treated (ATT) for 7-day retention.
- Treatment (T): user received/used the feature during a defined assignment window (e.g., first 24 hours after signup or feature rollout).
- Outcome (Y7): binary indicator of 7-day retention (active at least once during days 1–7 after cohort start; measured after treatment assignment to avoid immortal-time bias).
- Covariates (X): pre-treatment user attributes and behaviors (e.g., country, device, signup channel, pre-period engagement, tenure, predicted activity score). All covariates are assumed to be measured before treatment assignment.
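For reference, the estimand defined by T, Y7, and X above can be written in potential-outcomes notation. The symbols Y_7(1), Y_7(0), and e(X) are notational assumptions introduced here for illustration; they do not appear in the task statement.

```latex
\tau_{\mathrm{ATT}}
  = \mathbb{E}\bigl[\,Y_7(1) - Y_7(0) \mid T = 1\,\bigr]
  = \mathbb{E}\bigl[\,Y_7 \mid T = 1\,\bigr] - \mathbb{E}\bigl[\,Y_7(0) \mid T = 1\,\bigr]

% Standard identifying assumptions for the ATT under matching:
%   unconfoundedness for controls:  Y_7(0) \perp T \mid X
%   overlap for the treated:        e(X) = \Pr(T = 1 \mid X) < 1
```

Only the second term is unobserved; matching estimates it from control users whose propensity scores are close to those of the treated users.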
State clearly how you will handle:
- Propensity modeling choices and rationale.
- Matching strategy and tuning.
- Balance diagnostics and remediation.
- Lack of overlap checks and remedies.
- Variance estimation.
- Rosenbaum sensitivity analysis and interpretation.
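As background for the last item, the sensitivity parameter Γ is conventionally defined as a bound on how much two users with identical observed covariates may differ in their odds of treatment because of an unobserved factor. The symbols π_i and U_i below are notational assumptions introduced here for illustration.

```latex
% For any two users i, j with X_i = X_j, where
% \pi_i = \Pr(T_i = 1 \mid X_i, U_i) and U_i is unobserved:
\frac{1}{\Gamma} \;\le\; \frac{\pi_i\,(1 - \pi_j)}{\pi_j\,(1 - \pi_i)} \;\le\; \Gamma,
\qquad \Gamma \ge 1

% \Gamma = 1 corresponds to no hidden bias. For matched pairs with a binary
% outcome and no true effect, the probability that the treated user is the
% retained one in a discordant pair is at most
p^{+} = \frac{\Gamma}{1 + \Gamma}
```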
Deliverables
Provide the following:
(a) Propensity model specification (logistic vs gradient boosting) and rationale.
(b) Matching strategy (1:1 nearest neighbor with/without replacement), caliper definition and computation, and how you would tune these choices.
(c) Formal balance diagnostics (standardized mean differences, variance ratios, KS tests), thresholds (e.g., SMD < 0.1), and what you do when balance fails (re-weighting/rematching).
(d) Detection of lack of overlap and remedies (e.g., trimming/restriction to common support).
(e) Variance estimation approach (Abadie–Imbens vs bootstrap) and when each is valid.
(f) Rosenbaum sensitivity analysis: report the Γ (Gamma) at which conclusions would flip and how to communicate that interpretation to a PM.
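As an illustration of items (b) and (c), the following is a minimal sketch, not a reference solution. It assumes propensity scores have already been fitted by some model, uses greedy 1:1 nearest-neighbor matching without replacement on the logit of the score, and uses the commonly cited caliper of 0.2 standard deviations of the logit score. All function names, the synthetic data, and the pooled-SD form of the SMD are assumptions made for this example.

```python
import numpy as np

def logit(p, eps=1e-6):
    """Log-odds of a propensity score, clipped away from 0 and 1."""
    p = np.clip(np.asarray(p, dtype=float), eps, 1 - eps)
    return np.log(p / (1 - p))

def caliper_width(ps, multiplier=0.2):
    """Caliper = multiplier * SD of the logit propensity score (whole sample)."""
    return multiplier * np.std(logit(ps), ddof=1)

def match_nn_1to1(ps, t, caliper):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    Treated users with no available control inside the caliper are dropped,
    which changes the estimand to the ATT among matched treated users.
    """
    lp = logit(ps)
    t = np.asarray(t).astype(bool)
    treated = np.flatnonzero(t)
    controls = np.flatnonzero(~t)
    available = np.ones(controls.size, dtype=bool)
    pairs = []
    # Process treated users from highest score down (hardest to match first).
    for i in treated[np.argsort(-lp[treated])]:
        if not available.any():
            break
        dist = np.abs(lp[controls] - lp[i])
        dist[~available] = np.inf
        j = int(np.argmin(dist))
        if dist[j] <= caliper:
            pairs.append((i, controls[j]))
            available[j] = False
    if not pairs:
        return np.array([], dtype=int), np.array([], dtype=int)
    ti, ci = map(np.array, zip(*pairs))
    return ti, ci

def smd(x_t, x_c):
    """Standardized mean difference using the average-variance pooled SD.
    (Using the treated-group SD instead is another common choice for ATT.)"""
    x_t, x_c = np.asarray(x_t, float), np.asarray(x_c, float)
    pooled = np.sqrt((x_t.var(ddof=1) + x_c.var(ddof=1)) / 2)
    return (x_t.mean() - x_c.mean()) / pooled

def att(y, ti, ci):
    """Mean within-pair outcome difference (treated minus matched control)."""
    y = np.asarray(y, float)
    return float(np.mean(y[ti] - y[ci]))

# Synthetic demo data (hypothetical): one covariate drives both T and Y7.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
ps = 1 / (1 + np.exp(-(x - 1.0)))  # stand-in for a fitted propensity model
t = rng.random(n) < ps
y = (rng.random(n) < 0.3 + 0.1 * t + 0.1 * (x > 0)).astype(int)

caliper = caliper_width(ps)
ti, ci = match_nn_1to1(ps, t, caliper)
smd_before = smd(x[t], x[~t])
smd_after = smd(x[ti], x[ci])
att_hat = att(y, ti, ci)
print(f"matched pairs: {len(ti)} of {t.sum()} treated")
print(f"SMD before: {smd_before:.3f}, after: {smd_after:.3f}")
print(f"ATT estimate on matched sample: {att_hat:.3f}")
```

Comparing the SMD before and after matching against the 0.1 threshold is the check described in (c). Reporting how many treated users were dropped by the caliper ties into the overlap discussion in (d).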