Weekly Churn Prediction (10M users): Feature Engineering, Model Choice, Explainability, and Debugging
Scenario
You own a weekly churn-prediction pipeline that trains on 10 million users. The goal is to predict who will churn so the business can target retention interventions.
Tasks
- Feature Engineering
  - Define the label, observation/prediction windows, and leakage controls.
  - Propose key feature families and how to handle imbalance.
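The windowing and leakage rules above can be sketched in code. This is a minimal illustration, not the pipeline itself: the 28-day observation window, 7-day prediction window, and the `build_example` helper are all assumptions chosen for the sketch. The key leakage control is that features may only read events at or before the cutoff date, while the label looks strictly after it.

```python
from datetime import date, timedelta

OBS_DAYS = 28   # observation window for features (assumed value)
PRED_DAYS = 7   # prediction window for the churn label (assumed value)

def build_example(events, cutoff):
    """Build one (features, label) pair for a single user.

    events: sorted list of activity dates for the user.
    cutoff: the as-of date. Features may only use events <= cutoff
    (leakage control); the label looks strictly after it.
    """
    obs_start = cutoff - timedelta(days=OBS_DAYS - 1)
    obs = [d for d in events if obs_start <= d <= cutoff]
    future = [d for d in events
              if cutoff < d <= cutoff + timedelta(days=PRED_DAYS)]

    features = {
        "active_days_28d": len(set(obs)),
        "days_since_last_event": (cutoff - max(obs)).days if obs else OBS_DAYS,
    }
    label = int(len(future) == 0)  # churn = no activity in the prediction window
    return features, label
```

Training examples are then generated at many historical cutoffs, so each weekly snapshot contributes rows whose features and labels never overlap in time.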
- Model Selection and Hyper-parameter Tuning
  - Describe the model development process, evaluation, and tuning strategy at this scale.
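At 10M users with rare positives, a common tactic is to downsample the negative class for tractable training and tuning, while reweighting kept negatives so the loss (and predicted probabilities) remain unbiased. A minimal sketch, assuming a `keep_rate` hyper-parameter and in-memory lists (a real pipeline would do this in Spark or similar):

```python
import random

def downsample_negatives(rows, labels, keep_rate, seed=0):
    """Keep all positives; keep each negative with probability keep_rate,
    attaching weight 1/keep_rate so the weighted class ratio is preserved."""
    rng = random.Random(seed)
    out_rows, out_labels, out_weights = [], [], []
    for x, y in zip(rows, labels):
        if y == 1:
            out_rows.append(x); out_labels.append(y); out_weights.append(1.0)
        elif rng.random() < keep_rate:
            out_rows.append(x); out_labels.append(y)
            out_weights.append(1.0 / keep_rate)
    return out_rows, out_labels, out_weights
```

The weights feed into the trainer's `sample_weight` argument; hyper-parameter search can then run on the smaller sample with a strictly time-based train/validation split, with the final configuration retrained on the full week's data.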
- Model Choice Rationale
  - Why might you favor Gradient Boosted Trees (GBTs) over Logistic Regression (LR) here?
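One core part of the answer is feature interactions: an LR score is a single linear combination, so interactions like "new user AND low engagement" must be hand-engineered, while tree splits capture them automatically. A toy illustration on XOR-shaped data (the brute-force `linear_separable` helper is purely for demonstration, not a real training procedure):

```python
def linear_separable(points):
    """Brute-force a coarse grid of (w1, w2, b); return True if any
    linear rule w1*x1 + w2*x2 + b > 0 classifies every point correctly."""
    grid = [x / 4 for x in range(-8, 9)]
    for w1 in grid:
        for w2 in grid:
            for b in grid:
                if all((w1 * x1 + w2 * x2 + b > 0) == (y == 1)
                       for (x1, x2), y in points):
                    return True
    return False

# XOR-shaped interaction: label depends jointly on both features.
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def depth2_tree_rule(x1, x2):
    """A depth-2 tree (split on x1, then x2) represents the interaction."""
    return int(x1 != x2)
```

GBTs also handle mixed feature scales, monotone transforms, and missing values with little preprocessing, which matters when the feature set spans counts, ratios, and recency features.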
- Explainability
  - Describe two techniques for explaining model outputs to non-technical stakeholders.
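SHAP is the standard choice here: it decomposes one user's score into per-feature contributions that sum to the prediction, which translates directly into stakeholder-friendly "reason codes." In production one would use the `shap` library (e.g. its tree explainer); as an illustration of what those numbers mean, here is exact Shapley attribution for a tiny model, using the common convention of replacing "missing" features with baseline values:

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley attributions for one example: each feature's average
    marginal contribution over all subsets of the other features."""
    n = len(x)
    phi = [0.0] * n
    feats = list(range(n))

    def value(subset):
        # Evaluate the model with features outside `subset` set to baseline.
        z = [x[i] if i in subset else baseline[i] for i in feats]
        return predict(z)

    for i in feats:
        others = [j for j in feats if j != i]
        for k in range(len(others) + 1):
            for s in combinations(others, k):
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi[i] += w * (value(set(s) | {i}) - value(set(s)))
    return phi
```

The attributions satisfy the efficiency property (they sum to `predict(x) - predict(baseline)`), which is what lets you say "this user's churn score is 0.3 above average, of which +0.2 comes from inactivity."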
- Production Debugging
  - If recall drops by 15% week-over-week, provide a step-by-step debugging checklist.
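A central step in any such checklist is quantifying feature drift between the week the model was trained on and the week it is scoring. One standard metric is the Population Stability Index (PSI); a minimal sketch, assuming equal-width bins fit on the reference week and the common rule of thumb that PSI above roughly 0.2 flags drift worth investigating:

```python
import math

def psi(expected, actual, bins=10):
    """Population Stability Index between two samples of one feature.
    Bins are fit on `expected` (the training/reference week)."""
    lo, hi = min(expected), max(expected)
    edges = [lo + (hi - lo) * i / bins for i in range(bins + 1)]
    edges[-1] = float("inf")  # catch values above the reference max

    def frac(sample):
        counts = [0] * bins
        for v in sample:
            for i in range(bins):
                if edges[i] <= v < edges[i + 1]:
                    counts[i] += 1
                    break
        n = len(sample)
        return [max(c / n, 1e-6) for c in counts]  # floor avoids log(0)

    e, a = frac(expected), frac(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

Running this per feature and ranking by PSI quickly narrows a week-over-week recall drop to the drifting inputs, after which upstream data sources and offline/online feature parity can be checked for those features specifically.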
Hints: discuss imbalance handling, SHAP, feature drift, and offline/online parity checks.