Optimize Churn Prediction: Feature Engineering and Model Selection
Weekly Churn Prediction (10M users): Feature Engineering, Model Choice, Explainability, and Debugging
Scenario
You own a weekly churn-prediction pipeline that trains on 10 million users. The goal is to predict who will churn so the business can target retention interventions.
Tasks
- Feature Engineering
  - Define the label, observation/prediction windows, and leakage controls.
  - Propose key feature families and how to handle imbalance.
- Model Selection and Hyper-parameter Tuning
  - Describe the model development process, evaluation, and tuning strategy at this scale.
- Model Choice Rationale
  - Why might you favor Gradient Boosted Trees (GBTs) over Logistic Regression (LR) here?
- Explainability
  - Describe two techniques for explaining model outputs to non-technical stakeholders.
- Production Debugging
  - If recall drops by 15% week-over-week, provide a step-by-step debugging checklist.
Hints: Discuss imbalance handling, SHAP, feature drift, and offline/online parity checks.
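To make the first task concrete, here is a minimal sketch of one possible label and window definition. Everything in it is an assumption rather than part of the prompt: the `build_labels` helper, the `user_id` and `ts` column names, the 28-day observation and prediction windows, and the definition of churn as "no activity in the prediction window". It also derives the negative-to-positive ratio, which is one common input for class weighting when handling imbalance.

```python
import pandas as pd


def build_labels(events: pd.DataFrame, cutoff: pd.Timestamp,
                 obs_days: int = 28, pred_days: int = 28):
    """Return (observation-window events, per-user churn labels) for one cutoff.

    Hypothetical helper: column names and window lengths are assumptions.
    """
    obs_start = cutoff - pd.Timedelta(days=obs_days)
    pred_end = cutoff + pd.Timedelta(days=pred_days)

    # Leakage control: features may only use events strictly before the cutoff.
    obs = events[(events["ts"] >= obs_start) & (events["ts"] < cutoff)]
    # Outcome window: activity from the cutoff up to (not including) pred_end.
    future = events[(events["ts"] >= cutoff) & (events["ts"] < pred_end)]

    # Population: users who were active during the observation window.
    users = pd.Index(obs["user_id"].unique(), name="user_id")
    # Assumed label: churned = no activity at all in the prediction window.
    churned = ~users.isin(future["user_id"])
    labels = pd.Series(churned.astype(int), index=users, name="churned")
    return obs, labels


# Tiny illustrative event log (made-up data).
events = pd.DataFrame({
    "user_id": [1, 1, 2, 3, 3],
    "ts": pd.to_datetime([
        "2024-01-05", "2024-02-10", "2024-01-20", "2024-01-25", "2024-03-15",
    ]),
})
cutoff = pd.Timestamp("2024-02-01")
obs, labels = build_labels(events, cutoff)

# Imbalance summary: the negative-to-positive ratio can seed a class weight.
pos_rate = labels.mean()
neg_to_pos = (1 - pos_rate) / pos_rate
print(labels.to_dict(), round(neg_to_pos, 3))
```

Keeping the cutoff as an explicit argument means the same function can be reused to build several weekly snapshots for time-based validation, and the strict `< cutoff` filter is the single place where the leakage boundary is enforced.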
Constraints & Assumptions
- Preserve the scope, facts, inputs, and requested outputs from the prompt above.
- If the prompt leaves a detail unspecified, state a reasonable assumption before relying on it.
- Keep the answer interview-ready: concise enough to present, but concrete enough to implement or evaluate.
Clarifying Questions to Ask
- Clarify the task, data shape, labels, constraints, and evaluation metric.
- State assumptions behind the math or modeling technique you choose.
- Connect theory to practical training, debugging, and deployment implications.
What a Strong Answer Covers
- Correct definitions and formulas where the prompt requires them.
- A practical explanation of how the method behaves on real data.
- Trade-offs, failure modes, diagnostics, and mitigation strategies.
- Evaluation choices that match the product or modeling objective.
Follow-up Questions
- How would noisy labels, class imbalance, or distribution shift affect the answer?
- What would you monitor after deployment?
- Which baseline would you compare against first?
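For the distribution-shift and monitoring follow-ups, one commonly used drift statistic is the Population Stability Index (PSI). The sketch below is an illustration under stated assumptions, not a prescribed method: the `psi` helper name, the choice of 10 quantile bins taken from the reference week, and the smoothing constant `eps` are all choices made here for the example, and the data is synthetic.

```python
import numpy as np


def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index of `actual` against a reference `expected`.

    Hypothetical helper: bin count and eps smoothing are assumptions.
    """
    expected = np.asarray(expected, dtype=float)
    actual = np.asarray(actual, dtype=float)

    # Bin edges come from quantiles of the reference sample only.
    edges = np.unique(np.quantile(expected, np.linspace(0, 1, bins + 1)))
    inner = edges[1:-1]  # interior cut points; outliers fall in the end bins
    n_bins = max(len(edges) - 1, 1)

    e_idx = np.searchsorted(inner, expected, side="right")
    a_idx = np.searchsorted(inner, actual, side="right")
    e_frac = np.bincount(e_idx, minlength=n_bins) / len(expected)
    a_frac = np.bincount(a_idx, minlength=n_bins) / len(actual)

    # Smooth empty bins so the log ratio stays finite.
    e_frac = np.clip(e_frac, eps, None)
    a_frac = np.clip(a_frac, eps, None)
    return float(np.sum((a_frac - e_frac) * np.log(a_frac / e_frac)))


# Synthetic example: last week's feature values vs. two candidate "this week" samples.
rng = np.random.default_rng(0)
reference = rng.normal(0.0, 1.0, 10_000)
stable = rng.normal(0.0, 1.0, 10_000)
shifted = rng.normal(1.0, 1.0, 10_000)

psi_stable = psi(reference, stable)
psi_shifted = psi(reference, shifted)
print(f"stable: {psi_stable:.4f}, shifted: {psi_shifted:.4f}")
```

A frequently quoted rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift, but these cut-offs are conventions rather than guarantees and are worth calibrating against the pipeline's own week-to-week history. Running the same statistic on the model's score distribution, in addition to individual features, is one way to tie drift back to a recall change.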