Predicting Purchase Propensity After a Campaign (5% Positives)
You previously ran a marketing campaign to 10,000 customers and observed 500 purchases (5% positive rate). You now want to build a model to score customers for the next campaign so you can target those most likely to purchase under a fixed budget.
Design an end-to-end approach that includes:
- Baseline Model and Features
  - Start with logistic regression. Describe:
    - Feature engineering: numeric handling, categorical encoding, scaling, missing values, interactions.
    - Data splitting strategy (temporal/stratified), pipelines, and prevention of leakage.
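As one illustration of leakage prevention, the sketch below splits by date and learns the imputation and scaling statistics from the training rows only. The column names (`date`, `spend`), the cutoff, and the four rows are hypothetical, and the standard library is used only to keep the example self-contained; a library pipeline would normally handle this.

```python
from statistics import mean, pstdev

def temporal_split(rows, cutoff):
    """Split rows into (train, test) by comparing each ISO date to the cutoff."""
    train = [r for r in rows if r["date"] < cutoff]
    test = [r for r in rows if r["date"] >= cutoff]
    return train, test

def fit_scaler(train, column):
    """Learn imputation and scaling statistics from the training rows only."""
    observed = [r[column] for r in train if r[column] is not None]
    mu = mean(observed)
    sigma = pstdev(observed) or 1.0  # fall back to 1.0 for constant columns
    return mu, sigma

def transform(rows, column, mu, sigma):
    """Impute missing values with the training mean, then standardize."""
    return [((r[column] if r[column] is not None else mu) - mu) / sigma for r in rows]

# Hypothetical toy data: ISO date strings compare correctly as plain strings.
rows = [
    {"date": "2023-01-05", "spend": 10.0},
    {"date": "2023-01-20", "spend": 30.0},
    {"date": "2023-02-02", "spend": None},
    {"date": "2023-03-01", "spend": 50.0},
]
train, test = temporal_split(rows, cutoff="2023-02-01")
mu, sigma = fit_scaler(train, "spend")          # statistics come from train only
train_scaled = transform(train, "spend", mu, sigma)
test_scaled = transform(test, "spend", mu, sigma)  # test reuses train statistics
```

The point of the design is that nothing computed from the later period influences the statistics applied to it, which mirrors how the model will be used on the next campaign.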
- Class Imbalance
  - Compare class_weight, random over/under-sampling, and SMOTE. State which metric(s) you’ll optimize and why (e.g., PR-AUC, recall at fixed precision, cost-sensitive loss).
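To make the class-weighting option and a PR-based metric concrete, here is a minimal sketch using the campaign's 500 positives out of 10,000. The weighting formula is the common "balanced" heuristic, n_samples / (n_classes * class_count), and the function names are illustrative rather than taken from any library.

```python
def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    n = len(labels)
    classes = sorted(set(labels))
    return {c: n / (len(classes) * labels.count(c)) for c in classes}

def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each positive."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

# Campaign from the prompt: 500 purchases among 10,000 customers.
labels = [1] * 500 + [0] * 9500
weights = balanced_class_weights(labels)

# Tiny hypothetical ranking to show the metric: positives at ranks 1 and 3.
toy_ap = average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
```

With a 5% base rate, a random ranking has an average precision near 0.05, so that value is the natural baseline for judging the metric.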
- Thresholding, Calibration, and Budgeted Targeting
  - Explain ranking vs. classification thresholds, probability calibration (Platt scaling or isotonic), and how to choose top-N customers to target under a fixed budget.
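Budgeted targeting can be sketched as a ranking step followed by an optional break-even filter. The example assumes the scores are already calibrated probabilities and that every contact costs the same; the customer ids, costs, and `value_per_purchase` are hypothetical.

```python
def select_targets(scores, budget, cost_per_contact, value_per_purchase=None):
    """Return the customer ids to contact and the expected number of purchases."""
    n_affordable = int(budget // cost_per_contact)
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if value_per_purchase is not None:
        # Break-even rule: a contact is worthwhile only if p * value > cost.
        ranked = [(cid, p) for cid, p in ranked
                  if p * value_per_purchase > cost_per_contact]
    chosen = ranked[:n_affordable]
    # The sum of probabilities is an expected count only if scores are calibrated.
    expected_purchases = sum(p for _, p in chosen)
    return [cid for cid, _ in chosen], expected_purchases

# Hypothetical calibrated scores for five customers.
scores = {"c1": 0.02, "c2": 0.30, "c3": 0.11, "c4": 0.07, "c5": 0.19}

# Pure ranking: the budget pays for three contacts.
targets, expected = select_targets(scores, budget=6.0, cost_per_contact=2.0)

# Ranking plus break-even threshold: c3 is dropped because 0.11 * 15 < 2.
profitable, _ = select_targets(scores, budget=6.0, cost_per_contact=2.0,
                               value_per_purchase=15.0)
```

Ranking alone needs only a good ordering, while the break-even filter and the expected-purchase total depend on the probabilities themselves, which is where calibration matters.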
- Validation and Monitoring
  - Describe stratified cross-validation, how you will report confidence intervals, and how you will monitor post-deployment drift and business lift.
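One possible way to report a confidence interval is a percentile bootstrap over held-out customers, sketched here for precision in the top k. The data are synthetic and generated only for illustration, and `n_boot`, `alpha`, and the seeds are arbitrary choices.

```python
import random

def precision_at_k(labels, scores, k):
    """Share of actual purchasers among the k highest-scored customers."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return sum(label for _, label in ranked[:k]) / k

def bootstrap_ci(labels, scores, k, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for precision_at_k."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    for _ in range(n_boot):
        # Resample customers with replacement, keeping label and score paired.
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(precision_at_k([labels[i] for i in idx],
                                    [scores[i] for i in idx], k))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Synthetic held-out set: scores in [0, 0.2), labels drawn with that probability.
rng = random.Random(42)
scores = [rng.random() * 0.2 for _ in range(400)]
labels = [1 if rng.random() < s else 0 for s in scores]

point = precision_at_k(labels, scores, k=40)
lo, hi = bootstrap_ci(labels, scores, k=40, n_boot=500)
```

The same resampling loop works for any metric computed on a held-out set, so PR-AUC or lift could be substituted for `precision_at_k` without changing the interval logic.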
- Feature Selection and Interpretability
  - List at least two feature selection methods (e.g., L1 penalty, mutual information, recursive feature elimination) and how you would guard against overfitting while preserving interpretability.
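As an example of a filter-style selection method, the sketch below scores discrete features by their mutual information with the purchase label. The feature names and the four-row sample are hypothetical, and continuous features would first need binning.

```python
from collections import Counter
from math import log

def mutual_information(feature, target):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(feature)
    joint = Counter(zip(feature, target))
    feature_counts = Counter(feature)
    target_counts = Counter(target)
    # p(x, y) * log(p(x, y) / (p(x) * p(y))), written in terms of raw counts.
    return sum(
        (c / n) * log((c * n) / (feature_counts[x] * target_counts[y]))
        for (x, y), c in joint.items()
    )

# Hypothetical binary features and purchase labels for four customers.
target = [1, 1, 0, 0]
features = {
    "is_weekend_signup": [1, 0, 1, 0],   # independent of the label here
    "opened_last_email": [1, 1, 0, 0],   # perfectly aligned with the label here
}

mi = {name: mutual_information(values, target) for name, values in features.items()}
ranking = sorted(mi, key=mi.get, reverse=True)
```

To guard against overfitting, a ranking like this would be computed inside each training fold rather than on the full dataset, so the held-out data never informs which features are kept.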