Binary Purchase Prediction with Delayed Labels and Imbalanced Classes
Context
- Goal: Ship a real-time binary classifier that predicts whether a user will purchase within the next 7 days.
- Class imbalance: Positives ≈ 3%.
- Label delay: Up to 10 days after the 7-day window (i.e., labels mature up to 17 days after the prediction time).
- Features: Recent session statistics, counts, and recency by category.
- Business values (per user):
  - True Positive (TP): +$2 expected margin
  - False Positive (FP): −$0.10 (annoyance/discount cost)
  - False Negative (FN): −$0.50 (missed margin)
  - True Negative (TN): $0
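The label-delay arithmetic above can be made concrete with a minimal sketch. The helper names (`label_mature_at`, `is_trainable`) and the choice to treat the 10-day delay as a hard upper bound are assumptions for illustration, not part of the problem statement.

```python
from datetime import datetime, timedelta

# Values taken from the Context: 7-day outcome window, up to 10 days of label delay.
LABEL_WINDOW = timedelta(days=7)
MAX_LABEL_DELAY = timedelta(days=10)


def label_window_end(prediction_time: datetime) -> datetime:
    """End of the window in which a purchase counts as a positive."""
    return prediction_time + LABEL_WINDOW


def label_mature_at(prediction_time: datetime) -> datetime:
    """Earliest time at which the label can be assumed final (hypothetical helper)."""
    return prediction_time + LABEL_WINDOW + MAX_LABEL_DELAY


def is_trainable(prediction_time: datetime, as_of: datetime) -> bool:
    """True if an example scored at prediction_time has a mature label as of as_of."""
    return label_mature_at(prediction_time) <= as_of


t0 = datetime(2024, 1, 1)
print(label_window_end(t0))   # 2024-01-08 00:00:00
print(label_mature_at(t0))    # 2024-01-18 00:00:00
```

Under this reading, any training or evaluation snapshot taken at time `as_of` should only include examples whose prediction time is at least 17 days earlier.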
Tasks
A) Propose an end-to-end training and evaluation design that avoids leakage under delayed labels. Specify an exact time-based cross-validation scheme (fold boundaries, feature and label windows) and explain why it’s unbiased.
B) Choose offline metrics and describe how to calibrate the model (e.g., Platt scaling or isotonic regression). Provide the formula for selecting the decision threshold that maximizes expected profit under the given costs, and explain how you would assess threshold stability across cohorts.
C) Handle distribution shift: outline drift detection on covariates and on calibration (e.g., PSI, ECE). Propose an online monitoring dashboard with guardrails.
D) Latency and interpretability: With a 50 ms p95 budget and 64 MB RAM per request, describe a deployable modeling choice and featurization plan (including any precomputed features) that meets constraints, plus a fallback rule when the model is unavailable.
E) Explain the model and threshold decisions to a non-technical stakeholder, and describe how you would reconcile the disagreement if they insist on a different threshold. What evidence would you present to align on the target operating point?
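As a starting point for the threshold formula in Task B, here is a minimal sketch of the break-even probability implied by the business values in the Context. It assumes the model's score `p` is a calibrated purchase probability, that "predict positive" means acting on the user, and that the FN cost of −$0.50 is incurred whenever a true purchaser is not acted on. The class and function names are hypothetical. Acting is worth it when p·TP + (1 − p)·FP > p·FN + (1 − p)·TN, which rearranges to p > (TN − FP) / ((TN − FP) + (TP − FN)).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Payoffs:
    """Per-user payoffs from the Context section (in dollars)."""
    tp: float = 2.00
    fp: float = -0.10
    fn: float = -0.50
    tn: float = 0.00


def break_even_threshold(v: Payoffs) -> float:
    """Probability above which acting has higher expected profit than not acting."""
    gain_if_positive = v.tp - v.fn   # benefit of acting on a true purchaser
    loss_if_negative = v.tn - v.fp   # cost of acting on a non-purchaser
    return loss_if_negative / (loss_if_negative + gain_if_positive)


def expected_profit(p: float, act: bool, v: Payoffs) -> float:
    """Expected per-user profit given calibrated probability p and the decision."""
    if act:
        return p * v.tp + (1 - p) * v.fp
    return p * v.fn + (1 - p) * v.tn


payoffs = Payoffs()
threshold = break_even_threshold(payoffs)
print(round(threshold, 4))  # 0.0385
```

With these payoffs the break-even point is 0.10 / 2.60, slightly above the 3% base rate, so the operating point is sensitive to calibration error in the low-probability range. That is one reason to check threshold stability per cohort rather than only in aggregate.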