Weekly Churn Prediction: Training/Validation/Evaluation Plan
Context
You are building a weekly churn prediction model for a streaming service with:
- 50M active users per week
- Severe class imbalance: weekly churn ≈ 0.3%
- Features: watch-time aggregates, recency, device, payments, limited PII
- Data stored in a data lake
- Compute constraints:
  - Model training must finish in < 6 hours
  - Batch inference must score 50M users in < 2 hours
Task
Design an end-to-end plan for training, validating, evaluating, and serving the weekly churn model. Your plan must cover the points below (a brief illustrative sketch for each follows the list):
- Time-based data splits to prevent leakage (e.g., sliding-window training; validate on the next week; test on the following week)
- Handling class imbalance (negative downsampling with class-weight correction, focal loss, calibrated thresholds) and why PR-AUC/Recall@K are preferable to ROC-AUC
- Distributed or out-of-core training options (e.g., XGBoost on Spark, sparse logistic regression) and efficient, leakage-safe hyperparameter tuning (bandit/ASHA)
- Feature-leakage audits (e.g., removing post-label signals such as refund flags) and feature store versioning
- Calibration and decisioning (Platt/isotonic, cost-sensitive thresholds, decile stability)
- Offline–online consistency checks and drift monitoring (PSI, KS, population stability, SHAP distribution shifts)
- An experiment plan to validate business lift (uplift modeling or targeting thresholds) and how to size the holdout
- How to scale inference (vectorized joins, pre-aggregation, incremental updates) and how to handle backfills for late-arriving events
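Illustrative sketches

For the time-based splits: a minimal sketch of the sliding-window scheme, assuming a feature table with one row per (user, week) and a `label_week` column marking the week in which the churn label was observed; the column name and window length are hypothetical.

```python
import pandas as pd

def time_based_split(df: pd.DataFrame, test_week: str, train_weeks: int = 8):
    """Sliding-window split: train on the `train_weeks` weeks ending two weeks
    before `test_week`, validate on the week just before `test_week`, and test
    on `test_week` itself, so no split ever sees a later week's labels."""
    weeks = sorted(df["label_week"].unique())
    test_idx = weeks.index(test_week)
    val_week = weeks[test_idx - 1]
    train_window = weeks[max(0, test_idx - 1 - train_weeks): test_idx - 1]

    train = df[df["label_week"].isin(train_window)]
    val = df[df["label_week"] == val_week]
    test = df[df["label_week"] == test_week]
    return train, val, test
```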
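For class imbalance and metrics: one common recipe is to train on all churners plus a small sample of non-churners, then undo the prior shift before calibration. The sketch below also includes Recall@K, which reflects a fixed-budget retention campaign better than ROC-AUC does at a 0.3% base rate. The 2% keep rate and column names are illustrative assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import average_precision_score  # PR-AUC (average precision)

NEG_KEEP_RATE = 0.02  # keep 2% of negatives; illustrative, not from the brief

def downsample_negatives(df: pd.DataFrame, rate: float = NEG_KEEP_RATE,
                         seed: int = 0) -> pd.DataFrame:
    """Keep every churner and a `rate` fraction of non-churners."""
    pos = df[df["label"] == 1]
    neg = df[df["label"] == 0].sample(frac=rate, random_state=seed)
    return pd.concat([pos, neg]).sample(frac=1.0, random_state=seed)

def correct_probs(p_sampled: np.ndarray, rate: float = NEG_KEEP_RATE) -> np.ndarray:
    """Undo the prior shift from downsampling: p = r*p' / (r*p' + 1 - p')."""
    return rate * p_sampled / (rate * p_sampled + 1.0 - p_sampled)

def recall_at_k(y_true: np.ndarray, scores: np.ndarray, k: int) -> float:
    """Share of actual churners captured in the top-k scored users."""
    top_k = np.argsort(-scores)[:k]
    return float(y_true[top_k].sum() / max(y_true.sum(), 1))
```

The correction is monotone, so it leaves PR-AUC and Recall@K unchanged; it matters once calibrated probabilities feed cost-sensitive thresholds.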
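For tuning: a synchronous successive-halving sketch of the ASHA idea, kept leakage-safe by scoring every candidate on the same fixed next-week validation set rather than shuffled K-fold. Single-node xgboost is used for brevity; the same loop applies when training is distributed (e.g., XGBoost on Spark).

```python
import xgboost as xgb
from sklearn.metrics import average_precision_score

def successive_halving(candidates, dtrain, dval, y_val,
                       rounds=(50, 150, 450), keep_frac=1 / 3):
    """Score every surviving param dict at a growing boosting-round budget and
    keep the top `keep_frac` after each rung; `dtrain`/`dval` are DMatrix objects
    built from the time-based train and validation weeks."""
    survivors = list(candidates)
    for num_round in rounds:
        scored = []
        for params in survivors:
            booster = xgb.train(params, dtrain, num_boost_round=num_round)
            pr_auc = average_precision_score(y_val, booster.predict(dval))
            scored.append((pr_auc, params))
        scored.sort(key=lambda t: t[0], reverse=True)
        survivors = [p for _, p in scored[: max(1, int(len(scored) * keep_frac))]]
    return survivors[0]
```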
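For the leakage audit: besides tracing each feature's event timestamps against the label cutoff, a cheap automated check is to score each feature on its own; anything that alone separates churners almost perfectly (a post-label refund flag, for instance) deserves scrutiny. The 0.90 cutoff is an illustrative assumption and the scan only covers numeric columns.

```python
import pandas as pd
from sklearn.metrics import roc_auc_score

def single_feature_leakage_scan(df: pd.DataFrame, numeric_features: list,
                                label_col: str = "label",
                                flag_above: float = 0.90) -> pd.DataFrame:
    """Rank numeric features by their standalone, direction-agnostic AUC against
    the label; suspiciously high values usually mean the feature was computed
    after the churn event."""
    rows = []
    for col in numeric_features:
        vals = df[col].fillna(df[col].median())
        auc = roc_auc_score(df[label_col], vals)
        auc = max(auc, 1.0 - auc)
        rows.append({"feature": col, "solo_auc": auc, "flagged": auc >= flag_above})
    return pd.DataFrame(rows).sort_values("solo_auc", ascending=False)
```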
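For calibration and decisioning: a sketch of isotonic calibration fitted on the validation week, plus a cost-sensitive cutoff derived from an assumed cost structure (the dollar figures and save rate below are placeholders, not numbers from the brief). Decile stability is then checked by comparing realized churn per calibrated-score decile across weeks.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

def fit_calibrator(val_scores: np.ndarray, val_labels: np.ndarray) -> IsotonicRegression:
    """Isotonic calibration fitted on validation-week scores only, applied after
    any downsampling correction."""
    iso = IsotonicRegression(out_of_bounds="clip")
    iso.fit(val_scores, val_labels)
    return iso

def decision_threshold(contact_cost: float, retained_value: float,
                       save_rate: float) -> float:
    """Contact a user when expected saved value exceeds the intervention cost:
    p * save_rate * retained_value > contact_cost."""
    return contact_cost / (save_rate * retained_value)

# Placeholder economics: $0.50 outreach, $120 retained value, 10% save rate
# decision_threshold(0.50, 120.0, 0.10)  ->  ~0.042 calibrated-probability cutoff
```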
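For drift monitoring: a self-contained PSI implementation that can be run weekly on each feature and on the model score itself, comparing the current scoring week against the training-time reference.

```python
import numpy as np

def psi(reference: np.ndarray, current: np.ndarray, bins: int = 10) -> float:
    """Population Stability Index between a reference (training-time) distribution
    and the current scoring week. A common rule of thumb treats PSI > 0.2 as
    meaningful drift; pick the alert level to suit your own tolerance."""
    cuts = np.unique(np.quantile(reference, np.linspace(0, 1, bins + 1)))
    cuts[0], cuts[-1] = -np.inf, np.inf            # catch values outside the reference range
    ref_frac = np.histogram(reference, cuts)[0] / len(reference)
    cur_frac = np.histogram(current, cuts)[0] / len(current)
    ref_frac = np.clip(ref_frac, 1e-6, None)       # avoid log(0) on empty bins
    cur_frac = np.clip(cur_frac, 1e-6, None)
    return float(np.sum((cur_frac - ref_frac) * np.log(cur_frac / ref_frac)))
```

scipy.stats.ks_2samp covers the KS check, and SHAP value distributions can be compared with the same PSI function.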
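For sizing the experiment holdout: a standard two-proportion power calculation; the 0.3% base rate comes from the brief, while the 5% minimum detectable effect, 5% significance level, and 80% power are illustrative assumptions.

```python
from math import ceil
from scipy.stats import norm

def holdout_size_per_arm(base_rate: float = 0.003, rel_effect: float = 0.05,
                         alpha: float = 0.05, power: float = 0.80) -> int:
    """Users per arm needed to detect a `rel_effect` relative change in weekly
    churn with a two-sided test, using the normal approximation."""
    delta = base_rate * rel_effect
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    p_bar = base_rate * (1 + rel_effect / 2)       # average rate across the two arms
    n = 2 * z ** 2 * p_bar * (1 - p_bar) / delta ** 2
    return ceil(n)

# holdout_size_per_arm() -> roughly 2.1M users per arm at these settings,
# i.e. about 4% of the 50M weekly population.
```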
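For batch inference at 50M users: a PySpark sketch that broadcasts the trained model and scores the feature table with a vectorized pandas UDF; the paths, week partition, and three feature columns are hypothetical. Backfills for late-arriving events can rerun the same job against the affected week partitions.

```python
import pandas as pd
import xgboost as xgb
from pyspark.sql import SparkSession
from pyspark.sql.functions import pandas_udf

spark = SparkSession.builder.appName("weekly_churn_scoring").getOrCreate()

booster = xgb.Booster()
booster.load_model("churn_model.json")              # hypothetical artifact path
bc_model = spark.sparkContext.broadcast(booster)    # ship the model to executors once

@pandas_udf("double")
def score_udf(watch_minutes_7d: pd.Series, days_since_last_play: pd.Series,
              payment_failures_30d: pd.Series) -> pd.Series:
    # Each call receives a batch of rows as pandas Series, so the table is
    # scored partition by partition with no per-row Python loop.
    X = pd.DataFrame({
        "watch_minutes_7d": watch_minutes_7d,
        "days_since_last_play": days_since_last_play,
        "payment_failures_30d": payment_failures_30d,
    })
    return pd.Series(bc_model.value.predict(xgb.DMatrix(X)))

features = spark.read.parquet("s3://lake/churn_features/week=2024-06-02")   # hypothetical path
scored = features.withColumn(
    "churn_score",
    score_udf("watch_minutes_7d", "days_since_last_play", "payment_failures_30d"),
)
scored.write.mode("overwrite").parquet("s3://lake/churn_scores/week=2024-06-02")
```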