This question evaluates competence in production-ready supervised machine learning: detecting and repairing data leakage in log-based features, designing time-aware cross-validation, defining KPI-driven thresholding and calibration, interpreting model coefficients, and establishing monitoring and retraining policies for a churn prediction model. Interviewers commonly ask it to assess both conceptual understanding of leakage, validation, and calibration and the practical skills needed to deploy and maintain a model with cost-sensitive objectives in production, so it spans conceptual reasoning and hands-on application.

Context: You inherit a weekly-scored model that predicts whether a user will place an order in the next 28 days. Some features were built from logs in ways that leak information from the post-prediction label window. Address the following tasks.
(a) Leakage identification and repair
(b) Time-based cross-validation (rolling origin)
(c) KPIs, thresholding, and calibration
(d) Interpretation
(e) Monitoring and retraining
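
As a starting point for tasks (a) and (b), the following is a minimal sketch, not a reference solution. It assumes logs and orders are available as (user_id, date) pairs and that scoring happens on weekly cutoff dates; the function names, the 28-day lookback, and the data layout are hypothetical choices made for illustration. The idea is that features may only read events strictly before the scoring cutoff, the label reads the 28 days from the cutoff onward, and a rolling-origin fold only trains on cutoffs whose label windows have fully closed before the validation cutoff.

```python
from datetime import date, timedelta

# From the context: the label is "places an order in the next 28 days".
LABEL_WINDOW = timedelta(days=28)


def count_events_before(events, user_id, cutoff, lookback_days=28):
    """Feature: count a user's log events in [cutoff - lookback, cutoff).

    Events on or after the cutoff fall in the label window and are excluded,
    which is the basic repair for label-window leakage.
    """
    start = cutoff - timedelta(days=lookback_days)
    return sum(1 for uid, ts in events if uid == user_id and start <= ts < cutoff)


def label_ordered(orders, user_id, cutoff):
    """Label: True if the user places an order in [cutoff, cutoff + 28 days)."""
    end = cutoff + LABEL_WINDOW
    return any(uid == user_id and cutoff <= ts < end for uid, ts in orders)


def rolling_origin_folds(score_dates, n_folds, min_train=1):
    """Build (train_cutoffs, valid_cutoff) pairs for the last n_folds cutoffs.

    A training cutoff d is kept only if d + 28 days <= valid_cutoff, so its
    label window is fully observed before the validation cutoff.
    """
    dates = sorted(score_dates)
    folds = []
    for valid in dates[-n_folds:]:
        train = [d for d in dates if d + LABEL_WINDOW <= valid]
        if len(train) >= min_train:
            folds.append((train, valid))
    return folds


# Hypothetical data: twelve weekly scoring cutoffs and a few log rows.
weekly_cutoffs = [date(2024, 1, 1) + timedelta(weeks=i) for i in range(12)]
events = [("u1", date(2024, 1, 10)), ("u1", date(2024, 2, 1)), ("u2", date(2024, 1, 20))]
orders = [("u1", date(2024, 2, 1))]
cutoff = date(2024, 1, 29)

folds = rolling_origin_folds(weekly_cutoffs, n_folds=3)
feature_u1 = count_events_before(events, "u1", cutoff)
label_u1 = label_ordered(orders, "u1", cutoff)
label_u2 = label_ordered(orders, "u2", cutoff)
```

With weekly cutoffs and a 28-day label window, this leaves a four-week gap between the last training cutoff and each validation cutoff. A production answer would also need to account for log ingestion delay and late-arriving events, which this sketch ignores.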