Debugging a Churn Prediction Pipeline With Poor Generalization
Context
You are evaluating a binary churn prediction system with:
- Training ROC AUC: 0.95
- Validation ROC AUC (random split): 0.62
- Time-based holdout ROC AUC (most recent month): 0.55
- Predicted probabilities are overconfident
- Positive class prevalence ≈ 1:10 (about 9–10% positive)
Assume the goal is to predict whether a customer will churn in the next period using only information available up to an "as-of" cutoff time.
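One way to make the "as-of" discipline concrete is to build every feature from a point-in-time filtered event log. A minimal pandas sketch, assuming a hypothetical events table with customer_id, event_time, and amount columns:

```python
import pandas as pd

def build_features_asof(events: pd.DataFrame, cutoff: pd.Timestamp) -> pd.DataFrame:
    """Aggregate per-customer features from events strictly before the cutoff."""
    past = events[events["event_time"] < cutoff]  # nothing at/after the cutoff may leak in
    feats = past.groupby("customer_id").agg(
        n_events=("event_time", "size"),
        total_amount=("amount", "sum"),
        last_seen=("event_time", "max"),
    )
    # Recency is measured against the cutoff, not against "now" at training time.
    feats["days_since_last_event"] = (cutoff - feats["last_seen"]).dt.days
    return feats.drop(columns="last_seen")
```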
Task
Describe, step-by-step, how you would debug this system. Cover:
- Data validation and leakage checks (including temporal leakage)
- Label and feature drift analysis
- Cross-validation scheme selection
- Error analysis (by slices, calibration, threshold-dependent confusion matrices)
- Ablations and feature audits
- Training issues (regularization, class weighting, resampling)
Propose concrete experiments to isolate root causes, name the metrics you would inspect, and recommend fixes, together with a plan to verify improvements and prevent regressions (tests, data versioning, monitoring). Illustrative sketches for several of these checks follow.
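For the leakage check, a cheap probe is to score every numeric feature on its own: any single column that alone achieves near-perfect ROC AUC against the label is a leakage suspect (for example, a field populated only after churn is already known). A sketch, assuming X and y are pandas objects:

```python
import numpy as np
import pandas as pd
from sklearn.metrics import roc_auc_score

def single_feature_auc_scan(X: pd.DataFrame, y: pd.Series, top_k: int = 10) -> pd.Series:
    """Rank numeric features by standalone ROC AUC; scores near 1.0 are leakage suspects."""
    scores = {}
    for col in X.columns:
        x = X[col]
        if not np.issubdtype(x.dtype, np.number):
            continue  # this quick scan covers numeric columns only
        mask = x.notna()
        if y[mask].nunique() < 2:
            continue  # AUC is undefined when only one class remains
        auc = roc_auc_score(y[mask], x[mask])
        scores[col] = max(auc, 1.0 - auc)  # direction-agnostic
    return pd.Series(scores).sort_values(ascending=False).head(top_k)
```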
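For drift analysis, the population stability index (PSI) between the training period and the most recent month quantifies how far each feature's distribution has moved; applied to the model's predicted scores, the same function detects score drift. A sketch:

```python
import numpy as np

def population_stability_index(expected: np.ndarray, actual: np.ndarray, bins: int = 10) -> float:
    """PSI of `actual` (recent data) against `expected` (training-period reference).

    Common rule of thumb: < 0.1 stable, 0.1-0.25 moderate shift, > 0.25 major shift.
    """
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf  # catch recent values outside the training range
    edges = np.unique(edges)               # guard against duplicate quantile edges
    e = np.histogram(expected, bins=edges)[0] / len(expected)
    a = np.histogram(actual, bins=edges)[0] / len(actual)
    e, a = np.clip(e, 1e-6, None), np.clip(a, 1e-6, None)  # avoid log(0)
    return float(np.sum((a - e) * np.log(a / e)))
```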
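For the cross-validation scheme, the random split is the wrong oracle here: it produces the inflated 0.62, not the 0.55 the model sees under production-like conditions. Walk-forward validation by month is one replacement (sklearn's TimeSeriesSplit gives the same shape for row-ordered data). A sketch, assuming numpy arrays and a per-row month label such as 'YYYY-MM' strings:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

def rolling_origin_auc(X: np.ndarray, y: np.ndarray, months: np.ndarray,
                       n_test_months: int = 3) -> dict:
    """Walk-forward evaluation: for each test month, train only on earlier months."""
    ordered = np.sort(np.unique(months))
    aucs = {}
    for test_month in ordered[-n_test_months:]:
        train, test = months < test_month, months == test_month
        model = LogisticRegression(max_iter=1000, class_weight="balanced")
        model.fit(X[train], y[train])
        aucs[str(test_month)] = roc_auc_score(y[test], model.predict_proba(X[test])[:, 1])
    return aucs  # a steady decline across months points to drift rather than noise
```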
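For the overconfidence symptom, a reliability table plus Brier score makes the problem measurable; if the high-probability bins confirm miscalibration, refitting with sklearn's CalibratedClassifierCV (isotonic or sigmoid) on a time-ordered fold is a standard fix. A sketch:

```python
import numpy as np
from sklearn.calibration import calibration_curve
from sklearn.metrics import brier_score_loss

def calibration_report(y_true: np.ndarray, y_prob: np.ndarray, n_bins: int = 10) -> None:
    """Print observed vs. predicted churn rate per probability bin, plus Brier score."""
    frac_pos, mean_pred = calibration_curve(y_true, y_prob, n_bins=n_bins, strategy="quantile")
    print(f"Brier score: {brier_score_loss(y_true, y_prob):.4f}")
    for pred, obs in zip(mean_pred, frac_pos):
        # Overconfidence shows up as predicted >> observed in the high bins.
        print(f"predicted {pred:.2f} -> observed {obs:.2f}")
```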
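For ablations, retraining with one feature group removed at a time and scoring on the time-based holdout separates "the model genuinely depends on this group" (holdout AUC drops) from "this group is leaky or drift-prone" (holdout AUC rises when it is removed). A sketch, assuming pandas DataFrames, any sklearn-style classifier, and a hypothetical feature_groups mapping from group name to column names:

```python
from sklearn.base import clone
from sklearn.metrics import roc_auc_score

def group_ablation(model, X_train, y_train, X_holdout, y_holdout, feature_groups):
    """Retrain without each feature group and score on the time-based holdout."""
    base = clone(model).fit(X_train, y_train)
    results = {"__all_features__": roc_auc_score(
        y_holdout, base.predict_proba(X_holdout)[:, 1])}
    for name, cols in feature_groups.items():
        m = clone(model).fit(X_train.drop(columns=cols), y_train)
        results[name] = roc_auc_score(
            y_holdout, m.predict_proba(X_holdout.drop(columns=cols))[:, 1])
    return results
```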
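For training issues at roughly 1:10 prevalence, class weighting or explicit sample weights are the least invasive levers; note that both (like resampling) distort predicted probabilities, so recalibrate afterwards rather than trusting the raw scores. A sketch with synthetic data standing in for the real training set:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.utils.class_weight import compute_sample_weight

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = (rng.random(1000) < 0.1).astype(int)  # ~1:10 prevalence, matching the setup

# Option 1: built-in inverse-prevalence reweighting.
clf = LogisticRegression(max_iter=1000, class_weight="balanced").fit(X, y)

# Option 2: explicit sample weights, usable with any estimator that accepts them.
w = compute_sample_weight(class_weight="balanced", y=y)
clf = LogisticRegression(max_iter=1000).fit(X, y, sample_weight=w)
```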