Rapidly Improving Recall Under Class Imbalance (One-Day Plan)
Context
You inherit a binary fraud detection model with severe class imbalance (positive rate ≈ 2%). Evaluation on a temporally separated validation set shows:
- ROC AUC = 0.61
- Precision at 90% recall = 0.05 (very low precision at high recall, consistent with extreme imbalance)
- Operations constraint: only 0.5% of traffic can be reviewed (fixed review capacity)
Goal: In one day, meaningfully improve recall at the same review capacity.
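Because the review budget is fixed, the metric that actually matters is recall among the top 0.5% of scores, not ROC AUC. A minimal sketch of that capacity-constrained metric (function name and data layout are assumptions, not from the source):

```python
import numpy as np

def recall_at_capacity(y_true, scores, capacity=0.005):
    """Recall achieved when only the top `capacity` fraction of traffic
    (highest model scores) is sent to review."""
    y = np.asarray(y_true)
    s = np.asarray(scores)
    n_review = max(1, int(len(s) * capacity))
    # Indices of the n_review highest-scoring transactions.
    reviewed = np.argsort(s)[::-1][:n_review]
    caught = y[reviewed].sum()
    total_pos = y.sum()
    return caught / total_pos if total_pos else 0.0
```

Tracking this single number on the temporally separated validation set makes the one-day iterations directly comparable.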
Tasks
- Diagnosis: Describe how you would quickly distinguish underfitting from overfitting using learning curves, calibration plots, PR vs ROC analysis at fixed capacity, and leakage/drift checks.
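One fast underfit-vs-overfit check is to compare train and validation average precision: both low suggests underfitting, a large gap suggests overfitting. A hedged sketch on synthetic imbalanced data (the dataset, model choice, and split are stand-ins, not the production setup):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import average_precision_score
from sklearn.model_selection import train_test_split

# Synthetic ~2%-positive data standing in for the fraud feed (assumption).
X, y = make_classification(n_samples=4000, weights=[0.98], flip_y=0.01,
                           random_state=0)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, stratify=y, random_state=0)

clf = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
ap_tr = average_precision_score(y_tr, clf.predict_proba(X_tr)[:, 1])
ap_va = average_precision_score(y_va, clf.predict_proba(X_va)[:, 1])

# Both low -> underfitting; large train-validation gap -> overfitting.
print(f"train AP={ap_tr:.3f}  val AP={ap_va:.3f}  gap={ap_tr - ap_va:.3f}")
```

Average precision (PR-based) is preferred over ROC AUC here because at a 2% positive rate ROC AUC can look acceptable while precision at the operating point is unusable.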
- Interventions: Propose three changes you can implement in a day (e.g., class-weighted loss, monotonic gradient boosting with categorical encoders, threshold moving using cost-sensitive utility), and justify why each helps.
- Thresholding for Utility: Show how to choose a decision threshold that maximizes expected utility given:
  - False Positive (FP) cost = $2
  - False Negative (FN) cost = $50
  - Review capacity = 0.5% of traffic
  Provide the utility (or cost) formula and outline the selection procedure on validation data.
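With the costs above, the objective is to minimize Cost(t) = $2·FP(t) + $50·FN(t) over thresholds t whose flag rate fits the 0.5% review capacity. One way to sketch the selection procedure on validation data (function name and scan strategy are illustrative choices):

```python
import numpy as np

FP_COST, FN_COST, CAPACITY = 2.0, 50.0, 0.005

def best_threshold(y_true, scores):
    """Scan candidate thresholds on validation data; keep the one with the
    lowest expected cost whose flag rate fits the review capacity."""
    y = np.asarray(y_true)
    s = np.asarray(scores)
    best_t, best_cost = None, np.inf
    for t in np.unique(s):
        flag = s >= t
        if flag.mean() > CAPACITY:
            continue  # this threshold flags more traffic than reviewers can handle
        fp = np.sum(flag & (y == 0))
        fn = np.sum(~flag & (y == 1))
        cost = FP_COST * fp + FN_COST * fn
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t, best_cost
```

Because FN costs 25x FP, the unconstrained optimum would flag far more than 0.5% of traffic; in practice the capacity constraint binds and the selected threshold is simply the score of the 0.5th-percentile-from-the-top transaction.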
- Monitoring: List the minimal logging/monitoring to add at deployment to detect drift and data quality issues within a week.
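One low-overhead drift signal worth logging from day one is the Population Stability Index (PSI), computed per feature and on the score distribution against a training-time reference. This is one common formulation, with bin edges taken from the reference sample (an implementation choice, not prescribed by the source; 0.25 is a conventional alarm level):

```python
import numpy as np

def psi(expected, actual, bins=10):
    """Population Stability Index between a reference (training) sample and
    live traffic; values above ~0.25 are commonly treated as drift alarms."""
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf  # cover out-of-range live values
    e = np.histogram(expected, edges)[0] / len(expected)
    a = np.histogram(actual, edges)[0] / len(actual)
    # Clip to avoid log(0) on empty bins.
    e, a = np.clip(e, 1e-6, None), np.clip(a, 1e-6, None)
    return float(np.sum((a - e) * np.log(a / e)))
```

Logging daily PSI on the model score alone is often enough to catch upstream schema breaks and population shift within the one-week window the task targets.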