Design approach for class imbalance

This question tests imbalanced binary classification: resampling and synthetic-data techniques, cost-sensitive learning and loss functions, decision thresholding, cross-validation design that prevents leakage, metric selection, and hyperparameter tuning under potential dataset shift.

Question

Imbalanced Binary Classification: Learning, Evaluation, and Model Selection

Context

You are training a binary classifier where the positive class is rare (for example, 0.1–5% prevalence). You need to choose training strategies, evaluation metrics, cross-validation structure, and tuning methods that remain reliable under severe class imbalance and potential dataset shift.
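To make the evaluation concern concrete, here is a small arithmetic sketch. The dataset size and the 1% prevalence are hypothetical values chosen from the range above. It scores a trivial model that predicts "negative" for every example.

```python
# Hypothetical dataset: 10,000 examples, 1% positive (inside the 0.1-5% range above).
n_total = 10_000
n_pos = 100
n_neg = n_total - n_pos

# Confusion-matrix counts for a model that always predicts "negative".
true_pos, false_pos = 0, 0
true_neg, false_neg = n_neg, n_pos

accuracy = (true_pos + true_neg) / n_total
recall = true_pos / (true_pos + false_neg)        # share of positives found
specificity = true_neg / (true_neg + false_pos)   # share of negatives found
balanced_accuracy = (recall + specificity) / 2    # mean of per-class recall

print(f"accuracy={accuracy:.3f} recall={recall:.3f} "
      f"balanced_accuracy={balanced_accuracy:.3f}")
# prints: accuracy=0.990 recall=0.000 balanced_accuracy=0.500
```

Accuracy looks excellent even though no positive is ever found, which is why the tasks below ask for metrics that focus on the rare class.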

Tasks

Explain the impact of class imbalance on both learning and evaluation.
Compare strategies to handle imbalance:
- Random over-sampling and under-sampling
- Synthetic methods (e.g., SMOTE, ADASYN)
- Class weighting / cost-sensitive learning
- Focal loss
- Threshold moving (post-hoc decision thresholding)
Describe how to structure cross-validation to avoid leakage:
- Perform any resampling within each training fold only
- Use stratified folds; consider grouped or time-based splits when relevant
Recommend appropriate metrics (e.g., PR AUC, recall at fixed precision, balanced accuracy) and how to choose among them.
Outline how to tune hyperparameters under imbalance, including threshold selection.
Discuss trade-offs across variance, bias, runtime, and calibration for the above strategies.
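
The fold structure and threshold selection named in the tasks can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not a reference solution: the function names (`stratified_folds`, `oversample_minority`, `threshold_for_precision`), the toy labels, and the example scores are all hypothetical, and model fitting is left as a comment.

```python
import random
from collections import defaultdict

def stratified_folds(labels, k, seed=0):
    """Split indices into k folds with roughly equal class ratios."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for j, i in enumerate(idxs):
            folds[j % k].append(i)  # deal each class round-robin across folds
    return folds

def oversample_minority(train_idx, labels, rng):
    """Duplicate random positive TRAINING indices until classes are balanced.
    Assumes the training fold contains at least one positive."""
    pos = [i for i in train_idx if labels[i] == 1]
    neg = [i for i in train_idx if labels[i] == 0]
    extra = [rng.choice(pos) for _ in range(len(neg) - len(pos))]
    return neg + pos + extra

def threshold_for_precision(scores, labels, min_precision):
    """Lowest threshold t such that predicting positive when score >= t
    gives precision >= min_precision. Returns None if none qualifies."""
    pairs = sorted(zip(scores, labels), reverse=True)
    best, tp, fp = None, 0, 0
    for n, (s, y) in enumerate(pairs):
        tp += y
        fp += 1 - y
        # Evaluate only once all examples tied at this score are counted.
        last_of_tie = n + 1 == len(pairs) or pairs[n + 1][0] != s
        if last_of_tie and tp / (tp + fp) >= min_precision:
            best = s
    return best

# Toy data (hypothetical): 200 examples at 5% prevalence.
labels = [1] * 10 + [0] * 190
folds = stratified_folds(labels, k=5)
rng = random.Random(0)
balanced_train_sets = []
for val_idx in folds:
    held_out = set(val_idx)
    train_idx = [i for i in range(len(labels)) if i not in held_out]
    # Resample the training fold only; the validation fold keeps its
    # original prevalence so the metrics reflect deployment conditions.
    balanced_train_sets.append(oversample_minority(train_idx, labels, rng))
    # ... fit on the balanced indices, then score val_idx here ...

# Example out-of-fold scores (hypothetical) for threshold selection.
chosen = threshold_for_precision(
    [0.9, 0.8, 0.7, 0.6, 0.5], [1, 0, 1, 1, 0], min_precision=0.7
)
print(chosen)  # prints: 0.6
```

Resampling before the split would place copies of the same positive example on both sides of a fold boundary and inflate validation scores. Choosing the threshold on out-of-fold scores, rather than on training scores, keeps that choice honest for the same reason.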
