This question evaluates competency in imbalanced binary classification: resampling and synthetic-data techniques, cost-sensitive learning and loss functions, threshold selection, cross-validation design that prevents leakage, metric selection, and hyperparameter tuning under potential dataset shift.
You are training a binary classifier where the positive class is rare (for example, 0.1–5% prevalence). You need to choose training strategies, evaluation metrics, cross-validation structure, and tuning methods that remain reliable under severe class imbalance and potential dataset shift.
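As a starting point for discussion, the sketch below illustrates three of the pieces named above in plain Python: a stratified fold assignment that keeps the positive rate similar across folds, average precision as an imbalance-aware ranking metric, and a cost-based decision threshold chosen on held-out scores. The function names, the default costs (`cost_fp=1.0`, `cost_fn=50.0`), and the arbitrary tie handling in `average_precision` are illustrative assumptions, not part of the question; a real solution would more likely use a library such as scikit-learn and would fit any resampling inside each training fold only.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Assign each index to one of k folds, spreading every class evenly.

    Hypothetical helper: each class is shuffled separately and dealt
    round-robin, so every fold keeps roughly the overall positive rate.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for j, i in enumerate(idxs):
            folds[j % k].append(i)
    return folds


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve.

    Simplification: tied scores are ranked in input order rather than
    being grouped into a single threshold.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("average precision is undefined without positives")
    order = sorted(range(len(labels)), key=lambda i: -scores[i])
    tp = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            tp += 1
            total += tp / rank  # precision at the rank of each positive
    return total / n_pos


def best_threshold(labels, scores, cost_fp=1.0, cost_fn=50.0):
    """Pick the threshold that minimises total misclassification cost.

    The costs are assumed values; predict positive when score >= threshold.
    """
    best_t, best_cost = None, float("inf")
    for t in sorted(set(scores)):
        fp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 0)
        fn = sum(1 for y, s in zip(labels, scores) if s < t and y == 1)
        cost = cost_fp * fp + cost_fn * fn
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t, best_cost
```

The threshold should be selected on validation folds and then held fixed for the final test set. If the deployment prevalence is expected to differ from the training prevalence, both the threshold and the assumed costs would need to be revisited.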