ML interview: losses, metrics, class imbalance, and thresholding
Answer all parts concisely and precisely.
1) MAE vs. MSE in regression
When would you prefer MAE over MSE? Compare robustness to outliers, gradient behavior as the residual approaches zero, and the consequences for optimization. Give a concrete example where MSE underperforms while MAE is acceptable.
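As a starting point for the concrete example, the sketch below fits the best constant prediction under each loss on a small made-up sample with one outlier. The data values are illustrative assumptions, not from any real dataset. The mean minimizes MSE and the median minimizes MAE, so the outlier drags the MSE fit far from the bulk of the data while the MAE fit stays put.

```python
import numpy as np

# Hypothetical targets: four typical values and one extreme outlier.
y = np.array([10.0, 11.0, 12.0, 13.0, 1000.0])

# Best constant predictor under each loss.
mse_fit = y.mean()        # the mean minimizes squared error
mae_fit = np.median(y)    # the median minimizes absolute error

def mse_grad(pred, target):
    # Gradient of (pred - target)**2: grows linearly with the residual.
    return 2.0 * (pred - target)

def mae_grad(pred, target):
    # Subgradient of |pred - target|: constant magnitude, 0 at the kink.
    return np.sign(pred - target)

print("MSE-optimal constant:", mse_fit)
print("MAE-optimal constant:", mae_fit)
print("MSE gradient on the outlier at pred=12:", mse_grad(12.0, 1000.0))
print("MAE gradient on the outlier at pred=12:", mae_grad(12.0, 1000.0))
```

The gradient helpers also show the optimization trade-off: the MSE gradient shrinks smoothly as the residual goes to zero, whereas the MAE subgradient keeps magnitude 1 until the residual is exactly zero, which can cause oscillation near the optimum unless the step size is decayed.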
2) Interpreting ROC-AUC and PR-AUC under 1% prevalence
A binary classifier has 1% positive prevalence, ROC-AUC = 0.90, and PR-AUC = 0.25.
- Which metric is more informative here and why?
- Explain how ROC can look strong while PR remains weak. Reference score distributions that cause this.
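The arithmetic behind this gap can be sketched with Bayes' rule. The operating point used below (TPR 0.80, FPR 0.05) is a hypothetical illustration and is not derived from the stated AUC values; only the 1% prevalence comes from the question.

```python
def precision_from_rates(tpr, fpr, prevalence):
    """Precision implied by a ROC operating point at a given positive prevalence."""
    true_pos = prevalence * tpr            # fraction of all samples that are TP
    false_pos = (1.0 - prevalence) * fpr   # fraction of all samples that are FP
    return true_pos / (true_pos + false_pos)

# Hypothetical operating point that looks strong on a ROC curve.
p_rare = precision_from_rates(tpr=0.80, fpr=0.05, prevalence=0.01)
# Same operating point on a balanced problem, for contrast.
p_balanced = precision_from_rates(tpr=0.80, fpr=0.05, prevalence=0.50)

print(f"precision at 1% prevalence:  {p_rare:.3f}")
print(f"precision at 50% prevalence: {p_balanced:.3f}")
```

Because negatives outnumber positives 99 to 1, even a small false positive rate produces more false positives than there are true positives, so precision collapses while the ROC curve, which never looks at prevalence, is unchanged.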
3) Neural network binary classifier: activation, loss, and imbalance
Choose an output activation and loss. Explain how you would handle class imbalance using class weights or focal loss, and describe how each changes gradient contributions of positive vs. negative examples.
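To make the gradient comparison concrete, here is a minimal NumPy sketch of the per-example gradient with respect to the logit z, assuming a sigmoid output. It uses the standard focal loss form FL = -(1 - p_t)^gamma * log(p_t) without the optional alpha term; the function names and the pos_weight and gamma values are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def weighted_bce_grad(z, y, pos_weight=1.0):
    """d/dz of -[pos_weight*y*log(p) + (1-y)*log(1-p)], with p = sigmoid(z)."""
    p = sigmoid(z)
    w = np.where(y == 1, pos_weight, 1.0)
    return w * (p - y)  # class weights rescale the plain BCE gradient (p - y)

def focal_loss(z, y, gamma=2.0):
    p = sigmoid(z)
    p_t = np.where(y == 1, p, 1.0 - p)  # probability assigned to the true class
    return -((1.0 - p_t) ** gamma) * np.log(p_t)

def focal_grad(z, y, gamma=2.0):
    """Analytic d/dz of focal_loss; reduces to (p - y) when gamma == 0."""
    p = sigmoid(z)
    p_t = np.where(y == 1, p, 1.0 - p)
    sign = np.where(y == 1, 1.0, -1.0)  # dp_t/dz = sign * p_t * (1 - p_t)
    return sign * (1.0 - p_t) ** gamma * (gamma * p_t * np.log(p_t) - (1.0 - p_t))

# An easy negative (confidently scored low) and a hard positive (also scored low).
z = np.array([-4.0, -4.0])
y = np.array([0, 1])
print("plain BCE grad:   ", weighted_bce_grad(z, y))
print("weighted BCE grad:", weighted_bce_grad(z, y, pos_weight=99.0))
print("focal grad:       ", focal_grad(z, y, gamma=2.0))
```

Class weights multiply every positive example's gradient by the same constant regardless of how well it is classified, while focal loss shrinks the gradient of well-classified examples of either class, which in an imbalanced problem mostly means the many easy negatives.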
4) Threshold selection under a precision constraint
If the business requires precision ≥ 0.50, describe exactly how you would pick a probability threshold on validation data and avoid optimistic bias (e.g., nested CV or a hold-out).
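One possible selection procedure is sketched below, assuming validation labels and predicted probabilities are available as arrays. The function name pick_threshold and the toy data are hypothetical. Precision is not monotone in the threshold, so the sketch scans every distinct score and keeps the feasible threshold with the highest recall.

```python
import numpy as np

def pick_threshold(y_true, scores, min_precision=0.5):
    """Return (threshold, precision, recall) for the lowest threshold whose
    precision on this data meets the floor, or None if no threshold does.
    Predictions are taken to be positive when score >= threshold."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="stable")   # highest score first
    y_sorted, s_sorted = y_true[order], scores[order]

    tp = np.cumsum(y_sorted == 1)
    fp = np.cumsum(y_sorted == 0)
    if tp[-1] == 0:
        return None  # no positives: recall is undefined

    # Tied scores are predicted together, so evaluate only at the last
    # position of each distinct score value.
    last = np.r_[s_sorted[1:] != s_sorted[:-1], True]
    precision = tp[last] / (tp[last] + fp[last])
    recall = tp[last] / tp[-1]
    thresholds = s_sorted[last]

    feasible = np.flatnonzero(precision >= min_precision)
    if feasible.size == 0:
        return None
    i = feasible[-1]  # lowest feasible threshold, hence highest recall
    return float(thresholds[i]), float(precision[i]), float(recall[i])

# Toy validation data (illustrative only).
y_val = [1, 0, 1, 1, 0, 0, 0, 0]
s_val = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
print(pick_threshold(y_val, s_val, min_precision=0.5))
```

The threshold chosen this way is fit to the validation set, so its precision there is optimistically biased. A cautious workflow freezes the threshold, reports precision on an untouched hold-out (or the outer folds of nested CV), and may select against a floor slightly above 0.50 to leave a margin for sampling noise.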