This question evaluates a data scientist's competency in model evaluation for imbalanced binary classification: precision–recall dynamics, calibration, AUPRC, cost‑sensitive metrics, and practical implementation concerns such as computing thresholds and handling ties and numerical edge cases. Domain: Coding & Algorithms / machine learning model evaluation. It is commonly asked in technical interviews because it probes both conceptual understanding of evaluation trade‑offs and the practical ability to compute metrics efficiently and robustly and to choose thresholds under asymmetric costs and operational constraints.
You receive a CSV with two columns, actual_label ∈ {0,1} and predicted_prob ∈ [0,1]; the positive class rate is ≈5%.
a) Which evaluation metrics would you prioritize, and why? Consider the PR curve/AUPRC, calibration, and cost‑sensitive metrics, as well as the pitfalls of accuracy and ROC‑based metrics under heavy imbalance.
b) Write a Python function that returns thresholds, precision, recall, and F1 across all unique predicted_prob values; handle ties and empty denominators, and enforce monotonic precision if needed.
c) Compute AUPRC efficiently and discuss the effect of score calibration on the PR curve.
d) Describe how you would pick an operating threshold given asymmetric costs and volume constraints.
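A minimal sketch of how parts (b)–(d) might be approached, assuming NumPy and the two CSV columns already loaded as arrays. The names pr_sweep, auprc, pick_threshold, cost_fp, cost_fn, and max_positive_rate, as well as the precision = 1 / recall = 0 conventions for empty denominators, are illustrative assumptions rather than part of the question.

```python
import numpy as np

def pr_sweep(y_true, y_prob):
    """Precision, recall, and F1 at every unique score threshold (part b).

    Ties are handled by collapsing identical scores into a single
    operating point; empty denominators are guarded explicitly.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)

    # Sort by descending score so cumulative sums give TP/FP counts
    # for the rule "predict positive when predicted_prob >= threshold".
    order = np.argsort(-y_prob, kind="mergesort")
    y_true, y_prob = y_true[order], y_prob[order]
    tp_cum = np.cumsum(y_true)
    fp_cum = np.cumsum(1.0 - y_true)

    # Keep only the last index of each block of tied scores.
    last = np.r_[np.where(np.diff(y_prob) != 0)[0], y_prob.size - 1]
    thresholds = y_prob[last]
    tp, fp = tp_cum[last], fp_cum[last]

    total_pos = y_true.sum()
    pred_pos = tp + fp
    # Conventions for empty denominators: precision = 1 when nothing is
    # predicted positive, recall = 0 when there are no positives at all.
    precision = np.divide(tp, pred_pos, out=np.ones_like(tp), where=pred_pos > 0)
    recall = np.divide(tp, total_pos, out=np.zeros_like(tp), where=total_pos > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom,
                   out=np.zeros_like(denom), where=denom > 0)
    return thresholds, precision, recall, f1


def auprc(y_true, y_prob):
    """AUPRC via the average-precision sum (part c): precision weighted
    by recall increments, with no interpolation."""
    _, precision, recall, _ = pr_sweep(y_true, y_prob)
    recall_prev = np.r_[0.0, recall[:-1]]  # points arrive in increasing-recall order
    return float(np.sum((recall - recall_prev) * precision))


def pick_threshold(y_true, y_prob, cost_fp, cost_fn, max_positive_rate=None):
    """Cost-based operating point (part d): minimise cost_fp*FP + cost_fn*FN,
    optionally subject to a cap on the fraction of cases flagged positive."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    thresholds, _, recall, _ = pr_sweep(y_true, y_prob)

    total_pos = y_true.sum()
    tp = recall * total_pos
    pred_pos = np.array([(y_prob >= t).sum() for t in thresholds], dtype=float)
    fp = pred_pos - tp
    fn = total_pos - tp

    cost = cost_fp * fp + cost_fn * fn
    if max_positive_rate is not None:
        # Disallow thresholds that exceed the review-volume constraint.
        cost = np.where(pred_pos / y_true.size <= max_positive_rate, cost, np.inf)
    best = int(np.argmin(cost))
    return thresholds[best], float(cost[best])
```

The auprc helper follows the non‑interpolated average‑precision definition; if a monotone non‑increasing precision envelope is wanted for part (b), applying np.maximum.accumulate(precision[::-1])[::-1] to the pr_sweep output gives it.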