Choose ML metrics under asymmetric costs
Company: Meta
Role: Data Scientist
Category: Machine Learning
Difficulty: hard
Interview Round: Technical Screen
You own a binary classifier in production. Compare two products and choose metrics and decisions under asymmetric costs: (A) credit-card fraud detection and (B) cancer detection.
1) For each, define the business cost matrix and the metric you will optimize (e.g., expected cost, recall at fixed precision, PR AUC), and explain why ROC AUC may mislead.
2) Show how you’d set thresholds with cost curves or iso-F1/iso-precision lines, and how calibration (Platt scaling or isotonic regression) affects decision-making.
3) Handle extreme class imbalance (sampling, class weights, focal loss) and evaluate with stratified, time-aware CV.
4) Discuss fairness and false-positive burden by segment.
5) List online metrics and logs to monitor drift and feedback loops.
6) Explain how short-term metric gains could reduce long-term product engagement, and propose an experiment to measure long-term impact.
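For the threshold part of the prompt, a short derivation shows why a cost matrix and calibrated probabilities together determine the decision cutoff. The symbols below are assumptions introduced for illustration, not part of the original question: p is the calibrated probability P(y = 1 | x), and C_FP, C_FN, C_TP, C_TN are the costs of each outcome, with C_FP > C_TN and C_FN > C_TP.

```latex
\begin{align*}
\mathbb{E}[\text{cost} \mid \text{predict } 1] &= (1 - p)\, C_{FP} + p\, C_{TP} \\
\mathbb{E}[\text{cost} \mid \text{predict } 0] &= p\, C_{FN} + (1 - p)\, C_{TN} \\
\text{predict } 1 \iff (1 - p)(C_{FP} - C_{TN}) &\le p\,(C_{FN} - C_{TP}) \\
\iff p \ge t^{*} &= \frac{C_{FP} - C_{TN}}{(C_{FP} - C_{TN}) + (C_{FN} - C_{TP})}
\end{align*}
```

When correct decisions cost nothing, this reduces to t* = C_FP / (C_FP + C_FN). The cutoff is only meaningful if the scores are calibrated, which is one reason calibration matters for decisions and not only for reporting.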
Quick Answer: This question tests cost-sensitive binary classification: defining business cost matrices, selecting thresholds and calibrating probabilities, handling extreme class imbalance, assessing fairness across segments, monitoring for drift and feedback loops, and designing experiments that measure long-term product impact. It is a machine learning question commonly asked to probe practical judgment about precision-recall trade-offs, operational metrics, and product-level consequences, testing both conceptual understanding of decision theory and its application to production ML.
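As one illustration of the threshold-selection skill described above, here is a minimal sketch of choosing a cutoff by expected cost. It assumes only NumPy; the function names (`bayes_threshold`, `total_cost`, `empirical_threshold`) and the cost parameters `c_fp` and `c_fn` are hypothetical names introduced for this example, and correct decisions are assumed to cost zero.

```python
import numpy as np


def bayes_threshold(c_fp: float, c_fn: float) -> float:
    """Cost-optimal cutoff for calibrated probabilities.

    Assumes true positives and true negatives cost nothing.
    """
    return c_fp / (c_fp + c_fn)


def total_cost(y_true, scores, threshold, c_fp, c_fn) -> float:
    """Total cost on a labeled set when predicting positive at score >= threshold."""
    y_true = np.asarray(y_true)
    pred = np.asarray(scores) >= threshold
    fp = np.sum(pred & (y_true == 0))   # negatives flagged as positive
    fn = np.sum(~pred & (y_true == 1))  # positives that were missed
    return float(c_fp * fp + c_fn * fn)


def empirical_threshold(y_true, scores, c_fp, c_fn) -> float:
    """Scan every observed score as a cutoff and return the cheapest one.

    Useful when scores are not well calibrated; np.inf is included as a
    candidate so that "never flag anything" is also considered.
    """
    candidates = np.unique(np.concatenate([np.asarray(scores, dtype=float), [np.inf]]))
    costs = [total_cost(y_true, scores, t, c_fp, c_fn) for t in candidates]
    return float(candidates[int(np.argmin(costs))])
```

With well-calibrated scores the two approaches should roughly agree; a large gap between `bayes_threshold` and `empirical_threshold` on validation data is a hint that calibration is needed. In a cancer-screening setting, where a missed case is assumed to be far costlier than a false alarm, a large `c_fn` relative to `c_fp` pushes the cutoff toward zero, while a fraud setting with costly customer friction would raise it.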