Choose threshold under asymmetric costs
Company: Meta
Role: Data Scientist
Category: Machine Learning
Difficulty: Medium
Interview Round: Onsite
You own a credit-card fraud classifier deployed as a probability scorer. Choose an operating threshold under asymmetric costs and justify it quantitatively.
Assume per 1,000,000 transactions: base fraud rate = 0.20%. Costs: False Positive (decline a good transaction) = $15; False Negative (missed fraud) = $100. Consider three candidate thresholds with the following operating points on a representative validation set:
- T1: TPR = 0.90, FPR = 0.020
- T2: TPR = 0.80, FPR = 0.010
- T3: TPR = 0.65, FPR = 0.004
Tasks:
- For each threshold, compute the expected number of TP, FP, FN, TN and the expected total cost. Pick the threshold that minimizes expected cost and explain the business trade-offs.
- Explain how you would calibrate scores (e.g., Platt/Isotonic), monitor for dataset/label drift, and periodically re-tune the threshold by segment (country, merchant category, transaction amount).
- Propose guardrail metrics to detect harmful side effects (e.g., surge in declines for high-LTV users) and an experiment to validate changes without causing unacceptable customer pain.
Quick Answer: This question evaluates cost-sensitive classification, operating-threshold selection, score calibration, drift monitoring, and experiment and guardrail design competencies within the Machine Learning domain.