Cost-sensitive Thresholding for Fraud (ATO) Classifier
Context
You are evaluating a binary classifier for account takeover (ATO) fraud on a large validation set. The model outputs a score, and you choose the decision threshold. Fraud prevalence is initially 0.2%. Costs are asymmetric: a false positive (blocking a legitimate transfer) costs $2, while a false negative (letting a fraud through) costs $120. Assume 1,000,000 evaluated transfers.
The validation ROC operating points are:
| Threshold | TPR | FPR |
|---|---|---|
| 0.90 | 0.50 | 0.0010 |
| 0.80 | 0.65 | 0.0030 |
| 0.70 | 0.75 | 0.0060 |
| 0.60 | 0.82 | 0.0100 |
| 0.50 | 0.88 | 0.0180 |
Assume these TPR/FPR values are stable when prevalence shifts (ROC is prevalence-invariant) and that correct classifications have zero cost.
Tasks
A) At prevalence π = 0.2% (0.002), compute for each threshold the expected total cost over 1,000,000 transfers:
- Total cost = (FP × $2) + (FN × $120)
Then choose the cost-minimizing threshold and report PPV and NPV at that point.
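One way to set up the Task A arithmetic is sketched below. It assumes expected (fractional) counts rather than sampled ones, and the names `ROC_POINTS`, `confusion_counts`, and `evaluate` are illustrative helpers, not part of the exercise.

```python
# Operating points from the validation ROC table: (threshold, TPR, FPR).
ROC_POINTS = [
    (0.90, 0.50, 0.0010),
    (0.80, 0.65, 0.0030),
    (0.70, 0.75, 0.0060),
    (0.60, 0.82, 0.0100),
    (0.50, 0.88, 0.0180),
]

C_FP = 2.0    # cost of one false positive (blocked legitimate transfer)
C_FN = 120.0  # cost of one false negative (fraud let through)


def confusion_counts(tpr, fpr, prevalence, n=1_000_000):
    """Expected TP, FP, FN, TN counts for one operating point."""
    positives = prevalence * n
    negatives = (1.0 - prevalence) * n
    tp = tpr * positives
    fn = positives - tp
    fp = fpr * negatives
    tn = negatives - fp
    return tp, fp, fn, tn


def evaluate(prevalence, n=1_000_000):
    """Return one dict per threshold with expected cost, PPV, and NPV."""
    rows = []
    for threshold, tpr, fpr in ROC_POINTS:
        tp, fp, fn, tn = confusion_counts(tpr, fpr, prevalence, n)
        rows.append({
            "threshold": threshold,
            "cost": C_FP * fp + C_FN * fn,  # correct decisions cost nothing
            "ppv": tp / (tp + fp),
            "npv": tn / (tn + fn),
        })
    return rows


rows = evaluate(prevalence=0.002)
for row in rows:
    print(f"{row['threshold']:.2f}  cost={row['cost']:>10,.0f}  "
          f"PPV={row['ppv']:.4f}  NPV={row['npv']:.6f}")

best = min(rows, key=lambda r: r["cost"])
print("lowest expected cost at threshold", best["threshold"])
```

Because `prevalence` is a parameter, the same sketch can be rerun at another base rate without touching the TPR/FPR table.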
B) If prevalence drops to π = 0.1% (0.001) due to seasonality, recompute the expected costs and discuss whether the optimal threshold changes.
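To support the Task B discussion, it can help to ask at which prevalence two neighbouring thresholds have equal expected cost. The sketch below is one possible approach, with `break_even_prevalence` as a hypothetical helper; it equates the per-transfer cost C_fp × FPR × (1 − π) + C_fn × (1 − TPR) × π of two operating points and solves for π.

```python
C_FP = 2.0    # cost of one false positive
C_FN = 120.0  # cost of one false negative


def break_even_prevalence(point_a, point_b, c_fp=C_FP, c_fn=C_FN):
    """Prevalence at which two ROC points (tpr, fpr) cost the same.

    Equating c_fp*fpr*(1-p) + c_fn*(1-tpr)*p for both points gives
    p / (1 - p) = c_fp * d_fpr / (c_fn * d_tpr).
    Assumes the two points have different TPR values.
    """
    d_tpr = point_b[0] - point_a[0]
    d_fpr = point_b[1] - point_a[1]
    odds = (c_fp * d_fpr) / (c_fn * d_tpr)
    return odds / (1.0 + odds)


# (TPR, FPR) at thresholds 0.70, 0.60, and 0.50 from the table.
p_070 = (0.75, 0.0060)
p_060 = (0.82, 0.0100)
p_050 = (0.88, 0.0180)

lower = break_even_prevalence(p_070, p_060)
upper = break_even_prevalence(p_060, p_050)
print(f"threshold 0.60 beats both neighbours for "
      f"{lower:.4%} < prevalence < {upper:.4%}")
```

Comparing π = 0.2% and π = 0.1% against the printed interval shows how close each prevalence sits to a switch of operating point.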
C) Using ROC theory, derive the cost-optimal slope
- λ = (C_fp / C_fn) × ((1 − π) / π)
Explain how this slope maps to choosing a point on the ROC curve (i.e., the tangent condition on the ROC/ROC convex hull), and interpret how prevalence shifts move the optimal operating point without retraining.
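The tangent condition can be checked numerically. Since the per-transfer expected cost equals C_fn × π × (1 − TPR + λ × FPR), minimizing cost is the same as maximizing TPR − λ × FPR, which is the intercept of a line of slope λ through the operating point. The sketch below applies that rule to the tabulated points only (it ignores the trivial corners (0, 0) and (1, 1)); `optimal_slope` and `pick_operating_point` are illustrative names.

```python
# Operating points from the validation ROC table: (threshold, TPR, FPR).
ROC_POINTS = [
    (0.90, 0.50, 0.0010),
    (0.80, 0.65, 0.0030),
    (0.70, 0.75, 0.0060),
    (0.60, 0.82, 0.0100),
    (0.50, 0.88, 0.0180),
]


def optimal_slope(prevalence, c_fp=2.0, c_fn=120.0):
    """Iso-cost slope: lambda = (C_fp / C_fn) * ((1 - pi) / pi)."""
    return (c_fp / c_fn) * (1.0 - prevalence) / prevalence


def pick_operating_point(prevalence, points=ROC_POINTS):
    """Return the tabulated point where a line of slope lambda touches the ROC.

    The touching point is the one with the largest intercept TPR - lambda*FPR.
    """
    lam = optimal_slope(prevalence)
    return max(points, key=lambda p: p[1] - lam * p[2])


for prevalence in (0.002, 0.001, 0.0008):
    threshold, tpr, fpr = pick_operating_point(prevalence)
    print(f"pi={prevalence:.4f}  lambda={optimal_slope(prevalence):6.2f}  "
          f"threshold={threshold:.2f}  (TPR={tpr}, FPR={fpr})")
```

Lower prevalence raises λ, so the touching point slides toward the low-FPR end of the curve. The model itself is unchanged; only the threshold moves.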