How do I approach Statistics & Math interview questions?

Statistics & Math questions require understanding of core concepts and practice. PracHub provides solutions with explanations to help you master statistics & math interviews.

What difficulty level is this interview question?

This is a medium difficulty Statistics & Math question, commonly asked during Technical Screen rounds at PayPal.

What role is this question designed for?

This question is commonly asked for Data Scientist candidates at PayPal during technical interviews.

Optimize thresholds under fraud costs | PayPal Interview Question

Quick Overview

This question evaluates a candidate's competency in cost-sensitive binary classification, threshold selection, and ROC-based operating-point analysis, including calculating expected costs and predictive values under varying prevalence and asymmetric error costs.

Cost-sensitive Thresholding for Fraud (ATO) Classifier

Context

You are evaluating a binary classifier for account takeover (ATO) fraud on a large validation set. The model outputs a score, and you choose the decision threshold. Fraud prevalence is initially 0.2%. Costs are asymmetric: a false positive (blocking a legitimate transfer) costs $2, while a false negative (letting a fraud through) costs $120. Assume 1,000,000 evaluated transfers.

The validation ROC operating points are:

Threshold	TPR	FPR
0.90	0.50	0.0010
0.80	0.65	0.0030
0.70	0.75	0.0060
0.60	0.82	0.0100
0.50	0.88	0.0180

Assume these TPR/FPR values are stable when prevalence shifts (ROC is prevalence-invariant) and that correct classifications have zero cost.

Tasks

A) At prevalence π = 0.2% (0.002), compute for each threshold the expected total cost over 1,000,000 transfers:

Total cost = (FP × $2) + (FN × $120)

Then choose the cost-minimizing threshold and report PPV and NPV at that point.
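As a rough sketch of how Task A could be worked through, the Python below computes the expected cost at each threshold and the PPV/NPV at the cheapest one. The names ROC_POINTS, expected_cost, and predictive_values are illustrative choices, not part of the question; the numbers come straight from the ROC table and cost figures given above.

```python
# ROC operating points from the question: (threshold, TPR, FPR)
ROC_POINTS = [
    (0.90, 0.50, 0.0010),
    (0.80, 0.65, 0.0030),
    (0.70, 0.75, 0.0060),
    (0.60, 0.82, 0.0100),
    (0.50, 0.88, 0.0180),
]


def expected_cost(tpr, fpr, prevalence, n=1_000_000, c_fp=2.0, c_fn=120.0):
    """Expected total cost: FP * c_fp + FN * c_fn (correct decisions cost 0)."""
    n_fraud = n * prevalence
    n_legit = n - n_fraud
    false_positives = fpr * n_legit
    false_negatives = (1.0 - tpr) * n_fraud
    return c_fp * false_positives + c_fn * false_negatives


def predictive_values(tpr, fpr, prevalence):
    """Return (PPV, NPV) from TPR, FPR, and prevalence via Bayes' rule."""
    tp = tpr * prevalence
    fn = (1.0 - tpr) * prevalence
    fp = fpr * (1.0 - prevalence)
    tn = (1.0 - fpr) * (1.0 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


PREVALENCE = 0.002
costs = {thr: expected_cost(tpr, fpr, PREVALENCE) for thr, tpr, fpr in ROC_POINTS}
best_threshold = min(costs, key=costs.get)
best_tpr, best_fpr = next((t, f) for thr, t, f in ROC_POINTS if thr == best_threshold)
ppv, npv = predictive_values(best_tpr, best_fpr, PREVALENCE)

for thr, cost in costs.items():
    print(f"threshold={thr:.2f}  expected cost=${cost:,.0f}")
print(f"best threshold={best_threshold:.2f}  PPV={ppv:.4f}  NPV={npv:.6f}")
```

Under these inputs the minimum falls at threshold 0.60 (about $63,160), with PPV near 14.1% and NPV near 99.96%. The low PPV is typical at this prevalence: most flagged transfers are legitimate, yet blocking them is still cheaper than missing fraud.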

B) If prevalence drops to π = 0.1% (0.001) due to seasonality, recompute the expected costs and discuss whether the optimal threshold changes.
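One way to approach Task B is to rerun the same cost calculation at both prevalences and compare the ranking. The sketch below does that; cost_table and results are hypothetical names introduced for illustration, and the ROC points are assumed stable across prevalence, as the question states.

```python
# ROC operating points from the question: (threshold, TPR, FPR)
ROC_POINTS = [
    (0.90, 0.50, 0.0010),
    (0.80, 0.65, 0.0030),
    (0.70, 0.75, 0.0060),
    (0.60, 0.82, 0.0100),
    (0.50, 0.88, 0.0180),
]


def cost_table(prevalence, n=1_000_000, c_fp=2.0, c_fn=120.0):
    """Return [(threshold, expected total cost)] for one prevalence."""
    rows = []
    for thr, tpr, fpr in ROC_POINTS:
        false_positives = fpr * (1.0 - prevalence) * n
        false_negatives = (1.0 - tpr) * prevalence * n
        rows.append((thr, c_fp * false_positives + c_fn * false_negatives))
    return rows


results = {}
for prevalence in (0.002, 0.001):
    rows = cost_table(prevalence)
    best = min(rows, key=lambda row: row[1])  # cheapest (threshold, cost)
    results[prevalence] = (best, rows)
    print(f"prevalence={prevalence:.3%}")
    for thr, cost in rows:
        marker = "  <- minimum" if thr == best[0] else ""
        print(f"  threshold={thr:.2f}  cost=${cost:,.0f}{marker}")
```

With these numbers the optimum stays at 0.60 when prevalence halves (about $41,580), but 0.70 is now only a few hundred dollars behind (about $41,988), so a slightly larger drop in prevalence would favor the stricter threshold.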

C) Using ROC theory, derive the cost-optimal slope

λ = (C_fp / C_fn) × ((1 − π) / π)

Explain how this slope maps to choosing a point on the ROC curve (i.e., the tangent condition on the ROC/ROC convex hull), and interpret how prevalence shifts move the optimal operating point without retraining.
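For Task C, the expected cost per transfer is C_fp × (1 − π) × FPR + C_fn × π × (1 − TPR). Holding that cost fixed gives straight lines in ROC space with slope λ, so the cheapest point is the one where a line of slope λ has the highest intercept, TPR − λ × FPR. The sketch below applies that rule to the tabulated points only; cost_slope and optimal_point are illustrative names, and the (0, 0) and (1, 1) corners of the ROC curve are deliberately left out.

```python
# ROC operating points from the question: (threshold, TPR, FPR)
ROC_POINTS = [
    (0.90, 0.50, 0.0010),
    (0.80, 0.65, 0.0030),
    (0.70, 0.75, 0.0060),
    (0.60, 0.82, 0.0100),
    (0.50, 0.88, 0.0180),
]


def cost_slope(prevalence, c_fp=2.0, c_fn=120.0):
    """Iso-cost slope in ROC space: (C_fp / C_fn) * ((1 - pi) / pi)."""
    return (c_fp / c_fn) * ((1.0 - prevalence) / prevalence)


def optimal_point(prevalence):
    """Tabulated point with the highest iso-cost intercept TPR - slope * FPR."""
    slope = cost_slope(prevalence)
    return max(ROC_POINTS, key=lambda p: p[1] - slope * p[2])


# Slopes of the segments between adjacent operating points (dTPR / dFPR).
segment_slopes = [
    (b[1] - a[1]) / (b[2] - a[2]) for a, b in zip(ROC_POINTS, ROC_POINTS[1:])
]
print("segment slopes:", [round(s, 2) for s in segment_slopes])

for prevalence in (0.005, 0.002, 0.001):
    thr, tpr, fpr = optimal_point(prevalence)
    print(f"pi={prevalence:.3f}  slope={cost_slope(prevalence):.2f}  "
          f"optimal threshold={thr:.2f}")
```

The segment slopes fall from 75 down to 7.5, so the points form a concave curve and the optimum sits where the segment slope crosses λ. At π = 0.002 and π = 0.001 the slopes are about 8.3 and 16.7, both between the 17.5 and 7.5 segments, which selects threshold 0.60. Lower prevalence raises λ and pushes the optimum toward stricter thresholds; higher prevalence, such as the 0.005 case, lowers λ and pushes it toward more lenient ones. The model itself is unchanged throughout; only the threshold moves.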
