You are given a one-page case during a hiring manager round for a Fraud Data Scientist role.
Current state:
- The existing fraud model is performing poorly: only ~40% of fraud is being intercepted (assume “intercepted” means blocked or prevented before loss).
- A large share of fraud appears to originate from “emerging regions.”
- You have limited resources (limited engineering time, limited manual review capacity, and limited ability to deploy many new data sources quickly).
- If you tune the system to a very low precision (e.g., ~2% precision), customer complaints will increase materially.
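To make the precision constraint concrete, the short sketch below converts a precision value into the number of legitimate transactions flagged for each fraudulent one caught. The function name is hypothetical, and the sketch assumes precision is measured over flagged transactions (true fraud divided by everything flagged).

```python
def false_positives_per_fraud(precision: float) -> float:
    """Legitimate transactions flagged for each fraudulent transaction caught.

    Assumes precision = true positives / (true positives + false positives).
    """
    if not 0.0 < precision <= 1.0:
        raise ValueError("precision must be in (0, 1]")
    return (1.0 - precision) / precision


if __name__ == "__main__":
    # 2% is the figure from the case; the other values are for comparison only.
    for p in (0.02, 0.10, 0.50):
        ratio = false_positives_per_fraud(p)
        print(f"precision={p:.0%}: about {ratio:.0f} legitimate flagged per fraud caught")
```

At ~2% precision, roughly 49 legitimate customers are affected for every fraud case stopped, which is why complaints rise so sharply at that operating point.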
Task:
- Clarify what questions you would ask to properly define the problem (labels, loss definition, action space, constraints).
- Propose a concrete, phased fraud strategy to reduce fraud loss while controlling customer harm. Include segmentation, thresholding, and what actions you would take at different risk levels (block vs step-up vs review).
- Describe how you would quantify tradeoffs and decide on operating points (thresholds/policies). Include a simple expected-value framework.
- Explain how you would measure success after launch and how you would guard against biased evaluation due to label delay and selective outcomes (e.g., blocked transactions don’t charge back).
You may assume you have access to historical transaction outcomes, disputes/chargebacks, basic device/IP signals, and region information.
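
As one possible shape for the expected-value framework the task asks for, the sketch below compares the expected cost of each action (allow, step-up, review, block) for a single transaction and picks the cheapest. It is only an illustration, not a reference answer: every cost and rate in `Costs` is a hypothetical placeholder that would need to be estimated from historical data, and the model score is assumed to be a calibrated fraud probability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Costs:
    # All values below are hypothetical placeholders, not figures from the case.
    good_block_cost: float = 15.0            # lost margin + complaint handling per blocked good customer
    step_up_abandon_rate: float = 0.10       # share of good customers who abandon at step-up
    step_up_fraud_pass_rate: float = 0.20    # share of fraud that still passes step-up
    review_cost: float = 3.0                 # analyst cost per manual review
    review_miss_rate: float = 0.10           # share of fraud approved by reviewers
    review_false_decline_rate: float = 0.05  # share of good transactions declined by reviewers


def expected_costs(p_fraud: float, amount: float, c: Costs = Costs()) -> dict:
    """Expected cost of each action, given a calibrated fraud probability."""
    p_good = 1.0 - p_fraud
    return {
        # Allow: the full amount is lost if the transaction is fraud.
        "allow": p_fraud * amount,
        # Block: no fraud loss, but good customers are harmed.
        "block": p_good * c.good_block_cost,
        # Step-up: some fraud passes, some good customers abandon.
        "step_up": p_fraud * c.step_up_fraud_pass_rate * amount
        + p_good * c.step_up_abandon_rate * c.good_block_cost,
        # Review: fixed analyst cost plus reviewer errors in both directions.
        "review": c.review_cost
        + p_fraud * c.review_miss_rate * amount
        + p_good * c.review_false_decline_rate * c.good_block_cost,
    }


def best_action(p_fraud: float, amount: float, c: Costs = Costs()) -> str:
    """Action with the lowest expected cost for this transaction."""
    costs = expected_costs(p_fraud, amount, c)
    return min(costs, key=costs.get)


if __name__ == "__main__":
    for p, amt in [(0.001, 50.0), (0.05, 100.0), (0.9, 200.0)]:
        print(p, amt, best_action(p, amt))
```

This per-transaction view ignores the limited manual review capacity stated in the case; a fuller treatment would cap the number of transactions routed to review and would set `Costs` separately per segment (for example, emerging regions versus established ones).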