Design a fraud mitigation strategy under constraints
Company: PayPal
Role: Data Scientist
Category: Analytics & Experimentation
Difficulty: hard
Interview Round: Technical Screen
You are given a one-page case during a hiring manager round for a Fraud Data Scientist role.
Current state:
- The existing fraud model is performing poorly: only ~40% of fraud is being intercepted (assume “intercepted” means blocked or prevented before loss).
- A large share of fraud appears to originate from “emerging regions.”
- Resources are limited: engineering time, manual review capacity, and the ability to deploy new data sources quickly are all constrained.
- If you tune the system so aggressively that precision drops very low (e.g., ~2%, meaning about 49 of every 50 flagged transactions are legitimate), customer complaints will increase materially.
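To make the precision constraint concrete, the minimal sketch below converts a precision level into the number of legitimate transactions flagged for each fraudulent one caught. The function name and the sample precision values other than 2% are illustrative assumptions, not part of the case.

```python
def false_positives_per_true_positive(precision: float) -> float:
    """Legitimate transactions flagged for each fraudulent transaction caught."""
    if not 0.0 < precision <= 1.0:
        raise ValueError("precision must be in (0, 1]")
    # precision = TP / (TP + FP)  =>  FP / TP = (1 - precision) / precision
    return (1.0 - precision) / precision


if __name__ == "__main__":
    # 2% comes from the case; 10% and 50% are hypothetical comparison points.
    for p in (0.02, 0.10, 0.50):
        ratio = false_positives_per_true_positive(p)
        print(f"precision={p:.0%}: about {ratio:.0f} good transactions flagged per fraud caught")
```

The ratio grows quickly as precision falls, which is why the case treats very low precision as a customer-harm constraint and not only a modeling metric.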
Task:
1) Clarify what questions you would ask to properly define the problem (labels, loss definition, action space, constraints).
2) Propose a concrete, phased fraud strategy to reduce fraud loss while controlling customer harm. Include segmentation, thresholding, and what actions you would take at different risk levels (block vs step-up vs review).
3) Describe how you would quantify tradeoffs and decide on operating points (thresholds/policies). Include a simple expected-value framework.
4) Explain how you would measure success after launch and how you would guard against biased evaluation due to label delay and selective outcomes (e.g., blocked transactions don’t charge back).
You may assume you have access to historical transaction outcomes, disputes/chargebacks, basic device/IP signals, and region information.
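One possible shape for the "simple expected-value framework" in task 3 is sketched below: for each transaction, compute the expected cost of every available action (approve, step-up, review, block) and pick the cheapest. This is a sketch under stated assumptions, not a prescribed answer. The `Costs` fields, the function names, and every numeric value are hypothetical placeholders that would have to be estimated from historical data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Costs:
    """Hypothetical cost parameters; all values must be estimated from data."""
    fp_block_cost: float       # cost of blocking a good customer (lost margin, complaint handling)
    step_up_friction: float    # expected cost of challenging a good customer (e.g., abandonment)
    step_up_catch_rate: float  # share of fraud stopped by the step-up challenge
    review_cost: float         # cost of one manual review
    review_catch_rate: float   # share of fraud a reviewer stops


def expected_costs(p_fraud: float, amount: float, c: Costs) -> dict:
    """Expected cost of each action for one transaction with fraud probability p_fraud."""
    return {
        "approve": p_fraud * amount,
        "step_up": (1 - p_fraud) * c.step_up_friction
                   + p_fraud * (1 - c.step_up_catch_rate) * amount,
        "review": c.review_cost + p_fraud * (1 - c.review_catch_rate) * amount,
        "block": (1 - p_fraud) * c.fp_block_cost,
    }


def best_action(p_fraud: float, amount: float, c: Costs) -> str:
    """Action with the lowest expected cost."""
    costs = expected_costs(p_fraud, amount, c)
    return min(costs, key=costs.get)


def block_vs_approve_threshold(amount: float, c: Costs) -> float:
    """Fraud probability at which blocking and approving cost the same."""
    # p * amount = (1 - p) * fp_block_cost  =>  p = fp / (fp + amount)
    return c.fp_block_cost / (c.fp_block_cost + amount)


if __name__ == "__main__":
    costs = Costs(fp_block_cost=20.0, step_up_friction=0.5,
                  step_up_catch_rate=0.8, review_cost=3.0, review_catch_rate=0.95)
    for p in (0.001, 0.05, 0.3, 0.9):
        print(p, best_action(p, amount=100.0, c=costs))
```

Because the break-even point depends on the transaction amount and on segment-specific costs, the same score can justify different actions in different segments (for example, emerging regions versus established ones). The sketch ignores the manual review capacity limit; in practice review slots would have to be rationed, for instance by ranking candidates on expected savings.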
Quick Answer: This interview prompt evaluates a Data Scientist's skills in fraud risk modeling, constrained resource prioritization, segmentation and thresholding, and experimental evaluation including handling label delay and selection bias.
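For the selective-outcomes problem in task 4, one common guard is a small random holdout: a fixed fraction of would-be-blocked transactions is approved anyway so that their true outcomes can be observed, and only transactions older than a label-maturity window are counted. The sketch below assumes such a holdout exists; the 90-day window, the holdout rate, and all names are hypothetical.

```python
from datetime import date, timedelta
from typing import Iterable, NamedTuple, Optional, Tuple


class HoldoutTxn(NamedTuple):
    txn_date: date
    amount: float
    is_fraud: bool  # label as known on the evaluation date


def estimate_blocked_population(
    holdout: Iterable[HoldoutTxn],
    holdout_rate: float,
    as_of: date,
    maturity_days: int = 90,  # assumed chargeback window, not from the case
) -> Optional[Tuple[float, float]]:
    """Return (estimated block precision, estimated prevented loss), or None if no labels have matured."""
    if not 0.0 < holdout_rate < 1.0:
        raise ValueError("holdout_rate must be in (0, 1)")
    cutoff = as_of - timedelta(days=maturity_days)
    # Only use transactions old enough for disputes/chargebacks to have arrived.
    matured = [t for t in holdout if t.txn_date <= cutoff]
    if not matured:
        return None
    precision = sum(t.is_fraud for t in matured) / len(matured)
    holdout_fraud_loss = sum(t.amount for t in matured if t.is_fraud)
    # Each holdout transaction stands in for (1 - h) / h blocked transactions.
    prevented_loss = holdout_fraud_loss * (1 - holdout_rate) / holdout_rate
    return precision, prevented_loss
```

The holdout itself costs money, since some fraud is knowingly let through, so its rate is another operating point to justify with the same expected-value reasoning.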