Design a robust fraud detection system
Company: Capital One
Role: Data Scientist
Category: Machine Learning
Difficulty: hard
Interview Round: Onsite
You’re tasked with building a real-time fraud detector for card transactions.
Context:
- Class imbalance: fraud rate ≈ 0.2%.
- Labels arrive with a 14-day delay (chargebacks/confirmed fraud).
- Latency SLO: p95 inference < 50 ms; throughput 2k TPS.
- Cost matrix (per decision): FP = $5 (lost conversion + manual review), FN = $200 (average fraud loss after recovery).
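
As a worked reading of the cost matrix: if correctly handled transactions are assumed to cost nothing (an assumption, not stated above) and p is a calibrated fraud probability, flagging costs (1 - p) × $5 in expectation and approving costs p × $200, so flagging is cheaper once p exceeds 5 / 205. A minimal Python sketch of that arithmetic, with a hypothetical function name:

```python
def break_even_threshold(fp_cost: float, fn_cost: float) -> float:
    """Fraud probability above which flagging has lower expected cost than approving.

    Assumes true positives and true negatives cost nothing (an assumption,
    not part of the stated cost matrix) and that p is a calibrated probability.
    """
    # Expected cost of flagging:  (1 - p) * fp_cost
    # Expected cost of approving: p * fn_cost
    # Setting the two equal gives p* = fp_cost / (fp_cost + fn_cost).
    return fp_cost / (fp_cost + fn_cost)


threshold = break_even_threshold(fp_cost=5.0, fn_cost=200.0)
print(f"flag when p > {threshold:.4f}")  # 5 / 205, roughly 0.0244
```

The break-even point sits far below 0.5 because a missed fraud is 40 times as costly as a false alarm; any real threshold would also have to respect operational limits such as review capacity.
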
Tasks:
1) Data/labeling: Describe how you would construct time-aware train/validation/test splits and avoid leakage from post-transaction outcomes (e.g., chargeback windows, reversals). Specify a concrete split scheme and rationale.
2) Features: Propose 10+ robust features (e.g., velocity, device/merchant risk, graph features). Explain handling of high-cardinality categoricals and target leakage pitfalls. How would you implement feature freshness guarantees?
3) Modeling: Compare supervised (e.g., XGBoost, calibrated deep nets) vs anomaly detection (e.g., Isolation Forest) given sparse positives. When would you hybridize?
4) Evaluation: Choose metrics and justify them (PR AUC vs ROC AUC vs expected cost). Design a thresholding procedure that minimizes expected cost (equivalently, maximizes expected profit) under the given cost matrix. Show the exact optimization objective and how you’d calibrate probabilities.
5) Drift/monitoring: Define concrete drift and performance monitors (population shifts, PSI/JS divergence, calibration, cost per transaction). How would you operate during the 14-day label delay?
6) Online rollout: Propose a safe shadow/holdback plan and guardrails to cap business risk (e.g., block-rate ceilings, human-in-the-loop). How do you reconcile offline metrics with online KPIs?
7) Adversarial behavior: Describe 3 defenses against adaptive fraudsters (e.g., randomization, ensembling with behavior-based models, canary features) and how you’d validate they work.
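
For Task 1, one possible split scheme can be sketched as follows. This is an illustration under stated assumptions, not a reference answer: the column name `event_time`, the fold boundaries, and the function name are all hypothetical. The idea is to order folds chronologically, leave a 14-day gap before each later fold (a model trained on data up to a cutoff could only be trained with mature labels about 14 days later), and drop rows whose labels have not yet matured.

```python
import pandas as pd

LABEL_DELAY = pd.Timedelta(days=14)  # label maturity window from the Context section


def time_aware_split(df, time_col, train_end, valid_end, test_end, as_of):
    """Split transactions chronologically with a label-delay gap between folds.

    Rows newer than `as_of - LABEL_DELAY` are dropped from every fold because
    their fraud labels may still change.
    """
    train_end, valid_end, test_end, as_of = map(
        pd.Timestamp, (train_end, valid_end, test_end, as_of)
    )
    t = df[time_col]
    mature = t <= as_of - LABEL_DELAY
    train = df[mature & (t < train_end)]
    # Each later fold starts one label delay after the previous cutoff.
    valid = df[mature & (t >= train_end + LABEL_DELAY) & (t < valid_end)]
    test = df[mature & (t >= valid_end + LABEL_DELAY) & (t < test_end)]
    return train, valid, test


# Toy data: one transaction per day; a real table would also carry features and labels.
txns = pd.DataFrame({"event_time": pd.date_range("2024-01-01", "2024-06-30", freq="D")})
train, valid, test = time_aware_split(
    txns,
    "event_time",
    train_end="2024-03-01",
    valid_end="2024-04-15",
    test_end="2024-06-01",
    as_of="2024-06-30",
)
print(len(train), len(valid), len(test))
```

Features would need the same discipline: anything computed from chargebacks or reversals should only use outcomes that were already known at the transaction's timestamp.
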
Quick Answer: This question evaluates end-to-end machine learning system design for real-time fraud detection: time-aware data splitting, feature engineering with high-cardinality categoricals under severe class imbalance, model selection under latency and cost constraints, calibration and thresholding, monitoring during the delayed-label period, safe online rollout, and adversarial defenses. It is commonly asked to assess whether a candidate can balance statistical trade-offs against production engineering requirements, and it calls for a working understanding of delayed labels, cost-sensitive evaluation, and operational monitoring.
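
Monitoring during the delayed-label period generally has to rely on signals that need no labels, such as drift in the score or feature distributions. The sketch below shows one common way to compute the PSI named in Task 5; the function name, the quantile binning, and the `eps` floor are illustrative choices, not a prescribed implementation.

```python
import numpy as np


def population_stability_index(expected, actual, n_bins=10, eps=1e-6):
    """PSI between a reference sample and a recent sample of one numeric feature or score."""
    expected = np.asarray(expected, dtype=float)
    actual = np.asarray(actual, dtype=float)
    # Bin edges come from reference quantiles; np.unique removes duplicate edges caused by ties.
    edges = np.unique(np.quantile(expected, np.linspace(0.0, 1.0, n_bins + 1)))
    # Use interior edges only, so recent values outside the reference range fall into the end bins.
    interior = edges[1:-1]
    n = len(interior) + 1
    exp_counts = np.bincount(np.searchsorted(interior, expected, side="right"), minlength=n)
    act_counts = np.bincount(np.searchsorted(interior, actual, side="right"), minlength=n)
    # Floor the fractions so empty bins do not produce log(0) or division by zero.
    exp_frac = np.clip(exp_counts / expected.size, eps, None)
    act_frac = np.clip(act_counts / actual.size, eps, None)
    return float(np.sum((act_frac - exp_frac) * np.log(act_frac / exp_frac)))


rng = np.random.default_rng(0)
reference = rng.normal(size=5000)           # e.g. scores from the validation window
recent_same = rng.normal(size=5000)         # same distribution
recent_shifted = rng.normal(loc=1.0, size=5000)  # mean shifted by one standard deviation
print(population_stability_index(reference, recent_same))
print(population_stability_index(reference, recent_shifted))
```

Alert thresholds for PSI are a convention rather than a derived quantity, so they would need tuning against historical windows before being used as a guardrail.
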