Real-Time Fraud Detection with XGBoost (Subscription Payments)
Scenario
You need to build and operate a real-time system that flags potentially fraudulent subscription-payment transactions with sub-second latency. Historical labels come from chargebacks and refunds, which arrive weeks after the transaction. The data includes transaction attributes, user/account metadata, device/network signals, and historical behavior.
Task
Outline the end-to-end approach, covering:
- End-to-end workflow
  - Data ingestion, labeling, feature engineering (batch + streaming), training/validation protocol, hyperparameter tuning, offline–online feature parity, deployment architecture, and a feedback loop.
- Evaluation metrics
  - Which metrics you would prioritize in an imbalanced, high-stakes setting and why.
- Handling severe class imbalance
  - Approaches such as class weighting, sampling, threshold tuning, and any loss/metric choices.
- Monitoring for model drift post-deployment
  - Describe one concrete strategy to detect and respond to drift.
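
As an illustration of the class-weighting and threshold-tuning items above, here is a minimal sketch in plain Python. It assumes a held-out set of model scores and matured labels already exists; the helper names (scale_pos_weight_from_labels, pick_threshold) and the toy data are hypothetical. The negative-to-positive ratio is a common starting value for XGBoost's scale_pos_weight parameter, not a tuned one, and the precision floor stands in for whatever review capacity or false-positive budget the business sets.

```python
from typing import List, Optional, Tuple


def scale_pos_weight_from_labels(labels: List[int]) -> float:
    """Negatives / positives: a common starting value for XGBoost's scale_pos_weight."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0:
        raise ValueError("no positive (fraud) examples in labels")
    return neg / pos


def pick_threshold(
    scores: List[float], labels: List[int], min_precision: float
) -> Optional[Tuple[float, float, float]]:
    """Return (threshold, precision, recall) maximizing recall subject to
    precision >= min_precision. A transaction is flagged when score >= threshold.
    Ties on recall are broken in favor of higher precision.
    Returns None if no threshold meets the precision floor."""
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("no positive (fraud) examples in labels")
    best = None
    for t in sorted(set(scores)):
        flagged = [y for s, y in zip(scores, labels) if s >= t]
        tp = sum(flagged)
        precision = tp / len(flagged)  # flagged is never empty: t is an observed score
        recall = tp / total_pos
        if precision < min_precision:
            continue
        if best is None or (recall, precision) > (best[2], best[1]):
            best = (t, precision, recall)
    return best


# Toy validation data (hypothetical): 2 frauds among 10 transactions.
labels = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]
scores = [0.05, 0.10, 0.20, 0.15, 0.30, 0.02, 0.40, 0.80, 0.70, 0.90]

weight = scale_pos_weight_from_labels(labels)
choice = pick_threshold(scores, labels, min_precision=0.6)
```

Choosing the threshold on a validation split that is later in time than the training data keeps the operating point honest under delayed labels. Note that reweighting the positive class shifts the predicted probabilities, so the threshold should be picked after weighting, on scores from the final model.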
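
One possible drift strategy for the last item is to compare the live distribution of a feature (or of the model score) against a training-time reference using the Population Stability Index (PSI). The sketch below is plain Python under stated assumptions: the bin edges are fixed from the reference data, the function names (psi, drift_action) are hypothetical, and the 0.1 / 0.25 cut-offs are a widely used rule of thumb rather than values given in the scenario.

```python
import math
from typing import List, Sequence


def psi(
    expected: Sequence[float],
    actual: Sequence[float],
    edges: Sequence[float],
    eps: float = 1e-6,
) -> float:
    """Population Stability Index between a reference sample (expected)
    and a live sample (actual), using fixed bin edges.
    PSI = sum over bins of (q - p) * ln(q / p), where p and q are the
    reference and live bin proportions."""

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * (len(edges) + 1)
        for v in values:
            # Bin index = number of edges at or below the value.
            idx = sum(1 for e in edges if v >= e)
            counts[idx] += 1
        n = len(values)
        # Floor at eps so empty bins do not produce log(0) or division by zero.
        return [max(c / n, eps) for c in counts]

    p = proportions(expected)
    q = proportions(actual)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


def drift_action(value: float) -> str:
    """Map a PSI value to a response (rule-of-thumb cut-offs, an assumption)."""
    if value < 0.1:
        return "ok"
    if value < 0.25:
        return "investigate"
    return "retrain"


# Toy example (hypothetical): the share of high-valued observations jumps from 50% to 90%.
reference = [0.1] * 50 + [0.6] * 50
live = [0.1] * 10 + [0.6] * 90
value = psi(reference, live, edges=[0.5])
action = drift_action(value)
```

In practice this check would run on a schedule per feature and on the score distribution. Because chargeback labels lag by weeks, input and score drift of this kind is usually the earliest available signal, with label-based metrics confirming it later.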