End-to-end ML Case: Real-time Detection of Venmo Account Takeover (ATO) at Authorization
Context
Design a real-time machine learning system that scores Venmo payment authorization events for ATO risk. The system must operate under strict latency and data constraints while dealing with delayed and noisy labels.
Requirements
- P99 scoring latency: < 20 ms per transaction
- Online features: Available via a low-latency feature store (e.g., Redis)
- Labels: Confirmed ATO/chargebacks, median delay ≈ 45 days
- Volume: 1M+ transactions per day
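To make the label-delay requirement concrete, here is a minimal sketch of assigning training labels "as of" a cutoff date. It is illustrative only: the `Txn` record, the `label_as_of` helper, and the 90-day maturity window are assumptions, not part of the case. With a median confirmation delay of about 45 days, roughly half of the eventual positives are still unconfirmed at day 45, so recent unconfirmed transactions are treated as unlabeled rather than negative.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Txn:
    txn_id: str
    event_time: datetime
    # Time the ATO/chargeback confirmation arrived; None if none has arrived.
    confirmed_ato_time: Optional[datetime]


def label_as_of(txn: Txn, cutoff: datetime,
                maturity: timedelta = timedelta(days=90)) -> Optional[int]:
    """Return 1 (positive), 0 (matured negative), or None (unlabeled).

    The 90-day default is an assumed maturity window, not a value from the case.
    """
    # Only confirmations that were already known at the cutoff may be used;
    # anything later would leak future information into training.
    if txn.confirmed_ato_time is not None and txn.confirmed_ato_time <= cutoff:
        return 1
    # Old enough that a confirmation would probably have arrived by now.
    if txn.event_time <= cutoff - maturity:
        return 0
    # Too recent to call: keep out of the negative set (the PU "unlabeled" pool).
    return None


cutoff = datetime(2024, 6, 1)
confirmed = Txn("a", datetime(2024, 5, 1), datetime(2024, 5, 20))
matured = Txn("b", datetime(2024, 1, 1), None)
recent = Txn("c", datetime(2024, 5, 1), None)
late_confirmed = Txn("d", datetime(2024, 1, 1), datetime(2024, 6, 10))
```

Note that `late_confirmed` is labeled 0 as of the cutoff even though it is later confirmed as ATO. This is the residual label noise that a longer maturity window reduces, at the cost of training on older data.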
Tasks
A) Precisely define the positive label (ATO) and the negative set. Discuss positive–unlabeled (PU) learning and how to construct reliable training data with delayed/noisy labels.
B) Propose features across device, IP, behavior, network/graph, and account age. For at least three features, specify leakage risks and how you would time-travel-proof them.
C) Select and justify a model family (e.g., gradient boosting with monotonic constraints). Describe probability calibration (Platt vs. isotonic) and how to maintain calibration across account-age cohorts.
D) Describe an offline evaluation protocol (time-based split, label-latency handling, group-aware CV) and online validation (shadow mode, interleaving with rules).
E) Outline drift/adversary monitoring and automated retraining triggers (e.g., PSI thresholds, population/conditional shift tests).
F) Explain how to combine ML scores with deterministic rules via a policy engine to meet business constraints (block, step-up auth, allow). Show how to set per-segment thresholds to hit target FP/FN budgets.
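As a starting point for the per-segment thresholds in Task F, the sketch below picks, for each segment, the lowest score threshold whose false-positive rate on held-out data stays within that segment's budget. The segment names, budgets, and helper functions are hypothetical, and the sketch covers only the FP side. An FN budget could be handled analogously on the positives, subject to the label delay noted in the requirements.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def threshold_for_fpr(scores: List[float], labels: List[int],
                      max_fpr: float) -> float:
    """Lowest threshold t such that flagging `score > t` keeps FPR <= max_fpr."""
    negatives = sorted((s for s, y in zip(scores, labels) if y == 0),
                       reverse=True)
    if not negatives:
        raise ValueError("need at least one negative to estimate FPR")
    # Number of negatives we are allowed to flag (rounded down, so conservative).
    allowed = math.floor(max_fpr * len(negatives))
    if allowed >= len(negatives):
        return float("-inf")  # budget permits flagging everything
    # At most `allowed` negatives score strictly above this value.
    return negatives[allowed]


def per_segment_thresholds(rows: Iterable[Tuple[str, float, int]],
                           budgets: Dict[str, float]) -> Dict[str, float]:
    """rows are (segment, score, label) triples from a held-out, matured set."""
    grouped = defaultdict(lambda: ([], []))
    for segment, score, label in rows:
        grouped[segment][0].append(score)
        grouped[segment][1].append(label)
    return {seg: threshold_for_fpr(s, y, budgets[seg])
            for seg, (s, y) in grouped.items()}


# Hypothetical held-out scores: 10 negatives + 2 positives for new accounts,
# 5 negatives + 1 positive for established accounts.
rows = (
    [("new_account", s, 0) for s in
     (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95)]
    + [("new_account", 0.85, 1), ("new_account", 0.99, 1)]
    + [("established", s, 0) for s in (0.05, 0.1, 0.15, 0.2, 0.6)]
    + [("established", 0.7, 1)]
)
budgets = {"new_account": 0.2, "established": 0.0}
thresholds = per_segment_thresholds(rows, budgets)
```

In a policy engine, the same routine could be run twice per segment with different budgets to obtain a step-up threshold and a stricter block threshold, with deterministic rules evaluated before or alongside the score.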