Design a real-time machine learning system that scores Venmo payment authorization events for account takeover (ATO) risk. The system must operate under strict latency and data constraints while handling delayed and noisy labels.
A) Precisely define the positive label (ATO) and the negative set. Discuss positive–unlabeled (PU) learning and how to construct reliable training data with delayed/noisy labels.
B) Propose features across device, IP, behavior, network/graph, and account age. For at least three features, specify the leakage risks and how you would make them time-travel-proof (computed only from data available at scoring time).
C) Select and justify a model family (e.g., gradient boosting with monotonic constraints). Describe probability calibration (Platt vs. isotonic) and how to maintain calibration across account-age cohorts.
D) Describe an offline evaluation protocol (time-based split, label-latency handling, group-aware CV) and online validation (shadow mode, interleaving with rules).
E) Outline drift/adversary monitoring and automated retraining triggers (e.g., PSI thresholds, population/conditional shift tests).
F) Explain how to combine ML scores with deterministic rules via a policy engine to meet business constraints (block, step-up auth, allow). Show how to set per-segment thresholds to hit target FP/FN budgets.
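
For part F, one possible starting point (a minimal sketch, not a reference solution) is to pick each segment's threshold from the empirical score distribution of that segment's negatives, so that the share of legitimate payments flagged stays within the segment's FP budget. The function names, the segment labels, and the two-threshold `decide` policy below are all hypothetical, and the sketch assumes labels have already matured and that 0 marks a confirmed non-ATO event.

```python
import numpy as np


def threshold_for_fpr(neg_scores, max_fpr):
    """Smallest threshold t such that the fraction of negatives with
    score >= t does not exceed max_fpr (ties are handled conservatively)."""
    s = np.sort(np.asarray(neg_scores, dtype=float))
    n = len(s)
    if n == 0:
        raise ValueError("segment has no negative examples")
    cands = np.unique(s)  # candidate thresholds, ascending
    # Number of negatives that would be flagged at each candidate threshold.
    flagged = n - np.searchsorted(s, cands, side="left")
    ok = flagged / n <= max_fpr
    if ok.any():
        return cands[ok][0]  # flagged is non-increasing, so first hit is smallest
    # No candidate satisfies the budget: flag nothing among the negatives.
    return np.nextafter(s[-1], np.inf)


def per_segment_thresholds(scores, labels, segments, fpr_budget):
    """Map each segment name to a threshold meeting that segment's FP budget.

    Assumption: labels == 0 marks matured, confirmed-legitimate events.
    fpr_budget is a hypothetical dict, e.g. {"new": 0.01, "old": 0.002}.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    segments = np.asarray(segments)
    out = {}
    for seg in np.unique(segments):
        mask = (segments == seg) & (labels == 0)
        out[str(seg)] = threshold_for_fpr(scores[mask], fpr_budget[str(seg)])
    return out


def decide(score, step_up_t, block_t):
    """Hypothetical three-way policy: block above block_t, step-up
    authentication above step_up_t, otherwise allow."""
    if score >= block_t:
        return "block"
    if score >= step_up_t:
        return "step_up"
    return "allow"
```

Calling `per_segment_thresholds` twice, once with a tight budget for blocking and once with a looser budget for step-up authentication, yields the two thresholds that `decide` expects for each segment. The FN side of the budget can then be checked by measuring recall on matured positives at those thresholds; deterministic rules would typically override or pre-empt this score-based decision inside the policy engine.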