You are given a transaction-level ACH dataset and asked to identify and control fraud; you will present a plan. Today is 2025-09-01.
Deliverables
- Offline analysis plan and backtest for data from 2025-06-01 to 2025-08-31.
- Online experiment plan for a policy/heuristic launch.
Requirements
- Labeling: Define fraud labels using ach_returns with a 5-business-day label window. Prevent look-ahead bias; handle late returns and partial reversals.
- Features/heuristics: Propose 3–5 interpretable rules (e.g., 24h ACH velocity, shared-device across users, amount thresholds). Specify exact definitions and thresholds.
- Metrics: Primary = loss per 1,000 credits; Secondary = fraud prevalence, precision/recall, customer contact rate; Guardrails = ACH acceptance rate, support ticket rate, chargeback/return rate on non-ACH methods.
- Offline backtest: Describe sampling, cross-validation or time-split, leakage checks, and how you’ll simulate holds/blocks without affecting behavior. Show how you’ll estimate counterfactuals and uncertainty.
- Heterogeneity: Identify 3 cuts (e.g., tenure, country, device clusters) and how you’ll control false discovery across them.
- Experiment design: Unit of randomization (e.g., user-level sticky), power analysis inputs, ramp schedule, interference/spillover handling, SRM checks, and pre-specified stop/rollback criteria.
- Monitoring: Daily dashboards, anomaly detection, and how you’ll separate seasonal effects (e.g., end-of-month payroll) from treatment effects.
- Presentation: Outline 3 slides you would present (Problem & Baseline, Proposed Controls & Risk, Expected Value & Ramp Plan) and the exact readouts you expect to show.
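
To illustrate the Labeling requirement, here is a minimal Python sketch of a 5-business-day label window. It is one possible reading, not a prescribed method. The function names, the `(return_date, returned_amount)` tuple layout, and the `as_of` argument are all hypothetical. It assumes business days are Monday to Friday with bank holidays ignored, that the returns passed in are already filtered to fraud-relevant return codes, and that a partial reversal counts as loss only up to the credited amount.

```python
from datetime import date, timedelta


def add_business_days(start: date, n: int) -> date:
    """Return the date n business days (Mon-Fri) after start.

    Assumption: bank holidays are ignored; a real plan would use a holiday calendar.
    """
    current = start
    remaining = n
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday-Friday
            remaining -= 1
    return current


def label_credit(credit_date, amount, returns, as_of, window_days=5):
    """Label one ACH credit using only returns inside its label window.

    returns: list of (return_date, returned_amount) tuples (hypothetical layout).
    Returns (label, loss). Both are None when the window has not closed by
    as_of, so immature credits are excluded rather than treated as clean.
    """
    window_end = add_business_days(credit_date, window_days)
    if as_of < window_end:
        return None, None  # window still open: labeling now would be biased
    in_window = [amt for (d, amt) in returns if credit_date <= d <= window_end]
    loss = min(sum(in_window), amount)  # partial reversals cap at the credit amount
    return (1 if loss > 0 else 0), loss
```

Under these assumptions, a credit made on 2025-08-29 has a window that closes on 2025-09-05, so it is still unlabeled as of 2025-09-01. Returns that arrive after the window (late returns) are ignored by the label and would need to be tracked separately.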
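
For the primary metric and its uncertainty, the sketch below computes loss per 1,000 credits with a percentile bootstrap interval. It assumes "per 1,000 credits" means per 1,000 credit transactions, not per $1,000 of volume, and that `losses` holds one loss amount per credit (0 for clean credits). The function names are hypothetical. Resampling individual credits also treats them as independent; resampling by user would be more defensible when users have several credits.

```python
import random


def loss_per_1000(losses):
    """Total loss divided by number of credits, scaled to 1,000 credits."""
    return 1000.0 * sum(losses) / len(losses)


def bootstrap_ci(losses, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for loss per 1,000 credits.

    Assumption: credits are resampled independently (no user-level clustering).
    """
    rng = random.Random(seed)
    n = len(losses)
    stats = sorted(
        loss_per_1000([losses[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi
```

The same two functions can be applied to the credits that a simulated rule would have held or passed, which gives comparable intervals for baseline and counterfactual loss.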
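
For the Heterogeneity requirement, one common way to control false discovery across cuts is the Benjamini-Hochberg procedure. The brief does not mandate it; it is offered here as one option. The sketch assumes one p-value per segment-level comparison, and the function name and the default `q` are hypothetical choices.

```python
def benjamini_hochberg(p_values, q=0.10):
    """Return a list of booleans marking which hypotheses are rejected at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    # Find the largest rank k whose sorted p-value is at or below q * k / m.
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= q * rank / m:
            k_max = rank
    rejected = [False] * m
    # Reject every hypothesis ranked at or below k_max.
    for rank, i in enumerate(order, start=1):
        if rank <= k_max:
            rejected[i] = True
    return rejected
```

With three cuts that each have several levels, all segment-level tests would go into a single call so that the FDR level applies across the whole family.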
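
For the SRM check under Experiment design, a minimal sketch is a chi-square goodness-of-fit test on the counts of users per arm. It assumes two arms and a known intended treatment share; the function name is hypothetical. With one degree of freedom the chi-square tail probability equals `erfc(sqrt(x / 2))`, so no external library is needed.

```python
import math


def srm_p_value(n_control, n_treatment, expected_treatment_share=0.5):
    """Chi-square (df=1) p-value for sample ratio mismatch between two arms."""
    total = n_control + n_treatment
    exp_t = total * expected_treatment_share
    exp_c = total - exp_t
    chi2 = (n_treatment - exp_t) ** 2 / exp_t + (n_control - exp_c) ** 2 / exp_c
    # Survival function of chi-square with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))
```

The alert threshold for the p-value is a design choice that should be fixed before launch, alongside the stop/rollback criteria.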