Experiment Design Case: Real-time ATO Rule for PayPal/Venmo
Context: You are designing and analyzing an online experiment to estimate the net business impact of a new real-time ATO rule that blocks high-risk transfers. The rule reduces successful ATO fraud at the potential cost of blocking some legitimate transfers.
Inputs and constraints:
- Randomization unit must avoid cross-over/interference. Choose user_id vs transaction-level; justify with interference risks (recipients can be common).
- Baselines:
  - Weekly fraud base rate on transfers p0 = 0.0012
  - Average fraudulent loss per incident L_f = $200
  - Expected relative fraud reduction under treatment = 20%
  - Legitimate block cost C_fp = $1.50 per blocked legitimate transfer
  - Expected block rate under treatment = 1.0% of legitimate transfers
- Traffic and clustering:
  - 10M transfers/week
  - Average 5 transfers per active user/week
  - ICC (clustered at user) = 0.02
  - Average cluster size m = 5
- Guardrails: authentication success rate, dispute rate within 7 days, P95 time-to-pay
- Stats: two-sided alpha = 0.05, power = 0.80; allow daily sequential monitoring with alpha spending; require pre-registration and an A/A test
Tasks:
A) Choose the randomization unit and explain spillover/contamination mitigations (e.g., recipient or graph clusters).
B) Compute the minimum per-arm sample size (in transfers) to detect a 20% relative drop in fraud rate using a two-sample proportion Z-test; then inflate by the design effect DE = 1 + (m−1)·ICC. Show formulas and numeric results.
C) Convert the detectable effect into expected weekly net dollars using:
Net = (Fraud prevented × L_f) − (Incremental legitimate blocks × C_fp).
State additional assumptions and bound the estimate.
D) Define primary, secondary, and guardrail metrics with precise denominators; specify key slicing (e.g., by device novelty, account age).
E) Propose a ramp plan and stopping rules under a group sequential design (e.g., Pocock or O'Brien–Fleming), and describe how you’ll monitor production for post-launch drift.
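
The arithmetic behind Tasks B and C can be sketched as follows. This is a minimal illustration, not a reference solution. It assumes the unpooled-variance form of the two-sample proportion formula, a fixed-horizon test (no inflation for sequential looks), a control-arm block rate of zero, and a 100% rollout for the weekly dollar figure. The names per_arm_n and weekly_net are hypothetical helpers introduced here.

```python
from statistics import NormalDist

# Inputs taken from the case statement.
p0 = 0.0012                    # weekly fraud base rate on transfers
rel_reduction = 0.20           # expected relative fraud reduction
alpha, power = 0.05, 0.80      # two-sided alpha, target power
icc, m = 0.02, 5               # user-level ICC and average cluster size
loss_per_fraud = 200.0         # L_f, dollars per fraud incident
cost_per_block = 1.50          # C_fp, dollars per blocked legitimate transfer
legit_block_rate = 0.01        # treatment block rate on legitimate transfers
weekly_transfers = 10_000_000


def per_arm_n(p_control, p_treat, alpha, power):
    """Per-arm n for a two-sided two-sample proportion Z-test.

    Assumption: unpooled variance and equal allocation between arms.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    var_sum = p_control * (1 - p_control) + p_treat * (1 - p_treat)
    return (z_alpha + z_beta) ** 2 * var_sum / (p_control - p_treat) ** 2


def weekly_net(transfers, p_fraud, reduction, block_rate):
    """Net = (fraud prevented x L_f) - (incremental legitimate blocks x C_fp).

    Assumptions: every transfer is exposed to the rule, the control block
    rate is zero, and each prevented incident saves the average loss.
    """
    fraud_prevented = transfers * p_fraud * reduction
    legit_blocks = transfers * (1 - p_fraud) * block_rate
    return fraud_prevented * loss_per_fraud - legit_blocks * cost_per_block


p1 = p0 * (1 - rel_reduction)
n_iid = per_arm_n(p0, p1, alpha, power)
design_effect = 1 + (m - 1) * icc          # DE = 1 + (m - 1) * ICC
n_clustered = n_iid * design_effect
net = weekly_net(weekly_transfers, p0, rel_reduction, legit_block_rate)

print(f"per-arm n (independent transfers): {n_iid:,.0f}")
print(f"design effect: {design_effect:.2f}")
print(f"per-arm n (after clustering):      {n_clustered:,.0f}")
print(f"expected weekly net at full rollout: ${net:,.0f}")
```

Under these assumptions the calculation gives roughly 294,000 transfers per arm before the design effect and about 317,500 after it (DE = 1.08), with an expected weekly net of about $330,000. Re-running weekly_net with a lower reduction or a higher block rate gives a quick bound: with the other inputs held fixed, the net turns negative once the relative fraud reduction falls below roughly 6%.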