Design and analyze a card signup A/B test
Company: Capital One
Role: Data Scientist
Category: Analytics & Experimentation
Difficulty: Hard
Interview Round: Technical Screen
A bank is launching a co-branded gym credit card. You can show a 3-month free gym membership offer on the application landing page (Variant B) vs. no offer (Control A). Traffic ≈ 200,000 sessions/day; baseline apply→approval rate = 8%; average initial credit line = $1,000; 90-day charge-off rate = 1.2% with 60% loss severity; average 90-day revenue per approved account (interest + interchange) = $120; acquisition bonus and onboarding cost per approved account = $40. Risk requires that the predicted default probability (PD) of the approved pool not increase by more than 10% relative to control. You will run a 14-day 50/50 test. Design and analyze this experiment:

1) Define a single primary metric as risk-adjusted profit per visitor (RAPV) and write its exact formula from the inputs above; specify at least two guardrail metrics (with thresholds) for risk/fraud/compliance.
2) Convert the 5% relative lift hypothesis on approval rate into an expected lift on RAPV; state the assumptions needed to avoid Simpson's paradox across traffic sources and day-of-week.
3) Compute or set up the required sample size per variant for 80% power at α = 0.05 to detect the expected RAPV lift; justify your variance estimation (delta method vs. bootstrap) and whether you will use CUPED or pre-period covariates.
4) Specify the randomization unit and identity resolution needed to prevent contamination (cookies, logged-in IDs, device graph), and how you will treat repeat applicants, bots, and duplicate identities.
5) Detail your sequential testing plan (e.g., group-sequential or alpha spending) to allow interim safety stops without inflating Type I error; define exact stop/go/ramp criteria.
6) Show how you will monitor risk-mix shift (e.g., PD by score band) and enforce the 10% PD guardrail while avoiding conditioning on post-treatment variables; propose a stratified analysis and a heterogeneity readout by acquisition channel and geography.
7) Outline data-quality checks (event schema, missingness, timeouts), instrumentation events, and backfill/late-arrival handling.
8) If Variant B increases approvals by 6% but raises PD by 9% and reduces the average credit line by 5%, decide whether to ship, using a 12-month NPV sensitivity analysis (state your discount rate and churn assumptions).
9) List two follow-on experiments to isolate the mechanism (e.g., offer placement vs. wording), and one off-policy evaluation you would run using historical scores.
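Parts 1–3 can be sanity-checked numerically from the stated inputs. A minimal sketch, assuming the treatment changes only the approval rate (per-account revenue, cost, and expected loss held fixed); variable names are illustrative:

```python
from statistics import NormalDist

# Inputs from the prompt
p_ctrl = 0.08          # baseline apply->approval rate
credit_line = 1_000.0  # average initial credit line ($)
chargeoff_90d = 0.012  # 90-day charge-off rate
severity = 0.60        # loss severity
rev_90d = 120.0        # 90-day revenue per approved account ($)
cac = 40.0             # acquisition bonus + onboarding per approved ($)

# Part 1: risk-adjusted profit per visitor
# RAPV = P(approve) * (revenue - acquisition cost - expected credit loss)
expected_loss = chargeoff_90d * severity * credit_line       # $7.20
value_per_approved = rev_90d - cac - expected_loss           # $72.80
rapv_ctrl = p_ctrl * value_per_approved                      # $5.824 / visitor

# Part 2: 5% relative lift on approval -> RAPV lift, under the
# assumption that per-account economics are unchanged by treatment
p_trt = p_ctrl * 1.05
rapv_trt = p_trt * value_per_approved
print(f"RAPV lift: ${rapv_trt - rapv_ctrl:.4f} per visitor")  # ~$0.29

# Part 3: sample size per arm for the underlying approval-rate lift
# (two-sided two-proportion z-test, alpha = 0.05, power = 0.80)
z = NormalDist()
z_alpha, z_beta = z.inv_cdf(0.975), z.inv_cdf(0.80)
var_sum = p_ctrl * (1 - p_ctrl) + p_trt * (1 - p_trt)
n_per_arm = (z_alpha + z_beta) ** 2 * var_sum / (p_trt - p_ctrl) ** 2
print(f"n per arm ~= {n_per_arm:,.0f}")                       # ~74k sessions
```

At roughly 100k sessions per arm per day, the power requirement is met in under a day, so the 14-day window is really about covering full day-of-week cycles and letting early risk and fraud signals mature, not about sample size.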
Quick Answer: This question evaluates a data scientist's competency in A/B test design, causal inference, statistical power and sample-size calculation, risk-adjusted profit metrics, sequential testing, identity resolution, and monitoring for credit and fraud impacts.
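For part 8, a hedged sketch of the 12-month NPV sensitivity. The 12%/yr discount rate and 3% monthly churn are illustrative assumptions (the prompt asks you to state your own), as is extrapolating the 90-day economics to a flat monthly run rate:

```python
# Inputs from the prompt's 90-day economics; the discount rate and churn
# below are hypothetical assumptions for the sensitivity, not given data.
monthly_rev = 120.0 / 3                     # $40/mo revenue per active account
monthly_loss = 0.012 * 0.60 * 1_000.0 / 3   # $2.40/mo expected credit loss
cac = 40.0                                  # upfront bonus + onboarding
r_m = (1 + 0.12) ** (1 / 12) - 1            # 12%/yr discount rate, monthly
churn = 0.03                                # assumed monthly account churn

def npv_per_approved(rev, loss, months=12):
    """Discounted 12-month margin per approved account, net of upfront CAC."""
    npv, surviving = -cac, 1.0
    for t in range(1, months + 1):
        npv += surviving * (rev - loss) / (1 + r_m) ** t
        surviving *= 1 - churn
    return npv

# Control vs. Variant B: +6% approvals, +9% PD, -5% average credit line.
# Expected loss scales with both PD and line; revenue is held fixed here.
npv_ctrl = 0.08 * npv_per_approved(monthly_rev, monthly_loss)
npv_b = 0.08 * 1.06 * npv_per_approved(monthly_rev, monthly_loss * 1.09 * 0.95)
print(f"NPV/visitor: control ${npv_ctrl:.2f}, variant B ${npv_b:.2f}")
```

Under these assumptions Variant B clears both the guardrail (PD +9% is inside the 10% limit) and the NPV bar, since the 6% approval lift outweighs the higher loss rate. The call is sensitive, though: if interest and interchange revenue scale with the 5% smaller line (revenue × 0.95), the comparison becomes roughly a wash, so report that sensitivity alongside the stratified PD-by-score-band readout before deciding to ship.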