Design and Analyze A/B Test for Cashback Program
A/B Test Design: Checkout Cashback Program (PayPal)
Scenario
PayPal plans to launch a checkout cashback program (e.g., "Get 1–5% back when you pay with PayPal"). The goal is to evaluate whether offering cashback at checkout improves key business outcomes while remaining cost-effective and safe.
Task
Design, run, and analyze an A/B test to evaluate the cashback program.
Please cover
- Hypothesis and success criteria.
- Experiment design:
  - Population and eligibility
  - Unit of randomization and variants
  - Exposure and assignment rules
  - Sample size and test duration
- Metrics:
  - Primary metric(s)
  - Guardrail/safety metrics
  - Secondary/diagnostic metrics
- Data needed and instrumentation.
- Analysis plan and statistical methods.
- Segmentation and heterogeneity of effects.
- Risks, biases, and mitigations (e.g., fraud, interference, novelty).
You may assume the feature is shown at checkout to eligible users and credits cashback after successful payment.
Constraints & Assumptions
- Preserve the scope, facts, inputs, and requested outputs from the prompt above.
- If the prompt leaves a detail unspecified, state a reasonable assumption before relying on it.
- Keep the answer interview-ready: concise enough to present, but concrete enough to implement or evaluate.
Clarifying Questions to Ask
- Clarify the business objective, unit of analysis, time window, exposure definition, and primary metric.
- State assumptions about instrumentation, randomization, sample size, and data quality.
- Separate descriptive analysis from causal claims.
What a Strong Answer Covers
- A metric framework with primary, guardrail, and diagnostic metrics.
- A credible analysis or experiment design with clear assumptions and bias checks.
- SQL/statistical logic for segmentation, variance, confidence, and data validation where relevant.
- An actionable recommendation that explains trade-offs and next steps.
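As one possible illustration of the "statistical logic for variance and confidence" mentioned above, the sketch below runs a two-proportion z-test on a binary primary metric (for example, checkout conversion) and reports a confidence interval for the absolute difference. The function name and the counts in the usage example are hypothetical. The test assumes user-level randomization with one observation per user; a revenue-per-user metric would need a different variance estimate.

```python
import math
from statistics import NormalDist


def two_proportion_ztest(conv_c, n_c, conv_t, n_t, alpha=0.05):
    """Two-sided z-test for treatment vs. control conversion rates."""
    p_c = conv_c / n_c
    p_t = conv_t / n_t
    diff = p_t - p_c

    # Pooled standard error under the null hypothesis of equal rates.
    pooled = (conv_c + conv_t) / (n_c + n_t)
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n_c + 1 / n_t))
    z = diff / se_pooled
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # Unpooled standard error for the confidence interval on the difference.
    se_unpooled = math.sqrt(p_c * (1 - p_c) / n_c + p_t * (1 - p_t) / n_t)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    ci = (diff - z_crit * se_unpooled, diff + z_crit * se_unpooled)

    return {"diff": diff, "z": z, "p_value": p_value, "ci": ci}


if __name__ == "__main__":
    # Hypothetical counts: 10,000 users per arm.
    result = two_proportion_ztest(conv_c=1000, n_c=10_000, conv_t=1100, n_t=10_000)
    print(result)
```

Reporting the interval alongside the p-value makes it easier to compare the plausible lift against the cost of the cashback itself when forming a recommendation.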
Follow-up Questions
- What sanity checks would you run before trusting the result?
- How would you handle novelty effects, seasonality, or selection bias?
- What decision would you make if metrics disagree?
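One common sanity check before trusting a result is a sample ratio mismatch (SRM) test, which compares the observed split of users between arms against the intended split. The sketch below uses a chi-square goodness-of-fit test with one degree of freedom. The function name, the 50/50 default split, and the 0.001 alert threshold are assumptions chosen for illustration, not requirements from the prompt.

```python
import math


def srm_check(n_control, n_treatment, expected_treatment_share=0.5, threshold=0.001):
    """Chi-square test (1 degree of freedom) for sample ratio mismatch."""
    total = n_control + n_treatment
    expected_t = total * expected_treatment_share
    expected_c = total - expected_t

    chi2 = ((n_control - expected_c) ** 2 / expected_c
            + (n_treatment - expected_t) ** 2 / expected_t)

    # For 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))

    return {"chi2": chi2, "p_value": p_value, "srm_detected": p_value < threshold}


if __name__ == "__main__":
    # Hypothetical assignment counts for a planned 50/50 split.
    print(srm_check(n_control=50_000, n_treatment=51_500))
```

If the check flags a mismatch, the usual next step is to investigate assignment and logging (for example, exposure events dropped in one arm) before interpreting any metric differences.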