Interpret p-values and common pitfalls
Company: PayPal
Role: Data Scientist
Category: Statistics & Math
Difficulty: hard
Interview Round: Technical Screen
In a Fraud Data Science interview, you are asked “some p-value questions.”
Answer the following in a fraud/experimentation context:
1) Define a p-value precisely. What does it mean, and what does it NOT mean?
2) If you run an A/B test on a new friction step (e.g., extra OTP) and get p=0.03, what conclusions can you draw? What additional information do you need (effect size, power, business impact)?
3) Describe at least 4 common pitfalls when using p-values in practice (e.g., multiple testing, p-hacking, peeking, nonstationarity, selection bias).
4) Explain how you would adjust your analysis if:
- You are testing many regions/segments.
- Outcomes are rare and delayed (e.g., chargebacks arriving weeks later).
- Randomization is imperfect or there is interference/spillover.
Provide at least one concrete method for each scenario (e.g., Bonferroni/FDR, sequential testing, CUPED, Bayesian, cluster-robust SEs).
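As one illustration of the Bonferroni/FDR option for the many-regions scenario, the sketch below applies both corrections to a set of per-segment p-values. The segment names and p-values are hypothetical and chosen only to show how the two procedures can differ; the function names are illustrative, not from any library.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject H0 for each test whose p-value is at most alpha / m (controls FWER)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (controls FDR for independent tests)."""
    m = len(p_values)
    # Indices of the tests, sorted by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis whose rank is at most max_k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


# Hypothetical per-region p-values from the same friction-step experiment.
segment_p = {"US": 0.001, "UK": 0.012, "DE": 0.03, "BR": 0.04, "IN": 0.20, "JP": 0.65}
names = list(segment_p)
pvals = list(segment_p.values())

bonf = bonferroni(pvals)
bh = benjamini_hochberg(pvals)

print([n for n, r in zip(names, bonf) if r])  # ['US']
print([n for n, r in zip(names, bh) if r])    # ['US', 'UK']
```

With these made-up numbers, four segments are below 0.05 unadjusted, Bonferroni keeps one, and Benjamini-Hochberg keeps two. Which correction fits depends on whether a single false positive is costly (family-wise error) or a small share of false discoveries is tolerable (FDR).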
Quick Answer: This question evaluates how well you interpret p-values and reason about hypothesis testing, experimental design, and statistical pitfalls in a fraud/experimentation context, including effect size, statistical power, multiple comparisons, delayed or rare outcomes, and imperfect randomization.
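To make the link between a p-value and effect size concrete, here is a minimal sketch of a two-sided, pooled two-proportion z-test that also reports the absolute difference and an approximate 95% Wald confidence interval. The counts are hypothetical (fraud events per 100,000 transactions in control and in an OTP treatment arm), and the function name is illustrative; it uses only the standard library and a normal approximation, which may be inadequate for very rare outcomes.

```python
import math


def two_proportion_ztest(x_a, n_a, x_b, n_b):
    """Two-sided pooled z-test for the difference in proportions (B minus A)."""
    p_a, p_b = x_a / n_a, x_b / n_b
    diff = p_b - p_a

    # Pooled standard error under H0: both arms share one underlying rate.
    pooled = (x_a + x_b) / (n_a + n_b)
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = diff / se_pooled

    # Two-sided tail probability of a standard normal: P(|Z| >= |z|).
    p_value = math.erfc(abs(z) / math.sqrt(2))

    # Unpooled standard error for an approximate 95% Wald interval on the difference.
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    ci95 = (diff - 1.96 * se, diff + 1.96 * se)

    return {"diff": diff, "z": z, "p_value": p_value, "ci95": ci95}


# Hypothetical counts: 520 fraud events in control, 450 with the extra OTP step.
result = two_proportion_ztest(x_a=520, n_a=100_000, x_b=450, n_b=100_000)
print(f"diff = {result['diff']:.5f}")
print(f"z = {result['z']:.3f}, p = {result['p_value']:.4f}")
print(f"95% CI = ({result['ci95'][0]:.5f}, {result['ci95'][1]:.5f})")
```

The p-value only says how surprising the observed difference would be if the OTP step had no effect. The difference and its interval are what feed a business decision, since a statistically significant drop in fraud still has to be weighed against any conversion loss from the added friction.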