Understand Simpson's Paradox with Simple Examples
Company: Google
Role: Data Scientist
Category: Statistics & Math
Difficulty: Medium
Interview Round: Onsite
##### Scenario
You are a data scientist advising a product team on statistical analysis and experimental design.
##### Question
1. Explain Simpson’s paradox and illustrate it with a concrete numeric example understandable to a non-technical PM.
2. In a forest with three bird species, propose and define metrics that quantify how segregated the species are from one another.
3. A simple linear regression is run to estimate the effect of YouTube ad impressions on product sales. What potential statistical problems do you see and how would you redesign the study?
4. Name ways to reduce a survey’s margin of error. If sample size and confidence level cannot change, what else could you do and why might bootstrap methods help?
##### Hints
Discuss confounding, omitted-variable bias, experimental control, effect size, confidence intervals, and resampling logic in plain language.
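
To make the confounding hint concrete, here is a minimal sketch of a Simpson's paradox example. All of it is illustrative: the two variants (`A`, `B`), the two segments (`desktop`, `mobile`), and every count are made-up numbers chosen so the arithmetic is easy to follow, not data from any real experiment.

```python
# Hypothetical (conversions, visitors) counts per variant and device segment.
# These numbers are invented purely to illustrate the paradox.
data = {
    "A": {"desktop": (90, 100), "mobile": (300, 1000)},
    "B": {"desktop": (800, 1000), "mobile": (20, 100)},
}


def rate(conversions, visitors):
    """Conversion rate for a single segment."""
    return conversions / visitors


def overall_rate(segments):
    """Pooled conversion rate: total conversions over total visitors."""
    total_conversions = sum(c for c, _ in segments.values())
    total_visitors = sum(v for _, v in segments.values())
    return total_conversions / total_visitors


# Per-segment rates for each variant.
segment_rates = {
    variant: {seg: rate(*counts) for seg, counts in segs.items()}
    for variant, segs in data.items()
}

# Pooled rates that ignore the segment.
overall_rates = {variant: overall_rate(segs) for variant, segs in data.items()}

for seg in ("desktop", "mobile"):
    print(f"{seg}: A={segment_rates['A'][seg]:.1%}  B={segment_rates['B'][seg]:.1%}")
print(f"overall: A={overall_rates['A']:.1%}  B={overall_rates['B']:.1%}")
```

With these numbers, variant A wins inside each segment (90% vs 80% on desktop, 30% vs 20% on mobile) yet loses when the segments are pooled (about 35% vs 75%). The reversal comes from the traffic mix: most of A's visitors are in the low-converting mobile segment, so device type acts as a confounder.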
Quick Answer: This question evaluates a data scientist's competency in statistical reasoning, causal inference, experimental design, quantitative metric definition, and resampling methods, touching on Simpson's paradox, species segregation metrics, confounding and omitted-variable bias, causal study design, and survey uncertainty.
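
The resampling logic behind the bootstrap can be sketched as follows. This is a minimal percentile-bootstrap example under stated assumptions: the helper `bootstrap_ci`, its parameters, and the survey responses (130 "yes" and 70 "no" answers) are hypothetical and chosen only for illustration. The sketch does not shrink the margin of error by itself. It estimates sampling variability directly from the observed data, which is useful when the statistic has no simple standard-error formula.

```python
import random
from statistics import mean


def bootstrap_ci(sample, stat, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(sample).

    Hypothetical helper for illustration: resample the data with
    replacement, recompute the statistic each time, and read the
    interval off the sorted resampled estimates.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    estimates = sorted(
        stat(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Made-up survey: 1 = "yes", 0 = "no" (130 yes out of 200 respondents).
responses = [1] * 130 + [0] * 70

point_estimate = mean(responses)
lower, upper = bootstrap_ci(responses, mean)
print(f"estimate={point_estimate:.3f}, 95% CI=({lower:.3f}, {upper:.3f})")
```

The same function works for a median, a ratio, or another statistic by passing a different `stat` callable, which is the main practical appeal when sample size and confidence level are fixed.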