Scenario
You are a data scientist advising a product team on statistical analysis and experimental design.
Tasks
- Simpson’s paradox
  - Explain Simpson’s paradox in plain language and give a small numeric example that a non-technical PM can follow.
- Species segregation in a forest
  - In a forest with three bird species, propose and define clear, quantitative metrics that capture how segregated the species are from one another. State the minimal assumptions (e.g., grid cells with presence/absence or point locations of birds) and how to interpret each metric.
- Ads and sales: regression vs. causality
  - A simple linear regression is run to estimate the effect of YouTube ad impressions on product sales. What statistical problems might arise? How would you redesign the study to estimate causal lift? Be explicit about confounding, omitted-variable bias, experimental control, effect size, and confidence intervals.
- Survey margin of error
  - Name ways to reduce a survey’s margin of error (MoE). If sample size and confidence level cannot change, what else could you do, and why might bootstrap methods help?
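
For the margin-of-error task, the resampling idea can be sketched as follows. This is a minimal illustration, not a prescribed answer: the survey data, the function name `bootstrap_moe`, and its parameters are all hypothetical. Note that the bootstrap does not shrink the MoE by itself; it estimates the sampling variability directly from the data, which is useful when no simple formula applies (for example, medians or ratios).

```python
import random


def bootstrap_moe(responses, n_resamples=2000, confidence=0.95, seed=0):
    """Percentile-bootstrap margin of error for a sample proportion.

    `responses` is a list of 0/1 survey answers (hypothetical data).
    Returns (estimate, half_width), where half_width is half the width
    of the percentile interval across resamples.
    """
    rng = random.Random(seed)
    n = len(responses)
    estimate = sum(responses) / n
    # Resample respondents with replacement and recompute the proportion.
    stats = sorted(
        sum(rng.choices(responses, k=n)) / n for _ in range(n_resamples)
    )
    alpha = 1.0 - confidence
    lo = stats[int((alpha / 2) * (n_resamples - 1))]
    hi = stats[int((1 - alpha / 2) * (n_resamples - 1))]
    return estimate, (hi - lo) / 2


# Hypothetical survey: 400 respondents, 220 answered "yes".
sample = [1] * 220 + [0] * 180
estimate, moe = bootstrap_moe(sample)
print(f"estimate={estimate:.3f}, bootstrap MoE ~ {moe:.3f}")
```

For a simple proportion like this one, the bootstrap half-width should land close to the textbook value of 1.96 * sqrt(p(1-p)/n), about 0.049 here; the method earns its keep on statistics where that formula is not available.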
Hints
- Touch on confounders, omitted-variable bias, experimental control, effect sizes, confidence intervals, and resampling (bootstrap) logic in plain language.
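
The confounder hint can be made concrete with the Simpson’s paradox task. The sketch below uses invented conversion counts for two hypothetical page variants, A and B, split by device; the numbers and names are assumptions chosen only so the arithmetic is easy to follow. Variant A wins within each device segment, yet B wins overall, because B received most of its traffic from the high-converting desktop segment.

```python
# Hypothetical data: (conversions, visitors) per variant and device segment.
data = {
    "A": {"desktop": (80, 100), "mobile": (90, 300)},
    "B": {"desktop": (210, 300), "mobile": (20, 100)},
}


def rate(conversions, visitors):
    """Conversion rate for one segment."""
    return conversions / visitors


def overall_rate(segments):
    """Pooled conversion rate across all segments of one variant."""
    conversions = sum(c for c, _ in segments.values())
    visitors = sum(v for _, v in segments.values())
    return conversions / visitors


for segment in ("desktop", "mobile"):
    a = rate(*data["A"][segment])
    b = rate(*data["B"][segment])
    print(f"{segment}: A={a:.1%} B={b:.1%}")
# desktop: A=80.0% B=70.0%
# mobile: A=30.0% B=20.0%

print(f"overall: A={overall_rate(data['A']):.1%} B={overall_rate(data['B']):.1%}")
# overall: A=42.5% B=57.5%
```

Device type acts as the confounder here: it affects both which variant a visitor saw and how likely that visitor was to convert, so the pooled comparison reverses the per-segment result.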