This question evaluates applied statistical inference and causal reasoning, covering estimation of population-type proportions from a sample with size-related selection bias, construction and interpretation of confidence intervals, the asymmetry of OLS slopes under reversed regressions, and diagnosis of cases where models predict well but coefficients lack significance. Commonly asked in Statistics & Math interviews for data scientist roles, it tests both conceptual understanding (association versus causal direction) and practical application (sampling design effects and model diagnostic interpretation).
You are given a population of families that have either 1, 2, or 3 children. You sample 100 children (i.e., the sampling unit is a child, not a family). For each sampled child, you can observe the size of their family.
Answer the following:
Hints: Multinomial proportions with size-bias correction and bootstrap CIs; regression asymmetry and reverse causality; multicollinearity and ridge/LASSO.
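The first hint (size-bias correction with bootstrap CIs) can be sketched as follows. Because the sampling unit is a child, a family with k children is k times as likely to be reached as a one-child family, so the share q_k of sampled children from size-k families is proportional to k * p_k, where p_k is the share of families of size k. Inverting gives p_k proportional to q_k / k. This is a minimal sketch, not a definitive solution: it assumes children are drawn uniformly at random and treated as independent draws, it uses a percentile bootstrap as one of several reasonable interval methods, and the function names and the sample counts (20/40/40) are hypothetical.

```python
import numpy as np

def family_proportions(child_sizes, sizes=(1, 2, 3)):
    """Estimate the share of families of each size from a child-level sample."""
    child_sizes = np.asarray(child_sizes)
    # q_k: observed share of sampled children who come from size-k families
    q = np.array([(child_sizes == k).mean() for k in sizes])
    # A size-k family is k times as likely to be sampled, so weight by 1/k
    w = q / np.array(sizes, dtype=float)
    # Renormalize so the family-level shares sum to 1
    return w / w.sum()

def bootstrap_ci(child_sizes, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for each family-size share."""
    rng = np.random.default_rng(seed)
    child_sizes = np.asarray(child_sizes)
    n = len(child_sizes)
    # Resample children with replacement and re-estimate each time
    draws = np.array([
        family_proportions(rng.choice(child_sizes, size=n, replace=True))
        for _ in range(n_boot)
    ])
    lo = np.quantile(draws, alpha / 2, axis=0)
    hi = np.quantile(draws, 1 - alpha / 2, axis=0)
    return lo, hi

# Hypothetical sample of 100 children: 20, 40, 40 from families of size 1, 2, 3
sample = np.repeat([1, 2, 3], [20, 40, 40])
p_hat = family_proportions(sample)
lo, hi = bootstrap_ci(sample)
print("estimated family shares:", p_hat)
print("95% bootstrap CI lower:", lo)
print("95% bootstrap CI upper:", hi)
```

With these illustrative counts, the naive child-level shares (0.20, 0.40, 0.40) overstate large families; the corrected family-level shares come out to 0.375, 0.375, and 0.25.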