On-site Statistics Round
Task Overview
You are given a population of families that have either 1, 2, or 3 children. You sample 100 children (i.e., the sampling unit is a child, not a family). For each sampled child, you can observe the size of their family.
Answer the following:
- Estimating family-type proportions
  - From the child sample, estimate the proportions of 1-child, 2-child, and 3-child families in the population of families (not in the population of children).
  - Construct 95% confidence intervals for those family-type proportions.
- OLS asymmetry and causality
  - Explain why the OLS slope from Y ~ X generally differs from the slope from X ~ Y.
  - Relate this to the distinction between association and causal direction.
- Prediction strong, coefficients insignificant
  - A regression shows all coefficients are statistically insignificant, yet the model predicts well. Provide a statistical explanation and propose fixes.
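
One possible way to make the last task concrete is a variance inflation factor (VIF) check on two nearly identical predictors. The sketch below is illustrative only: the simulated data, the noise level 0.05, and the helper name `pearson_r` are assumptions, not part of the task. With exactly two predictors, each one has VIF = 1 / (1 - r^2), where r is their correlation, and the variance of each slope estimate is inflated by that factor relative to uncorrelated predictors.

```python
import random


def pearson_r(a, b):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    s_aa = sum((x - mean_a) ** 2 for x in a)
    s_bb = sum((y - mean_b) ** 2 for y in b)
    return s_ab / (s_aa * s_bb) ** 0.5


# Hypothetical data: x2 is x1 plus a small amount of noise (assumed sd 0.05).
rng = random.Random(0)
n = 200
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [v + rng.gauss(0, 0.05) for v in x1]

r = pearson_r(x1, x2)
vif = 1 / (1 - r ** 2)      # same value for x1 and x2 in the two-predictor case
se_inflation = vif ** 0.5   # factor by which each slope's standard error grows

print(f"r = {r:.4f}, VIF = {vif:.1f}, SE inflation = {se_inflation:.1f}x")
```

A large VIF means the individual slopes are poorly determined even though the fitted values, which depend mainly on the well-determined combination of the predictors, can still be accurate. That is the situation in which dropping or combining predictors, or a penalized fit, is usually considered.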
Hints: Multinomial proportions with size-bias correction and bootstrap CIs; regression asymmetry and reverse causality; multicollinearity and ridge/LASSO.
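
The first hint can be sketched as follows. This is a minimal illustration, assuming each child in the population is equally likely to be sampled and that draws are independent, so a k-child family is over-represented by a factor of k. Dividing the observed count for size k by k and renormalizing undoes that bias. The function names, the made-up counts (30, 40, 30), and the percentile bootstrap with 2000 resamples are assumptions chosen for illustration.

```python
import random
from collections import Counter


def family_proportions(sizes, support=(1, 2, 3)):
    """Size-bias-corrected family shares from child-level family sizes."""
    counts = Counter(sizes)
    # A k-child family is k times as likely to be hit, so weight counts by 1/k.
    weights = [counts[k] / k for k in support]
    total = sum(weights)
    return [w / total for w in weights]


def bootstrap_ci(sizes, n_boot=2000, level=0.95, seed=0, support=(1, 2, 3)):
    """Percentile bootstrap intervals, resampling children with replacement."""
    rng = random.Random(seed)
    n = len(sizes)
    draws = [family_proportions(rng.choices(sizes, k=n), support)
             for _ in range(n_boot)]
    alpha = 1 - level
    lo_idx = int((alpha / 2) * n_boot)
    hi_idx = int((1 - alpha / 2) * n_boot) - 1
    intervals = []
    for j in range(len(support)):
        column = sorted(d[j] for d in draws)
        intervals.append((column[lo_idx], column[hi_idx]))
    return intervals


# Made-up sample of 100 children: 30, 40, 30 from 1-, 2-, 3-child families.
sizes = [1] * 30 + [2] * 40 + [3] * 30
estimate = family_proportions(sizes)
intervals = bootstrap_ci(sizes)

for k, p, (lo, hi) in zip((1, 2, 3), estimate, intervals):
    print(f"{k}-child families: {p:.3f}  95% CI ({lo:.3f}, {hi:.3f})")
```

For these made-up counts the weights are 30, 20, and 10, so the point estimates are 1/2, 1/3, and 1/6, even though 1-child and 3-child families contribute equally many sampled children. The intervals depend on the resampling seed and are only one of several reasonable constructions.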