PracHub

Estimate Family Proportions and Explain Regression Anomalies

Last updated: Mar 29, 2026

Quick Overview

This question evaluates applied statistical inference and causal reasoning, covering estimation of population-type proportions from a sample with size-related selection bias, construction and interpretation of confidence intervals, the asymmetry of OLS slopes under reversed regressions, and diagnosis of cases where models predict well but coefficients lack significance. Commonly asked in Statistics & Math interviews for data scientist roles, it tests both conceptual understanding (association versus causal direction) and practical application (sampling design effects and model diagnostic interpretation).



Company: Upstart

Role: Data Scientist

Category: Statistics & Math

Difficulty: medium

Interview Round: Onsite

Scenario

On-site statistics round

Question

A population contains one-, two-, and three-child families. Estimate the proportion of each family type from a sample of 100 children, and construct a 95% confidence interval for each proportion.

Explain why the OLS coefficient from Y ~ X differs from the one from X ~ Y, and relate this to causal direction.

A regression model shows all coefficients statistically insignificant yet high predictive performance. Provide a statistical explanation and propose a fix.

Hints

Multinomial proportions, bootstrap CI; reverse causality; multicollinearity and LASSO/ridge.


Aug 4, 2025, 10:55 AM

On-site Statistics Round

Task Overview

You are given a population of families that have either 1, 2, or 3 children. You sample 100 children (i.e., the sampling unit is a child, not a family). For each sampled child, you can observe the size of their family.

Answer the following:

  1. Estimating family-type proportions
  • From the child sample, estimate the proportions of 1-child, 2-child, and 3-child families in the population of families (not in the population of children).
  • Construct 95% confidence intervals for those family-type proportions.
  2. OLS asymmetry and causality
  • Explain why the OLS slope from Y ~ X generally differs from the slope from X ~ Y.
  • Relate this to the distinction between association and causal direction.
  3. Prediction strong, coefficients insignificant
  • A regression shows all coefficients are statistically insignificant, yet the model predicts well. Provide a statistical explanation and propose fixes.
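
For part 1, one possible approach is sketched below. Because children are sampled, a family with k children is k times as likely to appear as a 1-child family, so the share of sampled children from k-child families is proportional to k times the family-level proportion. Dividing each observed count by k and renormalising undoes this size bias. The counts (20, 40, 40) are hypothetical illustration data, the function names are made up for this sketch, and the bootstrap assumes the 100 children are drawn independently from a large population (so sibling overlap is ignored).

```python
import numpy as np


def family_proportions(child_counts):
    """Size-bias-corrected estimate of family-type proportions.

    child_counts[k-1] is the number of sampled children whose family
    has k children. Each count is down-weighted by 1/k, then normalised.
    """
    counts = np.asarray(child_counts, dtype=float)
    sizes = np.arange(1, len(counts) + 1)
    weights = counts / sizes
    return weights / weights.sum()


def bootstrap_ci(child_counts, n_boot=5000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for each corrected proportion.

    Resamples the child-level multinomial counts (assumes independent
    draws of children), re-applies the correction to every resample,
    and takes the alpha/2 and 1 - alpha/2 quantiles.
    """
    rng = np.random.default_rng(seed)
    counts = np.asarray(child_counts)
    n = int(counts.sum())
    q_hat = counts / n                                   # child-level shares
    resampled = rng.multinomial(n, q_hat, size=n_boot)   # shape (n_boot, K)
    sizes = np.arange(1, len(counts) + 1)
    weights = resampled / sizes
    p_boot = weights / weights.sum(axis=1, keepdims=True)
    lower = np.quantile(p_boot, alpha / 2, axis=0)
    upper = np.quantile(p_boot, 1 - alpha / 2, axis=0)
    return lower, upper


# Hypothetical sample: 20, 40 and 40 children from 1-, 2- and 3-child families.
observed = [20, 40, 40]
p_hat = family_proportions(observed)
ci_lower, ci_upper = bootstrap_ci(observed)
for k, (p, lo, hi) in enumerate(zip(p_hat, ci_lower, ci_upper), start=1):
    print(f"{k}-child families: {p:.3f}  (95% CI {lo:.3f} to {hi:.3f})")
```

With these hypothetical counts the corrected point estimates are 0.375, 0.375, and 0.25, even though only 20% of the sampled children come from 1-child families. A delta-method interval would be an alternative to the bootstrap; the percentile bootstrap is used here only because the hints mention it.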

Hints: Multinomial proportions with size-bias correction and bootstrap CIs; regression asymmetry and reverse causality; multicollinearity and ridge/LASSO.
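
For part 2, the asymmetry can be checked numerically. The slope from Y ~ X is cov(X, Y) / var(X), while the slope from X ~ Y is cov(X, Y) / var(Y), so their product equals the squared correlation r^2 rather than 1. The reverse slope is therefore the reciprocal of the forward slope only when |r| = 1. The sketch below uses simulated data; the coefficients, noise level, and the helper name `ols_slope` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Simulated data (assumption): y depends on x with substantial noise.
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(scale=2.0, size=n)


def ols_slope(predictor, response):
    """Simple-regression OLS slope: cov(predictor, response) / var(predictor)."""
    cov = np.cov(predictor, response, ddof=1)
    return cov[0, 1] / cov[0, 0]


slope_y_on_x = ols_slope(x, y)   # Y ~ X
slope_x_on_y = ols_slope(y, x)   # X ~ Y
r = np.corrcoef(x, y)[0, 1]

print(f"slope Y~X            : {slope_y_on_x:.3f}")
print(f"slope X~Y            : {slope_x_on_y:.3f}")
print(f"1 / slope Y~X        : {1 / slope_y_on_x:.3f}")
print(f"product of slopes    : {slope_y_on_x * slope_x_on_y:.3f}")
print(f"squared correlation  : {r ** 2:.3f}")
```

Each regression minimises errors in a different direction (vertical versus horizontal), and both are valid summaries of the same association. Neither slope on its own says which variable causes the other; that requires assumptions or design features outside the regression.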

Solution
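
The page's own solution is not reproduced here. As an illustration for part 3 only, the sketch below simulates two nearly collinear predictors, which is one common reason a model can fit well while individual coefficients look insignificant: the data pin down the sum of the coefficients but not how to split it between them, so standard errors inflate. All data, the noise scales, and the ridge penalty `lam = 1.0` are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50

# Two predictors that are almost copies of the same latent signal z.
z = rng.normal(size=n)
x1 = z + 0.01 * rng.normal(size=n)
x2 = z + 0.01 * rng.normal(size=n)
y = x1 + x2 + rng.normal(size=n)

# OLS with an intercept, plus classical standard errors and t-statistics.
X = np.column_stack([np.ones(n), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
dof = n - X.shape[1]
sigma2 = resid @ resid / dof
se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
t_stats = beta / se

# Overall fit and a collinearity diagnostic.
r2 = 1 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
vif = 1 / (1 - np.corrcoef(x1, x2)[0, 1] ** 2)   # VIF for the two-predictor case

# Ridge regression in closed form on centred data (intercept not penalised).
lam = 1.0
Xc = X[:, 1:] - X[:, 1:].mean(axis=0)
yc = y - y.mean()
ridge = np.linalg.solve(Xc.T @ Xc + lam * np.eye(2), Xc.T @ yc)

print("OLS slopes      :", np.round(beta[1:], 2))
print("OLS std. errors :", np.round(se[1:], 2))
print("OLS t-statistics:", np.round(t_stats[1:], 2))
print("R squared       :", round(r2, 3))
print("VIF             :", round(vif, 1))
print("Ridge slopes    :", np.round(ridge, 2))
```

In a setup like this the individual t-statistics are typically small while R squared stays high and the VIF is very large. Possible fixes include dropping or combining redundant predictors, testing the predictors jointly with an F-test instead of one at a time, or using a penalised fit such as ridge or LASSO, which stabilises the estimates at the cost of some bias.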
