PracHub

Derive correlation bounds and omitted-variable bias

Last updated: Mar 29, 2026

Quick Overview

This question evaluates understanding of multivariate correlation structure and linear regression properties, focusing on feasible ranges and constructions for equal pairwise correlations and the derivation of omitted-variable bias in ordinary least squares.


Company: Two Sigma

Role: Data Scientist

Category: Machine Learning

Difficulty: hard

Interview Round: Technical Screen



Jan 6, 2026, 12:00 AM

Core Statistics Prompt

Answer the following related statistics questions.

Part A — Pairwise correlation constraints

Let \(X, Y, Z\) be random variables with unit variance and equal pairwise correlation:

\[ \mathrm{Corr}(X,Y)=\mathrm{Corr}(Y,Z)=\mathrm{Corr}(X,Z)=p. \]

  1. What values of \(p\) are feasible?
  2. Give a method to construct \((X,Y,Z)\) that achieves any feasible \(p\).
  3. Generalize: for \(n\) variables with the same pairwise correlation \(p\), what is the feasible range of \(p\)? How would you construct them?
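A numerical sketch can help sanity-check an answer to Part A. It relies on the fact that a valid correlation matrix must be positive semidefinite, so a candidate \(p\) is feasible only if the matrix with ones on the diagonal and \(p\) elsewhere has no negative eigenvalue. The function names below are hypothetical, and the Gaussian draws with a symmetric matrix square root are just one possible construction, not the only one.

```python
import numpy as np


def equicorrelation_matrix(n, p):
    """Return the n x n matrix with ones on the diagonal and p elsewhere."""
    return (1.0 - p) * np.eye(n) + p * np.ones((n, n))


def is_feasible(n, p, tol=1e-12):
    """Check feasibility of p by testing positive semidefiniteness."""
    eigenvalues = np.linalg.eigvalsh(equicorrelation_matrix(n, p))
    return bool(eigenvalues.min() >= -tol)


def sample_equicorrelated(n, p, size, seed=0):
    """Draw `size` rows of n unit-variance variables with pairwise correlation p.

    Assumption: Gaussian variables are acceptable; the prompt does not
    require any particular distribution.
    """
    R = equicorrelation_matrix(n, p)
    eigenvalues, eigenvectors = np.linalg.eigh(R)
    if eigenvalues.min() < -1e-12:
        raise ValueError("p is outside the feasible range for this n")
    # Symmetric square root of R; clipping guards against tiny negative round-off.
    clipped = np.clip(eigenvalues, 0.0, None)
    root = eigenvectors @ np.diag(np.sqrt(clipped)) @ eigenvectors.T
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((size, n))  # independent standard normals
    return z @ root  # each row has covariance root.T @ root = R
```

Calling `is_feasible(3, p)` over a grid of \(p\) values should agree with the closed-form eigenvalues of this matrix, \(1-p\) (with multiplicity \(n-1\)) and \(1+(n-1)p\), which give the range \(-1/(n-1) \le p \le 1\). The empirical correlations of `sample_equicorrelated(3, p, size)` can then be compared with the target \(p\) using `np.corrcoef(samples, rowvar=False)`.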

Part B — Omitted variable bias

Consider the true linear regression model:

\[ \mathbf{y}=X_1\beta_1 + X_2\beta_2 + \varepsilon, \]

but you mistakenly fit the reduced model \(\mathbf{y}=X_1\tilde\beta_1+\text{error}\), omitting \(X_2\).

  1. What is the impact on the estimated coefficient \(\tilde\beta_1\)?
  2. Prove the result using matrix notation (OLS).
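A small simulation can illustrate what Part B is asking about before attempting the proof. The sketch below assumes a single regressor in each of \(X_1\) and \(X_2\), no intercept, mean-zero Gaussian data, and hypothetical values for the true coefficients and for the correlation `rho` between the two regressors; none of these choices come from the prompt. It compares the short-regression estimate with \(\beta_1 + (X_1^\top X_1)^{-1}X_1^\top X_2\,\beta_2\), the quantity that the standard OLS algebra predicts.

```python
import numpy as np


def ovb_demo(beta1=2.0, beta2=3.0, rho=0.5, n_obs=2000, n_trials=200, seed=0):
    """Average the short-regression estimate and its predicted value over trials.

    beta1, beta2, and rho are hypothetical illustration values.
    Returns (mean short estimate, mean of beta1 + delta_hat * beta2).
    """
    rng = np.random.default_rng(seed)
    short_estimates = []
    predicted_values = []
    for _ in range(n_trials):
        x1 = rng.standard_normal(n_obs)
        # x2 has unit variance and correlation rho with x1.
        x2 = rho * x1 + np.sqrt(1.0 - rho**2) * rng.standard_normal(n_obs)
        eps = rng.standard_normal(n_obs)
        y = beta1 * x1 + beta2 * x2 + eps

        X1 = x1[:, None]  # design matrix of the misspecified (short) model
        # Short regression of y on X1 alone.
        b_short = np.linalg.lstsq(X1, y, rcond=None)[0][0]
        # Auxiliary regression of x2 on X1: (X1'X1)^{-1} X1'x2.
        delta_hat = np.linalg.lstsq(X1, x2, rcond=None)[0][0]

        short_estimates.append(b_short)
        predicted_values.append(beta1 + delta_hat * beta2)
    return float(np.mean(short_estimates)), float(np.mean(predicted_values))
```

With these assumed values the two returned averages should nearly coincide and sit well away from `beta1`; setting `rho=0.0` or `beta2=0.0` should bring the short estimate back close to `beta1`, matching the usual statement that the bias vanishes when the omitted regressor is orthogonal to \(X_1\) or has no effect on \(\mathbf{y}\).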

