PracHub

Relate coefficients under linear feature transformation

Last updated: Mar 29, 2026

Quick Overview

This question evaluates understanding of linear regression parameter relationships under linear feature transformations, model equivalence and identifiability, and detection and consequences of multicollinearity, testing competencies in linear algebra, statistical inference, and regression modeling.


Company: Databricks

Role: Data Scientist

Category: Statistics & Math

Difficulty: easy

Interview Round: Technical Screen



Dec 11, 2025, 12:00 AM

Answer both parts.

Part 1: Coefficients under feature transformation

You have original predictors \(x_1, x_2\) and define transformed predictors:

  • \(z_1 = x_1 + x_2\)
  • \(z_2 = x_1 - x_2\)

Consider two linear regression specifications (with the same response \(y\)):

  • Model A: \(y = \alpha_0 + \alpha_1 x_1 + \alpha_2 x_2 + \varepsilon\)
  • Model B: \(y = \beta_0 + \beta_1 z_1 + \beta_2 z_2 + \varepsilon\)
  1. What is the relationship between \((\alpha_1, \alpha_2)\) and \((\beta_1, \beta_2)\)?
  2. Are Model A and Model B equivalent (i.e., do they represent the same set of functions of \(x_1, x_2\))? State any assumptions.
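
One way to sanity-check Part 1 is to fit both specifications on the same data. Substituting \(z_1 = x_1 + x_2\) and \(z_2 = x_1 - x_2\) into Model B gives \(y = \beta_0 + (\beta_1 + \beta_2) x_1 + (\beta_1 - \beta_2) x_2 + \varepsilon\), which suggests \(\alpha_0 = \beta_0\), \(\alpha_1 = \beta_1 + \beta_2\), \(\alpha_2 = \beta_1 - \beta_2\). The sketch below is a minimal numerical check, not a reference solution. It assumes NumPy, unpenalized ordinary least squares with an intercept in both models, and synthetic data with made-up true coefficients; `fit_ols` is a hypothetical helper name.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares fit with an intercept; returns [intercept, slope_1, slope_2, ...]."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


# Synthetic data (assumption: the true coefficients below are arbitrary).
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(scale=0.3, size=n)

# Model A uses (x1, x2); Model B uses (z1, z2) = (x1 + x2, x1 - x2).
X_a = np.column_stack([x1, x2])
X_b = np.column_stack([x1 + x2, x1 - x2])
alpha = fit_ols(X_a, y)
beta = fit_ols(X_b, y)

# Map Model B's coefficients back to Model A's parameterization.
alpha_from_beta = np.array([beta[0], beta[1] + beta[2], beta[1] - beta[2]])

# Fitted values from each model, for comparing the two fits directly.
fitted_a = np.column_stack([np.ones(n), X_a]) @ alpha
fitted_b = np.column_stack([np.ones(n), X_b]) @ beta

print("alpha (fitted directly):", alpha)
print("alpha (rebuilt from beta):", alpha_from_beta)
print("max fitted-value gap:", np.max(np.abs(fitted_a - fitted_b)))
```

The map from \((x_1, x_2)\) to \((z_1, z_2)\) has determinant \(-2\), so it is invertible and both designs span the same column space. Under the stated assumptions the two fits should therefore agree on fitted values and residuals. That equivalence generally does not carry over to penalized fits such as ridge or lasso, because the penalty depends on the parameterization.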

Part 2: Multicollinearity in a wage/gender regression

You want to estimate/assess the relationship between gender and income using a regression model with additional covariates such as:

  • age
  • education level
  • years since receiving degree
  1. Explain whether multicollinearity could be present and why.
  2. Describe how you would diagnose it.
  3. If it exists, explain multiple ways to address it, and discuss trade-offs (interpretability vs variance vs bias).
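
For the diagnosis step, one common tool is the variance inflation factor, \(\mathrm{VIF}_j = 1 / (1 - R_j^2)\), where \(R_j^2\) comes from regressing covariate \(j\) on all the other covariates. The sketch below computes it with NumPy on synthetic data. Everything about the data is an assumption made for illustration: the 0/1 `female` coding, the age ranges, and the rule that years since degree is roughly age minus years of education minus 6. `vif` is a hypothetical helper, not a library function.

```python
import numpy as np


def vif(X):
    """Variance inflation factor per column: 1 / (1 - R^2) from regressing it on the others."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        target = X[:, j]
        # Regress column j on an intercept plus every other column.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - resid.var() / target.var()
        out[j] = 1.0 / (1.0 - r2)
    return out


# Synthetic covariates (assumption: invented purely to illustrate the diagnostic).
rng = np.random.default_rng(1)
n = 1000
female = rng.integers(0, 2, size=n).astype(float)
education_years = rng.integers(12, 21, size=n).astype(float)
age = 6 + education_years + rng.uniform(0, 40, size=n)
# Nearly a linear combination of age and education, plus a little noise.
years_since_degree = age - education_years - 6 + rng.normal(scale=0.5, size=n)

names = ["female", "age", "education_years", "years_since_degree"]
X = np.column_stack([female, age, education_years, years_since_degree])
vifs = vif(X)

for name, value in zip(names, vifs):
    print(f"{name}: VIF = {value:.1f}")
```

A frequently quoted convention treats a VIF above roughly 5 to 10 as a warning sign. In this synthetic setup the inflation concentrates on age, education, and years since degree, while the `female` indicator stays near 1 because it was generated independently of them. Collinearity among the controls inflates the variance of the gender coefficient only to the extent that gender is itself predictable from those controls.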
