Answer both parts.
Part 1: Coefficients under feature transformation
You have original predictors x1, x2 and define transformed predictors:
- z1 = x1 + x2
- z2 = x1 − x2
Consider two linear regression specifications (with the same response y):
- Model A: y = α0 + α1 x1 + α2 x2 + ε
- Model B: y = β0 + β1 z1 + β2 z2 + ε
- What is the relationship between (α1, α2) and (β1, β2)?
- Are Model A and Model B equivalent (i.e., do they represent the same set of functions of x1, x2)? State any assumptions.
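One way to sanity-check an answer to Part 1 is a small numerical experiment. The sketch below is illustrative only: the simulated data, the seed, and the true coefficients are assumptions that do not come from the question. It fits both specifications by ordinary least squares and compares them using the identity β1 z1 + β2 z2 = (β1 + β2) x1 + (β1 − β2) x2, which follows from substituting the definitions of z1 and z2.

```python
import numpy as np

# Simulated data; sample size, seed, and true coefficients are arbitrary assumptions.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(scale=0.1, size=n)

# Transformed predictors as defined in the question.
z1 = x1 + x2
z2 = x1 - x2

# Design matrices with an intercept column.
XA = np.column_stack([np.ones(n), x1, x2])  # Model A
XB = np.column_stack([np.ones(n), z1, z2])  # Model B

# Ordinary least squares fits.
alpha, *_ = np.linalg.lstsq(XA, y, rcond=None)
beta, *_ = np.linalg.lstsq(XB, y, rcond=None)

# Map Model B's coefficients back to the x-scale:
# beta1*z1 + beta2*z2 = (beta1 + beta2)*x1 + (beta1 - beta2)*x2
alpha_from_beta = np.array([beta[0], beta[1] + beta[2], beta[1] - beta[2]])

fitted_A = XA @ alpha
fitted_B = XB @ beta

print("alpha (Model A):        ", alpha)
print("alpha implied by beta:  ", alpha_from_beta)
print("max fitted-value gap:   ", np.max(np.abs(fitted_A - fitted_B)))
```

The two coefficient vectors and the fitted values should agree up to floating-point error here because the map from (x1, x2) to (z1, z2) is invertible and the fit is unpenalized. With a penalty such as ridge or lasso, the two parameterizations would generally not give the same fit.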
Part 2: Multicollinearity in a wage/gender regression
You want to estimate/assess the relationship between gender and income using a regression model with additional covariates such as:
- age
- education level
- years since receiving degree
- Explain whether multicollinearity could be present and why.
- Describe how you would diagnose it.
- If it exists, explain multiple ways to address it, and discuss trade-offs (interpretability vs variance vs bias).
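For the diagnosis step, one common tool is the variance inflation factor (VIF), computed by regressing each covariate on the others. The sketch below is a minimal illustration on simulated data and is not a prescribed answer. The helper `vif`, the column layout, and the assumed data-generating rule (years since degree roughly equal to age minus 6 minus years of education, plus noise) are all hypothetical choices made for the example.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (X has no intercept column)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        target = X[:, j]
        # Regress column j on an intercept plus all remaining columns.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - resid.var() / target.var()
        out[j] = 1.0 / (1.0 - r2)
    return out

# Simulated covariates; every distribution below is an assumption for illustration.
rng = np.random.default_rng(1)
n = 500
age = rng.integers(30, 65, size=n).astype(float)
education_years = rng.integers(10, 21, size=n).astype(float)
# Assumed near-linear dependence among the three covariates.
years_since_degree = age - 6.0 - education_years + rng.normal(scale=1.0, size=n)
gender = rng.integers(0, 2, size=n).astype(float)  # generated independently here

X = np.column_stack([age, education_years, years_since_degree, gender])
vifs = vif(X)

for name, value in zip(["age", "education_years", "years_since_degree", "gender"], vifs):
    print(f"{name:>20s}: VIF = {value:.1f}")
```

In this simulated setup the inflated VIFs fall on the age-related covariates rather than on gender, because gender was generated independently of them. Real data need not behave this way, and any cutoff for a "high" VIF is a rule of thumb rather than a formal test.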