Linear Regression Technical Screen: OLS Assumptions and Multicollinearity
Context: You are asked to summarize core OLS assumptions, explain multicollinearity, and discuss diagnostics, remedies, and implications for the design matrix.
Tasks
- List and explain the standard OLS assumptions:
  - Linearity in parameters
  - Independence of errors / no autocorrelation
  - Homoscedasticity (constant error variance)
  - Normality of errors (needed only for exact small-sample inference)
  - No perfect multicollinearity
  - Correct specification, including exogeneity (errors uncorrelated with the regressors)
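A minimal sketch of the setting these assumptions describe, using simulated data (all names and parameter values below are illustrative, not part of the screen): generate data that satisfies the classical assumptions, fit OLS by least squares, and confirm the estimates land near the true coefficients.

```python
import numpy as np

# Illustrative simulation: i.i.d., homoscedastic, normal errors and a
# design matrix with an intercept column (no perfect collinearity).
rng = np.random.default_rng(0)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(scale=1.0, size=n)

# OLS solves beta_hat = argmin ||y - X b||^2; lstsq does this stably via SVD
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta_hat

print(beta_hat)          # close to [1.0, 2.0, -0.5]
print(residuals.mean())  # essentially zero, since an intercept is included
```

With an intercept in the model, OLS residuals sum to zero by construction, which is a quick sanity check in an interview.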
- Define multicollinearity and describe its effects on:
  - Coefficient variance and stability
  - Confidence intervals and p-values
  - Note that OLS point estimates remain unbiased under exogeneity; collinearity inflates their variance rather than biasing them
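The unbiased-but-high-variance point is easy to demonstrate by simulation. The sketch below (hypothetical values; `rho` controls the predictor correlation) fits OLS repeatedly on correlated and uncorrelated designs: the average estimates match the true coefficients either way, but the spread blows up when the predictors are nearly collinear.

```python
import numpy as np

# Illustrative Monte Carlo: unbiasedness survives collinearity,
# precision does not.
rng = np.random.default_rng(1)
n, reps = 200, 400
beta = np.array([1.0, 1.0])

def slope_estimates(rho):
    """Fit OLS `reps` times on data whose two predictors have correlation rho."""
    cov = np.array([[1.0, rho], [rho, 1.0]])
    est = np.empty((reps, 2))
    for r in range(reps):
        X = rng.multivariate_normal(np.zeros(2), cov, size=n)
        y = X @ beta + rng.normal(size=n)
        est[r] = np.linalg.lstsq(X, y, rcond=None)[0]
    return est

low, high = slope_estimates(0.0), slope_estimates(0.95)
print(low.mean(axis=0), high.mean(axis=0))  # both near [1, 1]: unbiased
print(low.std(axis=0), high.std(axis=0))    # far larger spread at rho = 0.95
```

The standard-error inflation here is exactly what widens confidence intervals and weakens p-values for the individual coefficients.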
- Show how to diagnose multicollinearity:
  - Pairwise correlation matrix
  - Variance Inflation Factors (VIF) and common rule-of-thumb thresholds (e.g., VIF above 5 or 10)
  - Condition number of the design matrix
  - Eigenvalue-based analysis of X'X
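These diagnostics can be computed by hand with NumPy; the sketch below (illustrative data, and a hand-rolled `vif` helper rather than any library routine) regresses each column on the others to get VIF = 1/(1 − R²), then takes the condition number of the design matrix.

```python
import numpy as np

# Illustrative data: x1 and x2 are nearly collinear, x3 is independent.
rng = np.random.default_rng(2)
n = 1000
x1 = rng.normal(size=n)
x2 = 0.95 * x1 + 0.05 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

def vif(X, j):
    """VIF_j = 1 / (1 - R^2) from regressing column j on the other columns."""
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(X)), others])  # include an intercept
    coef, *_ = np.linalg.lstsq(A, X[:, j], rcond=None)
    resid = X[:, j] - A @ coef
    r2 = 1.0 - resid.var() / X[:, j].var()
    return 1.0 / (1.0 - r2)

print([round(vif(X, j), 1) for j in range(X.shape[1])])  # huge for x1, x2
print(np.linalg.cond(X))  # ratio of largest to smallest singular value
```

The condition number is the singular-value ratio of X, and the small eigenvalues of X'X identify the near-dependent directions, which ties the last two diagnostics together.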
- Propose remedies and discuss trade-offs:
  - Collect more data
  - Drop or combine redundant features
  - Center variables before forming interaction terms
  - Regularization (ridge / LASSO / elastic net)
  - Dimension reduction (PCA / PLS)
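Of these remedies, ridge has a clean closed form worth sketching: adding λI to X'X makes the matrix well conditioned and shrinks the unstable directions. The example below is a hand-rolled illustration on hypothetical nearly collinear data, not a production implementation.

```python
import numpy as np

# Illustrative comparison: OLS vs. ridge on a nearly collinear pair.
# Ridge closed form: beta_ridge = (X'X + lambda * I)^{-1} X'y
rng = np.random.default_rng(3)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)  # nearly collinear with x1
X = np.column_stack([x1, x2])
y = x1 + x2 + rng.normal(size=n)     # each predictor truly contributes 1

lam = 1.0
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

print(beta_ols)    # the split between x1 and x2 is unstable across seeds
print(beta_ridge)  # pulled toward a stable, roughly equal split near [1, 1]
```

The trade-off to mention: ridge introduces a small bias in exchange for a large variance reduction, and λ must be tuned (e.g., by cross-validation).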
- If two predictors are perfectly collinear, what happens to X'X, and how do implementations typically handle it?
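The expected answer can be verified directly: with an exact linear dependence, X'X is rank deficient and has no inverse, so software either drops a redundant column (as many regression packages do) or returns a pseudoinverse-based minimum-norm solution. A small illustration, with made-up data:

```python
import numpy as np

# Perfect collinearity: x2 is an exact multiple of x1, so X'X is singular.
rng = np.random.default_rng(4)
n = 100
x1 = rng.normal(size=n)
x2 = 2.0 * x1
X = np.column_stack([x1, x2])
y = x1 + rng.normal(size=n)

XtX = X.T @ X
print(np.linalg.matrix_rank(XtX))  # 1, not 2: (X'X)^{-1} does not exist
print(np.linalg.det(XtX))          # zero up to floating-point error

# lstsq does not fail: it returns the SVD-based minimum-norm solution,
# which lies in the row space of X (here, the direction [1, 2]).
beta_min_norm, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_min_norm)
```

Any beta with b1 + 2·b2 equal to the fitted slope predicts identically, so the individual coefficients are not identified; the minimum-norm answer is just one conventional representative.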