This question evaluates mastery of linear regression diagnostics and inference, including the Gauss–Markov assumptions and their role in establishing BLUE versus supporting hypothesis tests, OLS residual properties and orthogonality, multicollinearity and the variance inflation factor (VIF), the effects of predictor scaling, and robustness to heavy tails and heteroskedasticity.
Given a linear model y = Xβ + ε fitted by OLS on 10,000 observations:
(a) State the Gauss–Markov assumptions, and say which are needed for OLS to be BLUE and which further assumptions are needed for inference.
(b) Show why OLS residuals sum to zero when the model includes an intercept, and why the residuals are orthogonal to the column space of X.
(c) You observe multicollinearity among three predictors; define and compute the VIF, and propose two remedies.
(d) If you scale one predictor by a factor of 100, how do the coefficients, standard errors, R^2, and predictions change?
(e) Interpret a QQ-plot and a residual-vs-fitted plot that show heavy tails and heteroskedasticity; propose robust alternatives (e.g., HC3 standard errors, WLS) and appropriate hypothesis tests.
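The properties asked about in parts (b) through (e) can be checked numerically. The following is a minimal NumPy sketch on simulated data, not a reference solution: the data-generating process, the seed, and the helper names `ols` and `vif` are all assumptions made for illustration. It verifies that residuals sum to zero and are orthogonal to the columns of X (with an intercept in the model), computes VIF_j = 1 / (1 - R_j^2), compares fits before and after multiplying one predictor by 100, and computes HC3 standard errors alongside the classical ones.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Simulated data (illustrative assumption): x2 is nearly collinear with x1,
# and the noise is heavy-tailed with variance that grows with |x1|.
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)
x3 = rng.normal(size=n)
eps = rng.standard_t(df=3, size=n) * (0.5 + np.abs(x1))
y = 1.0 + 2.0 * x1 - 1.0 * x2 + 0.5 * x3 + eps


def ols(X, y):
    """Return coefficients, residuals, classical SEs, HC3 SEs, and R^2."""
    n_obs, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # Classical SEs assume homoskedastic, uncorrelated errors.
    sigma2 = resid @ resid / (n_obs - p)
    se = np.sqrt(sigma2 * np.diag(XtX_inv))
    # HC3: sandwich estimator with squared residuals inflated by 1 / (1 - h_ii)^2.
    h = np.sum((X @ XtX_inv) * X, axis=1)  # leverages h_ii
    w = (resid / (1.0 - h)) ** 2
    meat = X.T @ (X * w[:, None])
    se_hc3 = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))
    r2 = 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)
    return beta, resid, se, se_hc3, r2


def vif(Z):
    """VIF_j = 1 / (1 - R_j^2), regressing column j on the other columns plus an intercept."""
    out = []
    for j in range(Z.shape[1]):
        others = np.column_stack([np.ones(len(Z)), np.delete(Z, j, axis=1)])
        r2_j = ols(others, Z[:, j])[4]
        out.append(1.0 / (1.0 - r2_j))
    return np.array(out)


Z = np.column_stack([x1, x2, x3])
X = np.column_stack([np.ones(n), Z])  # intercept column first
beta, resid, se, se_hc3, r2 = ols(X, y)

# (b) Normal equations X'e = 0; the intercept column gives sum(e) = 0.
print("sum of residuals:", resid.sum())
print("max |X'e|:", np.max(np.abs(X.T @ resid)))

# (c) VIFs for x1, x2, x3.
vifs = vif(Z)
print("VIF:", vifs)

# (d) Multiply x1 by 100 and refit.
X_s = X.copy()
X_s[:, 1] *= 100.0
beta_s, resid_s, se_s, se_hc3_s, r2_s = ols(X_s, y)
print("x1 coefficient, original vs scaled:", beta[1], beta_s[1])
print("x1 classical SE, original vs scaled:", se[1], se_s[1])
print("R^2, original vs scaled:", r2, r2_s)

# (e) Classical vs HC3 standard errors under heteroskedasticity.
print("classical SE:", se)
print("HC3 SE:", se_hc3)
```

In this sketch the scaled fit should give an x1 coefficient and standard error that are each 1/100 of the originals, leaving t-statistics, R^2, and fitted values unchanged. WLS is not shown because it needs a variance model for the errors, which the question leaves open.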