Consider the linear model y = Xb + ε with X ∈ R^{n×(m+1)} including an intercept column.

a) Derive the OLS estimator b̂ = (XᵀX)^{-1}Xᵀy, stating the rank condition on X required for identifiability and the sampling distribution of b̂ under the classical assumptions.

b) Now suppose m > n. Describe at least three viable approaches (e.g., ridge: b̂_ridge = (XᵀX + λI)^{-1}Xᵀy; lasso; elastic net; forward selection; PCA/PLS regression), including how you would choose λ and check generalization (cross-validation details).

c) When does the Moore–Penrose pseudoinverse give a reasonable minimum-norm solution, and what are its drawbacks?

d) Explain why naive upsampling (duplicating rows) does not resolve rank deficiency and can harm inference.
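Parts b) and c) can be illustrated numerically. The sketch below, assuming only NumPy and an invented toy design matrix (the sizes, grid of λ values, and the simple K-fold helper are all illustrative choices, not part of the problem), compares the minimum-norm pseudoinverse fit with a ridge fit whose λ is picked by cross-validated MSE:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 20, 50                        # more predictors than observations (m > n)
X = rng.normal(size=(n, m))
beta_true = np.zeros(m)
beta_true[:3] = [2.0, -1.0, 0.5]     # sparse ground truth (illustrative)
y = X @ beta_true + 0.1 * rng.normal(size=n)

# Minimum-norm least-squares solution via the Moore–Penrose pseudoinverse.
# np.linalg.pinv works through the SVD; among all b minimizing ||y - Xb||,
# it returns the one with the smallest Euclidean norm.
b_pinv = np.linalg.pinv(X) @ y

def ridge(X, y, lam):
    """Ridge estimator b̂_ridge = (XᵀX + λI)^{-1} Xᵀ y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def cv_mse(X, y, lam, k=5):
    """Mean held-out squared error over k folds for a given λ."""
    idx = np.arange(X.shape[0])
    errs = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)
        b = ridge(X[train], y[train], lam)
        errs.append(np.mean((y[fold] - X[fold] @ b) ** 2))
    return np.mean(errs)

lams = np.logspace(-3, 2, 20)
best_lam = min(lams, key=lambda l: cv_mse(X, y, l))
b_ridge = ridge(X, y, best_lam)

# With m > n the pseudoinverse solution interpolates the training data
# (near-zero training error), which is exactly why held-out error, not
# training fit, must guide the choice of λ.
print("pinv  train MSE:", np.mean((y - X @ b_pinv) ** 2))
print("ridge train MSE:", np.mean((y - X @ b_ridge) ** 2))
```

The pseudoinverse fit achieves essentially zero training error here because X has full row rank, which shows its main drawback: it says nothing about generalization, and in near-rank-deficient directions its coefficients can blow up, whereas ridge damps exactly those directions.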
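The claim in part d) can also be checked directly. A minimal sketch, assuming NumPy and a fabricated small design matrix, shows that duplicating rows adds no new directions to the row space (so the rank, and hence identifiability of b, is unchanged) while mechanically scaling XᵀX, which deflates naive standard-error formulas if the copies are treated as fresh observations:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(4, 10))   # rank-deficient: rank(X) = 4 < 10 columns
X_up = np.vstack([X, X, X])    # naive upsampling: duplicate every row 3x

# Rank is unchanged, so b is no more identifiable than before.
print(np.linalg.matrix_rank(X), np.linalg.matrix_rank(X_up))

# XᵀX is merely scaled by the duplication factor; plugging X_up into the
# classical variance formula σ²(XᵀX)^{-1} would overstate precision.
print(np.allclose(X_up.T @ X_up, 3 * (X.T @ X)))
```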