Compute the ridge (L2) estimate w_ridge = (X'X + λI)^{-1} X'y for λ = 10. Report numeric weights to 2 decimal places and explain why L2 prefers sharing weight under high collinearity.
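As a check on the closed form above, here is a minimal sketch for a design with exactly two columns, where the 2x2 system can be solved by hand. The function name `ridge_two_features` and the toy `X` and `y` are hypothetical and are not the data for this exercise; substitute the actual design matrix and response.

```python
def ridge_two_features(X, y, lam):
    """Closed-form ridge w = (X'X + lam*I)^{-1} X'y for a two-column design."""
    # Entries of the regularized Gram matrix X'X + lam*I.
    a = sum(r[0] * r[0] for r in X) + lam
    b = sum(r[0] * r[1] for r in X)
    d = sum(r[1] * r[1] for r in X) + lam
    # Entries of X'y.
    g0 = sum(r[0] * t for r, t in zip(X, y))
    g1 = sum(r[1] * t for r, t in zip(X, y))
    # Solve the 2x2 linear system with the explicit inverse.
    det = a * d - b * b
    return ((d * g0 - b * g1) / det, (a * g1 - b * g0) / det)


# Hypothetical toy data (NOT the exercise's data): two nearly identical
# columns, with y depending on the first column only.
X = [[1.0, 1.1], [2.0, 1.9], [3.0, 3.1], [4.0, 3.9]]
y = [2.0, 4.0, 6.0, 8.0]

w_ols = ridge_two_features(X, y, lam=0.0)     # no penalty: weight sits on column 1
w_ridge = ridge_two_features(X, y, lam=10.0)  # ridge: weight is shared and shrunk
print("lam=0 :", [round(v, 2) for v in w_ols])
print("lam=10:", [round(v, 2) for v in w_ridge])
```

On this toy design the unpenalized fit puts the weight on the first column, while the ridge fit splits it almost evenly. For a fixed combined effect w1 + w2, the squared penalty w1^2 + w2^2 is smallest when the two weights are equal, which is the sharing behavior the question asks about.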
Without solving the full LASSO, argue which coefficient pattern the L1 solution is likely to produce for λ_1 = 10: both coefficients similar and small, one near zero and the other large, or something else? Justify your answer using the geometry of the L1 and L2 constraint sets under high collinearity.
Propose elastic-net penalties (α and λ) that would stabilize selection while controlling variance. Explain how you would tune α and λ and which validation metric you would pick if the goal is sparse interpretation with minimal loss in PR-AUC.
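To see how the mixing parameter changes selection, the following sketch runs coordinate descent on a squared-loss elastic net. Several things here are assumptions rather than part of the exercise: the objective is taken as (1/2)||y - Xw||^2 + λ(α||w||_1 + ((1 - α)/2)||w||^2) with α in [0, 1] and α = 1 meaning pure lasso; the names `soft_threshold` and `elastic_net_cd` and the toy data are hypothetical; and squared loss stands in for whatever classification loss a PR-AUC target would actually use.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def elastic_net_cd(X, y, lam, alpha, n_iter=200):
    """Cyclic coordinate descent for the assumed elastic-net objective
    0.5*||y - Xw||^2 + lam*(alpha*||w||_1 + 0.5*(1 - alpha)*||w||^2)."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of column j with the residual that ignores w[j].
            rho = 0.0
            for i in range(n):
                others = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - others)
            # L1 part thresholds, L2 part inflates the denominator.
            w[j] = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1.0 - alpha))
    return w


# Hypothetical toy data (NOT the exercise's data): two nearly identical columns.
X = [[1.0, 1.1], [2.0, 1.9], [3.0, 3.1], [4.0, 3.9]]
y = [2.0, 4.0, 6.0, 8.0]

w_lasso = elastic_net_cd(X, y, lam=10.0, alpha=1.0)  # pure L1
w_enet = elastic_net_cd(X, y, lam=10.0, alpha=0.5)   # equal L1/L2 mix
print("alpha=1.0:", [round(v, 2) for v in w_lasso])
print("alpha=0.5:", [round(v, 2) for v in w_enet])
```

On this toy design the pure-L1 fit keeps one of the two collinear columns and zeroes the other, while the mixed penalty keeps both with similar weights. One possible tuning loop, under the same assumptions, would evaluate a grid of (α, λ) pairs by cross-validation, score each on held-out PR-AUC, and prefer the sparsest setting whose score stays within a chosen tolerance of the best.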