
Explain why LASSO performs feature selection. Provide: 1) high-level intuition comparing the L1 and L2 penalties; 2) a geometric interpretation of the constraint region and why coefficients hit exactly zero; 3) the KKT/subgradient condition under which a coefficient is exactly zero; 4) the effect of correlated predictors on selection stability; 5) why standardization matters and what happens if you omit it; 6) how lambda is chosen and how it shifts the bias–variance trade-off; 7) when Elastic Net is preferable and why.