This question evaluates understanding of regularization and feature selection in linear models, within the Machine Learning domain for Data Scientist roles. It covers LASSO's L1 penalty versus the L2 penalty, geometric intuition for the constraint region, optimality (KKT) conditions, the effects of correlated predictors, the role of standardization, hyperparameter selection, and when Elastic Net is appropriate. It is commonly asked because it probes both conceptual understanding and practical application of model sparsity, interpretability, preprocessing, and bias–variance trade-offs, testing knowledge of statistical optimization and model selection rather than implementation details.

Explain why LASSO performs feature selection. Provide: 1) high-level intuition comparing L1 vs. L2 penalties; 2) geometric interpretation of the constraint region and why coefficients hit exact zero; 3) the KKT/subgradient condition for when a coefficient becomes zero; 4) the effect of correlated predictors on selection stability; 5) why standardization matters and what happens if you omit it; 6) how lambda is chosen and how it shifts bias–variance; 7) when Elastic Net is preferable and why.
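
As a starting point for parts 1 to 3, the contrast between the two penalties can be sketched for a single coefficient. This is a minimal illustration rather than a full answer, and it assumes an orthonormal (standardized, uncorrelated) design so that each coefficient can be solved separately. Under that assumption the L1 solution is soft-thresholding of the least-squares estimate z, which is exactly zero whenever |z| <= lambda (the subgradient condition), while the L2 solution only rescales z. The function names and the example values are hypothetical.

```python
def soft_threshold(z, lam):
    """Minimizer of 0.5*(z - b)**2 + lam*abs(b): the L1 (LASSO) case.

    Assumes an orthonormal design, so z is the least-squares estimate
    of this coefficient. Returns exactly 0.0 when abs(z) <= lam.
    """
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def ridge_shrink(z, lam):
    """Minimizer of 0.5*(z - b)**2 + 0.5*lam*b**2: the L2 (ridge) case.

    Shrinks proportionally, so a nonzero z never becomes exactly zero.
    """
    return z / (1.0 + lam)


# Hypothetical least-squares estimates for four standardized predictors.
ols = [3.0, -0.4, 0.9, -2.5]
lam = 1.0

lasso = [soft_threshold(z, lam) for z in ols]
ridge = [ridge_shrink(z, lam) for z in ols]

print("lasso:", lasso)  # coefficients with abs(z) <= lam are set to exactly zero
print("ridge:", ridge)  # every coefficient is shrunk but stays nonzero
```

With correlated or unstandardized predictors the coefficients no longer decouple this way, which is where parts 4, 5, and 7 of the question come in.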