Regularization choices for modeling contribution per order (p=50)
Context: You are building a linear model for contribution per order (continuous outcome) with about p = 50 covariates that include:
- Highly correlated marketing dummy variables (e.g., overlapping campaigns, channels)
- Weather variables
- Daypart indicators
Assume predictors are standardized and that a binary treatment indicator D (e.g., exposed vs. not exposed to a marketing action) is of substantive interest for inference.
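For concreteness, the setup can be written as a penalized least-squares problem. This is a sketch under stated assumptions: the text does not fix a parameterization, so the one below borrows the common glmnet-style convention, and the symbols β_0, τ, β, and x_i are introduced here only for illustration. Leaving the intercept and the treatment coefficient τ outside the penalty is one modeling choice, not something the setup mandates.

```latex
\min_{\beta_0,\,\tau,\,\beta}\;
  \frac{1}{2n}\sum_{i=1}^{n}\bigl(y_i - \beta_0 - \tau D_i - x_i^{\top}\beta\bigr)^2
  + \lambda\Bigl[\alpha\,\lVert\beta\rVert_1
  + \tfrac{1-\alpha}{2}\,\lVert\beta\rVert_2^2\Bigr],
  \qquad \lambda \ge 0,\; 0 \le \alpha \le 1 .
```

Under this convention α = 1 gives the Lasso, α = 0 gives Ridge, intermediate values give Elastic Net, and λ sets the overall penalty strength.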
Tasks
- Lasso vs. Ridge
  - Explain the bias–variance trade‑offs of L1 (Lasso) and L2 (Ridge).
  - Contrast their variable selection behavior under correlated groups of predictors.
  - Discuss how each affects uncertainty quantification for treatment effects, including best practices to avoid bias in the estimated treatment coefficient.
- Elastic Net and tuning for valid inference
  - Describe when Elastic Net strictly dominates using Lasso or Ridge alone in this setting.
  - Explain how you would tune α and λ via cross‑validation, and how to keep inference valid after model selection (e.g., post‑selection refitting, stability selection).
- Interactions and heterogeneity
  - Discuss how regularization interacts with collinearity when you include treatment×covariate interactions (D×X), and the risk of shrinking true heterogeneous effects to zero.
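One commonly used way to keep shrinkage from acting directly on the treatment coefficient, relevant to the uncertainty-quantification task above, is to exclude D (and the intercept) from the penalty. The following is a minimal sketch of that idea for Ridge, not a definitive implementation. The function name `ridge_unpenalized_treatment`, the simulated data, and the true effect of 2.0 are all hypothetical, and `lam` here is on the raw sum-of-squares scale rather than any particular library's scale.

```python
import numpy as np

def ridge_unpenalized_treatment(y, d, X, lam):
    """Ridge on controls X; intercept and treatment d are left unpenalized.

    Minimizes ||y - a - tau*d - X @ beta||^2 + lam * ||beta||^2.
    Returns (intercept, tau, beta).
    """
    n, p = X.shape
    Z = np.column_stack([np.ones(n), d, X])   # design: intercept, D, controls
    penalty = np.zeros(p + 2)
    penalty[2:] = lam                          # penalize only the controls
    theta = np.linalg.solve(Z.T @ Z + np.diag(penalty), Z.T @ y)
    return theta[0], theta[1], theta[2:]

# Hypothetical simulated data, used only to exercise the function.
rng = np.random.default_rng(0)
n, p = 2000, 50
common = rng.normal(size=(n, 1))
X = 0.8 * common + 0.6 * rng.normal(size=(n, p))       # correlated controls
X = (X - X.mean(axis=0)) / X.std(axis=0)               # standardized, as in the setup
d = (X[:, 0] + rng.normal(size=n) > 0).astype(float)   # treatment depends on X[:, 0]
beta_true = np.zeros(p)
beta_true[:5] = 1.0
y = 2.0 * d + X @ beta_true + rng.normal(size=n)

for lam in (0.0, 10.0, 1000.0):
    _, tau, beta = ridge_unpenalized_treatment(y, d, X, lam)
    print(f"lambda={lam:>7}: tau_hat={tau:.3f}, ||beta||={np.linalg.norm(beta):.3f}")
```

With `lam = 0` this reduces to ordinary least squares. As `lam` grows, the control coefficients shrink toward zero, and because the treatment here is correlated with a control, the estimate of tau can drift even though tau itself is not penalized. That is one reason heavy shrinkage of confounders is a concern for inference on D.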