Scenario
You are building a customer propensity model to predict the probability that a user will take a desired action (e.g., purchase, subscribe). You have mixed feature types from transactions, web/app activity, and demographics.
Task
Answer the following practical feature-engineering questions for this setting.
Questions
- When should we standardize or normalize variables? (Discuss the impact of scaling on different algorithms.)
- How would you handle numeric predictors that contain many null or zero values? (Discuss imputation versus flagging and approaches for zero-inflated features.)
- If several features are highly correlated, how would you decide which one(s) to keep? (Discuss multicollinearity detection and remedies.)
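
For orientation, the three techniques named in the questions can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not a reference answer: the helper names (`standardize`, `flag_and_impute`, `correlated_pairs`), the use of `None` to represent nulls, median imputation, and the 0.9 correlation threshold are all hypothetical choices made for the example.

```python
import math
from statistics import mean, median, pstdev


def standardize(values):
    """Z-score scaling: subtract the mean, divide by the population std."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no signal; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def flag_and_impute(values):
    """Median-impute nulls (None) and emit is_missing / is_zero indicators.

    The indicator columns let a model treat "missing" and "exactly zero"
    as distinct signals instead of hiding them behind the imputed value.
    """
    observed = [v for v in values if v is not None]
    fill = median(observed)
    is_missing = [int(v is None) for v in values]
    is_zero = [int(v is not None and v == 0) for v in values]
    imputed = [fill if v is None else v for v in values]
    return imputed, is_missing, is_zero


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # undefined for constant columns; treated as 0 here
    return cov / (sx * sy)


def correlated_pairs(columns, threshold=0.9):
    """List (name_a, name_b, r) for column pairs with |r| >= threshold."""
    names = list(columns)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(columns[a], columns[b])
            if abs(r) >= threshold:
                pairs.append((a, b, r))
    return pairs
```

Pairwise correlation only detects two-way redundancy; which member of a flagged pair to keep still depends on criteria the questions ask you to discuss, such as coverage, interpretability, and relationship to the target.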