Technical Screening: Model Development Discussion
Context
You are building classification and regression models on tabular business data with missing values and potential outliers. You must choose appropriate data treatments, evaluation metrics, and modeling approaches suitable for production.
Tasks
- Missing Values
  - Describe at least two methods to handle missing values in a training set.
  - For each method, state the pros and cons.
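As a point of reference for this task, the sketch below contrasts two common treatments: listwise deletion and median imputation with a missing-value indicator. It is a minimal illustration, not a prescribed answer. The helper names (`drop_missing`, `impute_median`) and the toy rows are hypothetical, and missing values are assumed to be encoded as `None`.

```python
from statistics import median

def drop_missing(rows):
    """Listwise deletion: keep only rows that contain no None values."""
    return [r for r in rows if all(v is not None for v in r)]

def impute_median(rows, col):
    """Fill None in column `col` with the median of the observed values
    and append a 0/1 flag marking which rows were imputed."""
    observed = [r[col] for r in rows if r[col] is not None]
    fill = median(observed)
    out = []
    for r in rows:
        was_missing = r[col] is None
        new_row = list(r)
        if was_missing:
            new_row[col] = fill
        new_row.append(1 if was_missing else 0)
        out.append(new_row)
    return out

# Hypothetical toy data: second column has one missing value.
rows = [[1.0, 10.0], [2.0, None], [3.0, 30.0], [4.0, 50.0]]
dropped = drop_missing(rows)          # loses one row of data
imputed = impute_median(rows, col=1)  # keeps all rows, adds an indicator
```

Deletion is simple but discards data and can bias the sample if values are not missing at random. Imputation keeps every row, but the fill value should be computed on the training set only and reused at inference time to avoid leakage.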
- Outliers
  - Provide two strategies for treating outliers.
  - Explain when you would prefer each strategy.
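To make the two-strategy comparison concrete, the sketch below shows removal versus capping (winsorizing) using Tukey-style IQR fences. This is one possible pair of strategies among several. The function names, the multiplier `k=1.5`, and the sample values are assumptions chosen for illustration.

```python
from statistics import quantiles

def iqr_fences(values, k=1.5):
    """Return (lower, upper) fences at k * IQR beyond the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def remove_outliers(values, k=1.5):
    """Strategy 1: drop any value outside the fences."""
    lo, hi = iqr_fences(values, k)
    return [v for v in values if lo <= v <= hi]

def cap_outliers(values, k=1.5):
    """Strategy 2: clip values to the fences, keeping every row."""
    lo, hi = iqr_fences(values, k)
    return [min(max(v, lo), hi) for v in values]

# Hypothetical sample with one extreme value.
values = [10, 12, 11, 13, 12, 11, 100]
removed = remove_outliers(values)
capped = cap_outliers(values)
```

Removal tends to suit values that are clearly data-entry or measurement errors. Capping tends to suit genuine but extreme observations, where the row carries useful information in its other features.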
- Model Evaluation
  - Which metrics would you use to evaluate classification and regression models?
  - Justify your choices and note any pitfalls.
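The sketch below computes a few metrics a candidate might name: precision, recall, and F1 for binary classification, and MAE and RMSE for regression. It also shows one well-known pitfall, accuracy on imbalanced labels. The function names and toy labels are hypothetical, and the choice of metrics is illustrative rather than exhaustive.

```python
import math

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def precision_recall_f1(y_true, y_pred):
    """Binary metrics for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

def mae(y_true, y_pred):
    """Mean absolute error: less sensitive to large individual errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: penalizes large errors more heavily."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Pitfall: a model that always predicts the majority class.
imbalanced_true = [0] * 9 + [1]
always_negative = [0] * 10
acc = accuracy(imbalanced_true, always_negative)
_, rec, _ = precision_recall_f1(imbalanced_true, always_negative)

# Regression example with hypothetical targets and predictions.
reg_mae = mae([3.0, 5.0, 2.0, 7.0], [2.0, 5.0, 4.0, 7.0])
reg_rmse = rmse([3.0, 5.0, 2.0, 7.0], [2.0, 5.0, 4.0, 7.0])
```

In the imbalanced example, accuracy looks high while recall on the positive class is zero, which is why accuracy alone can mislead when classes are skewed.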
- Algorithm Walkthrough
  - Pick one machine-learning algorithm and explain how it works step by step.
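As one possible choice for this walkthrough, the sketch below implements a toy gradient-boosting regressor with depth-1 trees (stumps) on a single feature and squared-error loss. It is a simplified teaching version, not how XGBoost or any production library is implemented. All names, the data, and the defaults (`n_rounds=20`, `learning_rate=0.3`) are assumptions.

```python
def fit_stump(xs, residuals):
    """Find the threshold on a 1-D feature that minimizes squared error
    when each side predicts the mean of its residuals."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # both sides stay non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1], best[2], best[3]

def boost(xs, ys, n_rounds=20, learning_rate=0.3):
    """Step 1: start from the mean. Step 2: fit a stump to the residuals.
    Step 3: add a shrunken copy of the stump. Repeat steps 2 and 3."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        threshold, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((threshold, left_mean, right_mean))
        preds = [p + learning_rate * (left_mean if x <= threshold else right_mean)
                 for x, p in zip(xs, preds)]
    return base, learning_rate, stumps

def predict(model, x):
    base, learning_rate, stumps = model
    return base + sum(learning_rate * (lm if x <= t else rm) for t, lm, rm in stumps)

def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)

# Hypothetical step-shaped data.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 3.1, 3.0, 3.2]
model = boost(xs, ys)
base_mse = mse(ys, [sum(ys) / len(ys)] * len(ys))
final_mse = mse(ys, [predict(model, x) for x in xs])
```

Each round fits the current errors rather than the raw targets, and the learning rate shrinks each correction so that no single stump dominates the ensemble.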
- XGBoost Hyperparameters
  - List the key hyperparameters in XGBoost and explain their impact on the model.
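For orientation, the dictionary below lists commonly tuned hyperparameters using the names from XGBoost's scikit-learn-style interface, with a comment on the usual effect of each. The values are illustrative placeholders, not recommendations, and the variable name `xgb_params` is hypothetical. Sensible settings depend on the dataset and should come from validation.

```python
# Illustrative values only; tune on a validation set or with cross-validation.
xgb_params = {
    # Number of boosting rounds (trees). More trees fit the training data
    # more closely; usually tuned together with learning_rate.
    "n_estimators": 300,
    # Shrinkage applied to each tree's contribution. Smaller values need
    # more trees but often generalize better.
    "learning_rate": 0.1,
    # Maximum tree depth. Deeper trees capture more feature interactions
    # at a higher risk of overfitting.
    "max_depth": 6,
    # Minimum sum of instance weight (hessian) required in a child node.
    # Larger values make splitting more conservative.
    "min_child_weight": 1,
    # Fraction of training rows sampled for each tree.
    "subsample": 0.8,
    # Fraction of features sampled for each tree.
    "colsample_bytree": 0.8,
    # Minimum loss reduction required to make a further split.
    "gamma": 0.0,
    # L1 regularization on leaf weights.
    "reg_alpha": 0.0,
    # L2 regularization on leaf weights.
    "reg_lambda": 1.0,
}
```

Assuming the `xgboost` package is installed, a dictionary like this could be unpacked into `XGBClassifier(**xgb_params)` or `XGBRegressor(**xgb_params)`.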