Scenario
You are building a production-grade credit-risk scoring model (predicting probability of default within a fixed horizon) for Capital One. The model will be used for underwriting decisions and must meet performance, compliance, and interpretability requirements.
Task
Compare logistic regression, random forest, and gradient boosting for credit-risk modeling. For each, discuss pros and cons in this context. Then describe how you would:
- Evaluate model performance (both discrimination and calibration), including appropriate train/validation splits.
- Handle class imbalance in defaults.
- Ensure model interpretability and compliance-readiness.
Include specific metrics (e.g., ROC-AUC, KS), imbalance techniques (e.g., class weighting, SMOTE), and explainability approaches (e.g., SHAP), and explain how each fits into a regulated credit environment.
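
For reference, the discrimination and calibration metrics named above can be sketched in a few lines. This is a minimal illustration, not a production implementation: it assumes binary labels where 1 means default and scores that are predicted default probabilities. The function names and the toy data are hypothetical and chosen only for this example. ROC-AUC is computed as the share of defaulter/non-defaulter pairs ranked correctly, KS as the largest gap between the two groups' cumulative score distributions, and the Brier score as one simple calibration-sensitive measure.

```python
from typing import Sequence


def _split_scores(y_true: Sequence[int], scores: Sequence[float]):
    """Split scores into defaulters (label 1) and non-defaulters (label 0)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one defaulter and one non-defaulter")
    return pos, neg


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random defaulter scores above a random
    non-defaulter; ties count as half. O(n^2), fine for a sketch."""
    pos, neg = _split_scores(y_true, scores)
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def ks_statistic(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Maximum absolute gap between the empirical score CDFs of
    defaulters and non-defaulters, checked at every observed score."""
    pos, neg = _split_scores(y_true, scores)
    best = 0.0
    for t in sorted(set(scores)):
        cdf_pos = sum(s <= t for s in pos) / len(pos)
        cdf_neg = sum(s <= t for s in neg) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best


def brier_score(y_true: Sequence[int], probs: Sequence[float]) -> float:
    """Mean squared error between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, probs)) / len(y_true)


# Toy data, made up for illustration only (3 defaults out of 10 accounts).
labels = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
pd_hat = [0.05, 0.10, 0.10, 0.20, 0.30, 0.35, 0.40, 0.60, 0.70, 0.90]

print("ROC-AUC:", round(roc_auc(labels, pd_hat), 4))
print("KS:     ", round(ks_statistic(labels, pd_hat), 4))
print("Brier:  ", round(brier_score(labels, pd_hat), 4))
```

In practice these would be computed on an out-of-time validation sample rather than on training data, and a library implementation would normally replace the hand-written loops.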