This question evaluates competency in handling dataset shift and selection bias, adapting and recalibrating credit-risk models for an unobserved applicant subpopulation, and reasoning about model robustness, statistical diagnostics, and safe deployment constraints.
Your current probability-of-default (PD) lending model was trained only on applicants with credit scores ≥ 650 because those were historically considered for lending. Management now wants to evaluate and potentially lend to applicants with scores < 650.
How would you leverage the existing model and available data to score the < 650 population, while addressing dataset shift and selection bias? Outline a practical plan that:
Hints: covariate/selection shift, importance weighting, reject inference, semi-supervised learning, synthetic augmentation, boundary expansion, monotonic constraints, and conservative calibration.
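As one illustration of the importance-weighting hint, the sketch below reweights training rows by a histogram estimate of the density ratio p_target(x) / p_train(x). It is a minimal sketch under stated assumptions, not a recommended production method. The feature (debt-to-income in whole percent), the sample values, the bin width, and the weight cap are all hypothetical. The credit score itself cannot be used as the weighting variable here, because the training data has no mass below 650, so the ratio is computed on a covariate whose ranges overlap across the two populations.

```python
from collections import Counter


def density_ratio_weights(train_values, target_values, bin_width, max_weight=10.0):
    """Per-training-row weight w(x) ~= p_target(x) / p_train(x), from histogram bins.

    Weights are clipped at max_weight (a hypothetical cap) to limit variance.
    """
    train_counts = Counter(v // bin_width for v in train_values)
    target_counts = Counter(v // bin_width for v in target_values)
    n_train, n_target = len(train_values), len(target_values)
    weights = []
    for v in train_values:
        b = v // bin_width
        p_train = train_counts[b] / n_train            # > 0: v is a training row
        p_target = target_counts.get(b, 0) / n_target  # 0 if no target rows in bin
        weights.append(min(p_target / p_train, max_weight))
    return weights


def effective_sample_size(weights):
    """Kish effective sample size: (sum w)^2 / sum(w^2)."""
    total = sum(weights)
    squares = sum(w * w for w in weights)
    return total * total / squares if squares > 0 else 0.0


def uncovered_target_fraction(train_values, target_values, bin_width):
    """Share of target rows in bins with no training rows (no overlap)."""
    train_bins = {v // bin_width for v in train_values}
    outside = sum(1 for v in target_values if v // bin_width not in train_bins)
    return outside / len(target_values)


# Hypothetical debt-to-income values (whole percent), for illustration only.
train_dti = [10, 15, 20, 25, 30, 35, 40, 45]    # applicants with score >= 650
target_dti = [30, 35, 40, 45, 45, 50, 55, 60]   # applicants with score < 650

weights = density_ratio_weights(train_dti, target_dti, bin_width=10)
ess = effective_sample_size(weights)
uncovered = uncovered_target_fraction(train_dti, target_dti, bin_width=10)
```

The weights could be passed as per-row sample weights when refitting or recalibrating the PD model. A low effective sample size, or a large uncovered fraction, suggests that reweighting alone cannot reach that part of the < 650 population and that the other hinted techniques (reject inference, boundary expansion, conservative calibration) would have to carry more of the load.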