A bank wants a model that predicts 90-day credit card default at the account-month level for proactive outreach. Default prevalence in production is about 2%. Design the end-to-end approach and address sampling in depth.
a) Problem framing: Define the label precisely (observation window, prediction date, horizon), features available at scoring time, and a temporal data split that avoids leakage (e.g., train on data up to a cutoff date, validate on future months). List three concrete leakage risks unique to credit cards and how you would detect them.
b) Metrics: Choose evaluation metrics and operating points for imbalanced data (e.g., PR-AUC, recall at 5% FPR, expected utility). Justify why they match business goals under 2% prevalence.
c) Sampling strategy: You downsample negatives to speed training and target a 1:3 positive:negative ratio in the training set. Let π be true prevalence (0.02 in production) and π_s be the training prevalence (0.25 after downsampling). If the model outputs p_s = P(default | x, sampled) = 0.60 for an account, derive and compute the calibrated population probability p_pop = P(default | x) using prior-probability correction under prior shift.
d) Threshold via cost-benefit: Outreach costs 150 loss. Choose the action rule and compute the breakeven probability threshold p* for contacting, then decide whether to contact the account from part (c) using p_pop.
e) Validation: Propose a time-based cross-validation scheme and a backtest showing stability under covariate and prior shift across regions and macro regimes. Include how you would monitor calibration and drift post-deployment and re-tune sampling if the true default rate changes from 2% to 1%.
f) Fairness and operations: Name two fairness checks (e.g., equal opportunity across age groups where legally permitted) and one operational guardrail to prevent over-throttling credit limits due to model uncertainty. Explain how sampling choices can bias these checks if not corrected.
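
For parts (b) and (f), one point worth making concrete is that downsampling negatives leaves class-conditional rates (recall, FPR) unchanged but inflates precision, so any precision-based metric or fairness check computed on the sampled set needs reweighting. The sketch below assumes only negatives were downsampled and that each kept negative stands in for neg_weight population negatives; the function name, the toy labels, and the toy scores are illustrative and not part of the exercise.

```python
from typing import Sequence, Tuple


def weighted_rates(
    y_true: Sequence[int],
    scores: Sequence[float],
    threshold: float,
    neg_weight: float = 1.0,
) -> Tuple[float, float, float]:
    """Return (precision, recall, fpr) with negatives up-weighted by neg_weight."""
    tp = fp = fn = tn = 0.0
    for y, s in zip(y_true, scores):
        flagged = s >= threshold
        if y == 1:
            # Positives were all kept, so each counts once.
            if flagged:
                tp += 1.0
            else:
                fn += 1.0
        else:
            # Each kept negative represents neg_weight population negatives.
            if flagged:
                fp += neg_weight
            else:
                tn += neg_weight
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    fpr = fp / (fp + tn) if (fp + tn) > 0 else 0.0
    return precision, recall, fpr


# Toy sampled set at 1:3 positives to negatives (hypothetical data).
y = [1, 1, 0, 0, 0, 0, 0, 0]
s = [0.9, 0.4, 0.7, 0.3, 0.2, 0.1, 0.2, 0.1]

# Population odds are 2:98 = 1:49 and sampled odds are 1:3,
# so each kept negative represents 49/3 population negatives.
w = (0.98 / 0.02) / (0.75 / 0.25)

raw = weighted_rates(y, s, threshold=0.5)
corrected = weighted_rates(y, s, threshold=0.5, neg_weight=w)
```

On this toy data, recall and FPR are identical in both calls, while precision falls from 0.5 to about 0.058 once the negatives are reweighted. Evaluating on an unsampled, time-ordered holdout avoids the reweighting step altogether.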
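
For part (c), the standard prior-probability correction under prior shift rescales each class by the ratio of population to training priors and renormalizes: p_pop = p_s(π/π_s) / [p_s(π/π_s) + (1 − p_s)((1 − π)/(1 − π_s))]. The following is a minimal sketch under the assumption of pure prior shift, meaning P(x | y) is unchanged by the downsampling; the function name is illustrative.

```python
def prior_shift_correct(p_s: float, pi_s: float, pi: float) -> float:
    """Map a score learned at prevalence pi_s to population prevalence pi.

    Assumes pure prior shift: P(x | y) is the same in the sampled and
    population data, so only the class priors differ.
    """
    if not (0.0 < pi_s < 1.0 and 0.0 < pi < 1.0):
        raise ValueError("prevalences must lie strictly between 0 and 1")
    if not (0.0 <= p_s <= 1.0):
        raise ValueError("p_s must be a probability")
    pos = p_s * (pi / pi_s)                           # reweighted default mass
    neg = (1.0 - p_s) * ((1.0 - pi) / (1.0 - pi_s))   # reweighted non-default mass
    return pos / (pos + neg)


# Values from part (c): p_s = 0.60, pi_s = 0.25, pi = 0.02.
p_pop = prior_shift_correct(p_s=0.60, pi_s=0.25, pi=0.02)
print(round(p_pop, 4))  # 0.0841
```

Equivalently, in odds form: the sampled odds 0.60/0.40 = 1.5 are multiplied by (0.02/0.98)/(0.25/0.75) ≈ 0.0612, giving population odds of 9/98 and p_pop = 9/107 ≈ 0.084. The same function covers the re-tuning case in part (e): if the true default rate moves to 1%, pass pi=0.01 without retraining, provided the shift is still a pure prior shift.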
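
For part (d), a simple breakeven rule is to contact an account when the expected avoided loss exceeds the cost of outreach, that is, when p_pop × avoided_loss > outreach_cost, which gives p* = outreach_cost / avoided_loss. The sketch below assumes this two-parameter cost model. The numeric values are hypothetical placeholders chosen only to make the example run, since the cost figures in part (d) are incomplete as stated; substitute the real figures before drawing any conclusion.

```python
def breakeven_threshold(outreach_cost: float, avoided_loss: float) -> float:
    """Probability above which contacting has positive expected value."""
    if outreach_cost < 0 or avoided_loss <= 0:
        raise ValueError("need outreach_cost >= 0 and avoided_loss > 0")
    return outreach_cost / avoided_loss


def should_contact(p_pop: float, outreach_cost: float, avoided_loss: float) -> bool:
    """Contact when expected avoided loss exceeds the outreach cost."""
    return p_pop * avoided_loss > outreach_cost


# HYPOTHETICAL placeholder values, not taken from the exercise.
OUTREACH_COST = 5.0
AVOIDED_LOSS = 150.0

p_star = breakeven_threshold(OUTREACH_COST, AVOIDED_LOSS)

# Population-calibrated probability for the account in part (c).
p_pop_account = 9.0 / 107.0
decision = should_contact(p_pop_account, OUTREACH_COST, AVOIDED_LOSS)
```

The comparison must use the population-calibrated p_pop, not the raw sampled score p_s = 0.60; applying p* to uncorrected scores would flag far more accounts than the cost model justifies.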
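
For parts (a) and (e), one possible time-based scheme is a rolling-origin split with an embargo between training and validation months, so that the 90-day outcome window of the last training month has closed before the first validation month is scored. The sketch below assumes account-month data keyed by 'YYYY-MM' strings; the function name, parameter names, and the example year are illustrative.

```python
from typing import Iterator, List, Tuple


def rolling_origin_folds(
    months: List[str],
    min_train: int,
    gap: int,
    val_size: int,
) -> Iterator[Tuple[List[str], List[str]]]:
    """Yield (train_months, val_months) pairs in chronological order.

    `gap` months are skipped between the end of training and the start of
    validation, so training labels are fully observed before validation.
    """
    ordered = sorted(set(months))  # 'YYYY-MM' strings sort chronologically
    start = min_train
    while start + gap + val_size <= len(ordered):
        train = ordered[:start]
        val = ordered[start + gap : start + gap + val_size]
        yield train, val
        start += val_size  # expand the training window and roll forward


# Example: 12 months of data, 3-month embargo for the 90-day horizon.
all_months = [f"2022-{m:02d}" for m in range(1, 13)]
folds = list(rolling_origin_folds(all_months, min_train=6, gap=3, val_size=1))
```

With these settings the first fold trains on 2022-01 through 2022-06 and validates on 2022-10. Running the same folds separately per region or macro regime gives a view of stability, and tracking the observed default rate per validation month shows when the prior used for probability correction needs updating.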