This question evaluates a data scientist's competencies in imbalanced binary classification, temporal label definition and leakage detection, sampling and calibration under prior-probability shift, metric and operating-point selection, cost-sensitive decision rules, time-based validation and monitoring, and fairness and operational guardrails.

What difficulty level is this interview question?

This is a Medium difficulty Machine Learning question, commonly asked during Technical Screen rounds at Boston Consulting Group.

What role is this question designed for?

This question is commonly asked for Data Scientist candidates at Boston Consulting Group during technical interviews.

Design and sample for credit default prediction | Boston Consulting Group Interview Question

A bank wants a model to predict 90-day credit card default at account-month level for proactive outreach. Class prevalence in production is about 2% defaults. Design the end-to-end approach and address sampling in depth.

a) Problem framing: Define the label precisely (observation window, prediction date, horizon), features available at scoring time, and a temporal data split that avoids leakage (e.g., train on data up to a cutoff date, validate on future months). List three concrete leakage risks unique to credit cards and how you would detect them.
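One possible shape for the temporal split in part (a) is sketched below. It is an illustration under stated assumptions, not a prescribed answer: the row field `prediction_month` and the helpers `add_months` and `temporal_split` are hypothetical names, and the 90-day horizon is approximated as three calendar months. Rows whose label window straddles the cutoff are dropped so that no training label depends on outcomes observed during the validation period.

```python
from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift a first-of-month date by a whole number of months."""
    idx = d.year * 12 + (d.month - 1) + months
    return date(idx // 12, idx % 12 + 1, 1)


def temporal_split(rows, train_cutoff: date, horizon_months: int = 3):
    """Split account-month rows around a cutoff date.

    Train: rows whose label window (prediction month + horizon) closes
           on or before the cutoff.
    Valid: rows whose prediction month is on or after the cutoff.
    Rows in between are dropped because their labels would only be
    known during the validation period.
    """
    train, valid = [], []
    for row in rows:
        label_end = add_months(row["prediction_month"], horizon_months)
        if label_end <= train_cutoff:
            train.append(row)
        elif row["prediction_month"] >= train_cutoff:
            valid.append(row)
    return train, valid


# Hypothetical data: one row per month of 2023 for a single account.
rows = [{"account_id": 1, "prediction_month": date(2023, m, 1)} for m in range(1, 13)]
train, valid = temporal_split(rows, train_cutoff=date(2023, 7, 1))
print(len(train), len(valid))
```

With a July cutoff, January through April land in training, July through December in validation, and May and June are discarded as the gap.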

b) Metrics: Choose evaluation metrics and operating points for imbalanced data (e.g., PR-AUC, recall at 5% FPR, expected utility). Justify why they match business goals under 2% prevalence.
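To make "recall at 5% FPR" from part (b) concrete, here is a small standard-library sketch. The function name `recall_at_fpr` and the toy data are illustrative assumptions; in practice a library ROC routine would usually be used instead. It sweeps thresholds from the highest score downward and reports the best recall reached while the false positive rate stays within the cap.

```python
def recall_at_fpr(y_true, scores, max_fpr=0.05):
    """Highest recall achievable with FPR <= max_fpr.

    y_true: iterable of 0/1 labels (1 = default).
    scores: model scores, higher means riskier.
    Tied scores are admitted together, since one threshold cannot split them.
    """
    pairs = sorted(zip(scores, y_true), key=lambda t: -t[0])
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    tp = fp = 0
    best_recall = 0.0
    i = 0
    while i < len(pairs):
        j = i
        # Admit every row that shares the current score.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        if fp / n_neg > max_fpr:
            break
        best_recall = tp / n_pos
        i = j
    return best_recall


# Toy example: 2 defaults among 42 accounts.
y_true = [1, 0, 1] + [0] * 39
scores = [0.9, 0.8, 0.3] + [0.1] * 39
print(recall_at_fpr(y_true, scores, max_fpr=0.05))
```

In the toy data, one false positive out of 40 negatives (FPR 2.5%) is enough to capture both defaults, so the recall at the 5% cap is 1.0.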

c) Sampling strategy: You downsample negatives to speed training and target a 1:3 positive:negative ratio in the training set. Let π be true prevalence (0.02 in production) and π_s be the training prevalence (0.25 after downsampling). If the model outputs p_s = P(default | x, sampled) = 0.60 for an account, derive and compute the calibrated population probability p_pop = P(default | x) using prior-probability correction under prior shift:

odds_s = p_s / (1 - p_s)
odds_pop = odds_s × [(π / (1 - π)) / (π_s / (1 - π_s))]
p_pop = odds_pop / (1 + odds_pop)

Provide the numeric p_pop and explain when class weights vs. probability recalibration are sufficient or when you need both.
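The three formulas in part (c) can be checked numerically. The following is a minimal sketch rather than a reference solution; the function name `prior_corrected_probability` is illustrative, and the correction assumes that only the class prior differs between the sampled and production data (negatives dropped uniformly at random).

```python
def prior_corrected_probability(p_s: float, pi: float, pi_s: float) -> float:
    """Map a score from the downsampled distribution back to the population.

    p_s:  model output on the sampled distribution
    pi:   true (production) prevalence
    pi_s: training prevalence after downsampling
    """
    odds_s = p_s / (1.0 - p_s)
    prior_ratio = (pi / (1.0 - pi)) / (pi_s / (1.0 - pi_s))
    odds_pop = odds_s * prior_ratio
    return odds_pop / (1.0 + odds_pop)


p_pop = prior_corrected_probability(p_s=0.60, pi=0.02, pi_s=0.25)
print(f"{p_pop:.4f}")  # 0.0841
```

Step by step: odds_s = 0.60 / 0.40 = 1.5; the prior ratio is (0.02 / 0.98) / (0.25 / 0.75) = 3/49; odds_pop = 1.5 × 3/49 = 9/98; and p_pop = 9/107 ≈ 0.084.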

d) Threshold via cost-benefit: Outreach costs $2 per account. If an account would default and you contact them, there is a 30% chance the intervention averts an average $150 loss. Choose the action rule and compute the breakeven probability threshold p* for contacting, then decide whether to contact the account from part (c) using p_pop.
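A hedged sketch of the breakeven calculation in part (d) follows. It assumes the only cost is the $2 outreach fee, the only benefit is the averted loss, and contacting a non-defaulter has no other effect; the variable and function names are illustrative. Under those assumptions, contacting is worthwhile when p × 0.30 × $150 exceeds $2.

```python
OUTREACH_COST = 2.0       # dollars per contacted account
AVERT_PROBABILITY = 0.30  # chance the intervention works, given a true default
AVERAGE_LOSS = 150.0      # dollars averted when it works


def breakeven_threshold() -> float:
    """Probability at which expected benefit equals outreach cost."""
    return OUTREACH_COST / (AVERT_PROBABILITY * AVERAGE_LOSS)


def expected_net_benefit(p_default: float) -> float:
    """Expected dollars gained by contacting an account with this default risk."""
    return p_default * AVERT_PROBABILITY * AVERAGE_LOSS - OUTREACH_COST


p_star = breakeven_threshold()
# Population probability from the part (c) formulas with
# p_s = 0.60, pi = 0.02, pi_s = 0.25, which works out to 9/107.
p_pop = 9.0 / 107.0
should_contact = p_pop > p_star

print(f"p* = {p_star:.4f}")                                    # p* = 0.0444
print(f"contact = {should_contact}")                           # contact = True
print(f"net benefit = {expected_net_benefit(p_pop):.2f}")      # net benefit = 1.79
```

Note that the comparison must use p_pop rather than the raw sampled score p_s = 0.60; thresholding the uncorrected score would overstate the expected benefit.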

e) Validation: Propose a time-based cross-validation scheme and a backtest showing stability under covariate and prior shift across regions and macro regimes. Include how you would monitor calibration and drift post-deployment and re-tune sampling if the true default rate changes from 2% to 1%.
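For the 2% to 1% scenario in part (e), one option that avoids retraining is to treat the prior correction as an additive offset in log-odds space and update only that offset. The sketch below is an illustration under the assumption of pure prior shift (the relationship between features and default is unchanged); `prior_shift_offset` and `rescore` are hypothetical names, and the breakeven uses the cost figures from part (d).

```python
import math


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def prior_shift_offset(pi_train: float, pi_target: float) -> float:
    """Log-odds offset that moves scores from the training prior to the target prior."""
    return logit(pi_target) - logit(pi_train)


def rescore(p_s: float, pi_train: float, pi_target: float) -> float:
    """Apply the offset to a score produced under the training prior."""
    return sigmoid(logit(p_s) + prior_shift_offset(pi_train, pi_target))


breakeven = 2.0 / (0.30 * 150.0)        # from the part (d) cost figures
p_at_2pct = rescore(0.60, 0.25, 0.02)   # about 0.084
p_at_1pct = rescore(0.60, 0.25, 0.01)   # about 0.043

print(f"{p_at_2pct:.4f} {p_at_1pct:.4f} {breakeven:.4f}")
```

With these numbers, the same raw score of 0.60 clears the breakeven at 2% prevalence but falls just below it at 1%, which is why the offset needs to track the monitored default rate. If drift goes beyond the prior (covariate or concept shift), an offset alone is unlikely to be enough and recalibration or retraining on recent data would be the safer route.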

f) Fairness and operations: Name two fairness checks (e.g., equal opportunity across age groups where legally permitted) and one operational guardrail to prevent over-throttling credit limits due to model uncertainty. Explain how sampling choices can bias these checks if not corrected.
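One way sampling can distort a fairness check in part (f): precision (used in predictive parity comparisons) computed on the downsampled set overstates the production value, because false positives are undercounted. The sketch below assumes negatives were downsampled uniformly at random to move prevalence from π = 0.02 to π_s = 0.25, and restores their weight with the inverse keep rate; the function names and toy labels are illustrative. If the keep rate differed by group, a per-group weight would be needed instead.

```python
def negative_keep_rate(pi: float, pi_s: float) -> float:
    """Fraction of negatives retained so that prevalence moves from pi to pi_s."""
    return (pi * (1.0 - pi_s)) / (pi_s * (1.0 - pi))


def weighted_precision(y_true, y_pred, neg_weight: float = 1.0) -> float:
    """Precision with each negative counted neg_weight times."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    denom = tp + neg_weight * fp
    return tp / denom if denom else float("nan")


keep = negative_keep_rate(pi=0.02, pi_s=0.25)  # 3/49, about 6% of negatives kept
neg_weight = 1.0 / keep                        # each kept negative stands for ~16.3

# Toy group: labels and flag decisions observed on the downsampled data.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]

naive = weighted_precision(y_true, y_pred)                 # on sampled data
corrected = weighted_precision(y_true, y_pred, neg_weight) # population estimate
print(f"{naive:.3f} {corrected:.3f}")
```

True positive rate (equal opportunity) is computed on positives only, so uniform negative downsampling leaves it unchanged provided the decision threshold is applied to corrected probabilities; precision, selection rate, and calibration by group are the checks that need the reweighting.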
