
Explain AUC, imbalance, losses, and networks

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a candidate's understanding of imbalanced classification and regression concepts, including ROC/PR curves and AUC, prevalence effects on metrics, loss functions (MSE vs MAE) and their gradients, and neural network training strategies for calibration and recall.

  • medium
  • Boston Consulting Group
  • Machine Learning
  • Data Scientist


Interview Round: Take-home Project



Oct 13, 2025, 9:49 PM

Imbalanced Classification & Regression: ROC/PR, Losses, and Training Strategies

You are evaluating a binary classifier and a regression head in a machine learning take-home. Answer all parts concisely but show your steps where calculations are requested.

A) ROC Curve and AUC from Toy Scores

Given scores for 5 positives and 5 negatives, sweep the decision threshold from +∞ down to −∞. At equal scores (if any), break ties by ranking positives above negatives.

  • Positives: P1 = 0.99, P2 = 0.80, P3 = 0.60, P4 = 0.40, P5 = 0.10
  • Negatives: N1 = 0.95, N2 = 0.70, N3 = 0.55, N4 = 0.30, N5 = 0.05

Tasks:

  1. List the ROC points (FPR, TPR) encountered as you lower the threshold.
  2. Compute ROC-AUC using the trapezoidal rule; show the segment-by-segment calculation.
  3. Interpret the AUC as the probability a random positive ranks above a random negative.
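A hand calculation for Part A can be checked with a short Python sketch like the one below. It sweeps the threshold by walking the scores in descending order (positives first on ties, as the prompt specifies), records each (FPR, TPR) point, and integrates with the trapezoidal rule. The helper names `roc_points`, `trapezoid_auc`, and `pairwise_auc` are illustrative choices, not part of the original question.

```python
positives = [0.99, 0.80, 0.60, 0.40, 0.10]
negatives = [0.95, 0.70, 0.55, 0.30, 0.05]

# Label 1 = positive, 0 = negative. Sort by score descending;
# at equal scores the positive comes first (the prompt's tie rule).
ranked = sorted(
    [(s, 1) for s in positives] + [(s, 0) for s in negatives],
    key=lambda t: (-t[0], -t[1]),
)


def roc_points(ranked, n_pos, n_neg):
    """Return (FPR, TPR) points as the threshold drops past each score."""
    tp = fp = 0
    points = [(0.0, 0.0)]  # threshold = +infinity: nothing predicted positive
    for _, label in ranked:
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def trapezoid_auc(points):
    """Sum the trapezoid areas between consecutive ROC points."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


def pairwise_auc(pos, neg):
    """Fraction of (positive, negative) pairs where the positive scores higher.

    Ties count as half a win; there are no ties in this toy data.
    """
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


points = roc_points(ranked, len(positives), len(negatives))
auc_trap = trapezoid_auc(points)
auc_pair = pairwise_auc(positives, negatives)

for fpr, tpr in points:
    print(f"FPR={fpr:.1f}  TPR={tpr:.1f}")
print(round(auc_trap, 6), auc_pair)  # both come out to 0.6
```

Because the two classes alternate in the ranking, the curve is a staircase: five horizontal steps of width 0.2 at heights 0.2, 0.4, 0.6, 0.8, and 1.0, giving 0.2 × 3.0 = 0.6. The pairwise count agrees: the positives beat 5, 4, 3, 2, and 1 negatives respectively, so 15 of 25 pairs are correctly ordered.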

B) Metrics Under 1% Prevalence

With prevalence = 1% (positives are rare):

  1. Explain why overall accuracy can be misleading.
  2. Propose two better metrics for model selection.
  3. State when you would prefer ROC-AUC vs PR-AUC.
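To make the accuracy problem in Part B concrete, the sketch below compares two classifiers on a population with 1% prevalence. All counts are hypothetical and chosen only for illustration: a population of 10,000 with 100 positives, a trivial "always negative" classifier, and an assumed model with 80% recall and a 5% false positive rate. The `metrics` helper is likewise illustrative, and it reports precision as 0.0 when nothing is predicted positive, which is a convention rather than a standard.

```python
def metrics(tp, fp, fn, tn):
    """Accuracy, precision, and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    predicted_pos = tp + fp
    return {
        "accuracy": (tp + tn) / total,
        # Convention used here: precision is 0.0 when nothing is flagged.
        "precision": tp / predicted_pos if predicted_pos else 0.0,
        "recall": tp / (tp + fn),
    }


n_pos, n_neg = 100, 9_900  # hypothetical population, 1% prevalence

# Trivial classifier: predict negative for every example.
always_negative = metrics(tp=0, fp=0, fn=n_pos, tn=n_neg)

# Assumed model: 80% recall, 5% false positive rate (495 of 9,900 negatives).
model = metrics(tp=80, fp=495, fn=20, tn=9_405)

print(always_negative)
print(model)
```

Under these assumed numbers the trivial classifier scores 99% accuracy with zero recall, while the model that finds 80 of the 100 positives scores lower accuracy (94.85%) and a precision of only about 14%. The 5% false positive rate looks small on a ROC curve, but it translates into 495 false alarms against 80 true hits, which is what precision-based metrics expose.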

C) MSE vs MAE as Regression Losses

For a regression head:

  1. Write each loss and derive the gradient with respect to the prediction.
  2. Explain robustness to outliers and optimization behavior.
  3. Give one practical scenario where each is preferable.
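For Part C, the gradients can be illustrated numerically. Assuming the mean-reduced forms MSE = (1/n) Σ (ŷᵢ − yᵢ)² and MAE = (1/n) Σ |ŷᵢ − yᵢ|, the per-prediction gradients are (2/n)(ŷᵢ − yᵢ) and (1/n)·sign(ŷᵢ − yᵢ) respectively. The sketch below evaluates both on made-up data with one outlier target; the function names and the data are assumptions for illustration only.

```python
def sign(x):
    """Return -1, 0, or 1. At x == 0 the MAE gradient is undefined;
    0 is one valid subgradient choice."""
    return (x > 0) - (x < 0)


def mse_grad(y_pred, y_true):
    """d/d(y_pred_i) of mean((y_pred - y_true)^2) = 2 * residual / n."""
    n = len(y_true)
    return [2.0 * (p - t) / n for p, t in zip(y_pred, y_true)]


def mae_grad(y_pred, y_true):
    """d/d(y_pred_i) of mean(|y_pred - y_true|) = sign(residual) / n."""
    n = len(y_true)
    return [sign(p - t) / n for p, t in zip(y_pred, y_true)]


# Made-up data: the last target is an outlier.
y_true = [1.0, 2.0, 3.0, 100.0]
y_pred = [1.5, 2.5, 2.5, 3.0]

g_mse = mse_grad(y_pred, y_true)
g_mae = mae_grad(y_pred, y_true)

print(g_mse)  # [0.25, 0.25, -0.25, -48.5]
print(g_mae)  # [0.25, 0.25, -0.25, -0.25]
```

The MSE gradient grows linearly with the residual, so the single outlier contributes a gradient nearly 200 times larger than any other point and dominates the update. The MAE gradient has constant magnitude 1/n regardless of residual size, which is the source of its robustness, and also the reason it does not shrink as predictions approach the target.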

D) Improving an Imbalanced Binary Classifier (Neural Network)

On the same 1% prevalence task, propose two concrete architecture/training changes (e.g., focal loss with typical γ, α; class weighting; positive down/up-sampling; thresholding strategy). For each, discuss likely effects on calibration and on recall.
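Two of the options named in Part D, focal loss and class weighting, can be sketched per example in plain Python. The focal loss below follows the commonly cited form FL = −α_t (1 − p_t)^γ log(p_t), with γ = 2 and α = 0.25 as the often-quoted defaults; the function names, the `pos_weight` parameter, and the clipping constant `eps` are assumptions made for this illustration, not a reference implementation from any library.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Per-example focal loss for predicted probability p and label y in {0, 1}."""
    p_t = p if y == 1 else 1.0 - p               # probability given to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha   # class-dependent weight
    p_t = min(max(p_t, eps), 1.0)                # avoid log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def weighted_bce(p, y, pos_weight=1.0, eps=1e-12):
    """Per-example binary cross-entropy with an extra weight on positives."""
    p = min(max(p, eps), 1.0 - eps)
    return -(pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))


# An easy negative (p = 0.01, y = 0) is down-weighted heavily by focal loss.
easy_negative_focal = focal_loss(0.01, 0)
easy_negative_bce = weighted_bce(0.01, 0)

# A missed positive (p = 0.1, y = 1) costs more under a hypothetical pos_weight of 99.
missed_positive_plain = weighted_bce(0.1, 1)
missed_positive_weighted = weighted_bce(0.1, 1, pos_weight=99.0)

print(easy_negative_focal, easy_negative_bce)
print(missed_positive_plain, missed_positive_weighted)
```

With γ = 0 and α = 0.5 the focal loss reduces to half the ordinary cross-entropy, which is a useful sanity check. Both changes push the model toward higher recall on the rare class, but because they alter the effective class balance or the shape of the loss, the raw output probabilities generally stop matching true frequencies, so a recalibration step on held-out data at the real prevalence is usually needed before the scores are used as probabilities.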
