Explain AUC, activations, ensembles, and imbalance

Last updated: Mar 29, 2026

Quick Overview

This question tests model evaluation metrics (ROC AUC and Average Precision), handling of class imbalance, the choice of output activations and loss functions, robustness to outliers (MSE vs MAE), ensemble methods, and overfitting diagnostics.

Company: Boston Consulting Group

Role: Data Scientist

Category: Machine Learning

Difficulty: medium

Interview Round: Take-home Project

Oct 13, 2025, 9:49 PM

Machine Learning Metrics and Modeling Choices — Multi-part

You are given model scores and binary labels for a small dataset and asked to compute ROC AUC manually, then answer modeling and evaluation questions.

Given:

  • Scores s = [0.10, 0.40, 0.35, 0.80, 0.60]
  • Labels y = [0, 1, 0, 1, 0]

Answer all sub-questions precisely:

1) AUC / Ranking

  1. Compute ROC AUC exactly via pairwise positive–negative comparisons (no libraries). Treat ties as 0.5 if any.
  2. List the ROC points (FPR, TPR) by thresholding from highest score to lowest and compute the AUC by trapezoids; both methods should match.
  3. With extreme class imbalance (1% positives), explain how you would interpret AUC vs Average Precision (AP) and which you would favor.
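
The two AUC computations asked for above can be checked with a short script. This is a minimal sketch, not a reference solution: the function names `pairwise_auc`, `roc_points`, and `trapezoid_auc` are hypothetical, and exact fractions are used only so the two results can be compared without floating-point noise.

```python
from fractions import Fraction

scores = [0.10, 0.40, 0.35, 0.80, 0.60]
labels = [0, 1, 0, 1, 0]

def pairwise_auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = Fraction(0)
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1
            elif p == n:
                wins += Fraction(1, 2)
    return wins / (len(pos) * len(neg))

def roc_points(scores, labels):
    """(FPR, TPR) points obtained by lowering the threshold from the top score."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    tp = fp = 0
    points = [(Fraction(0), Fraction(0))]
    i = 0
    while i < len(ranked):
        # Consume every item sharing this score so ties form one diagonal segment.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            tp += ranked[j][1]
            fp += 1 - ranked[j][1]
            j += 1
        points.append((Fraction(fp, n_neg), Fraction(tp, n_pos)))
        i = j
    return points

def trapezoid_auc(points):
    """Area under the piecewise-linear ROC curve."""
    area = Fraction(0)
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

print(pairwise_auc(scores, labels))               # 5/6
print(trapezoid_auc(roc_points(scores, labels)))  # 5/6
```

Five of the six positive-negative pairs are ordered correctly (only the positive at 0.40 versus the negative at 0.60 is inverted), so both methods give 5/6. On the imbalance sub-question, a commonly cited point is that FPR is normalized by the number of negatives, so with 1% positives a ROC AUC can look strong while precision among the top-ranked items remains low; AP reflects that more directly.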

2) Output Activations and Losses

For each scenario, choose an output-layer activation and loss, and justify:

  • (a) Single-label multi-class (K = 7)
  • (b) Multi-label (K = 7)
  • (c) Bounded regression in [0, 1]
  • (d) Unbounded regression with outliers
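
One common set of pairings for the four scenarios above is softmax with cross-entropy for (a), independent sigmoids with binary cross-entropy for (b), a sigmoid output with binary cross-entropy (or MSE) for (c), and a linear output with a Huber loss for (d). The sketch below illustrates those building blocks in plain Python; the function names and the `delta=1.0` default are assumptions chosen for illustration, and other pairings can be justified.

```python
import math

def softmax(logits):
    """(a) Single-label multi-class: outputs are positive and sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, target_index):
    """Negative log-likelihood of the true class."""
    return -math.log(probs[target_index])

def sigmoid(z):
    """(b), (c): squashes a logit into (0, 1); written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def binary_cross_entropy(p, t, eps=1e-12):
    """Per-label loss for (b); also accepts soft targets in [0, 1] for (c)."""
    p = min(max(p, eps), 1.0 - eps)
    return -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))

def huber(residual, delta=1.0):
    """(d): quadratic near zero, linear in the tails, so outliers pull less."""
    a = abs(residual)
    return 0.5 * a * a if a <= delta else delta * (a - 0.5 * delta)

# K = 7 classes with uninformative logits: uniform probabilities, loss = ln 7.
probs = softmax([0.0] * 7)
print(cross_entropy(probs, 3))

# K = 7 independent labels: each logit gets its own sigmoid and its own loss term.
logits = [2.0, -1.0, 0.5, -3.0, 0.0, 1.5, -0.5]
targets = [1, 0, 1, 0, 0, 1, 0]
print(sum(binary_cross_entropy(sigmoid(z), t) for z, t in zip(logits, targets)))
```

The key contrast between (a) and (b) is that softmax couples the outputs (raising one probability lowers the others), whereas per-label sigmoids let any subset of the seven labels be active at once.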

Also discuss vanishing gradients for sigmoid/tanh and why leaky-ReLU or GELU might help in hidden layers.
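
The vanishing-gradient effect can be illustrated by multiplying activation derivatives across layers. This is a simplified sketch that assumes every layer sees the same pre-activation value and ignores the weight matrices, which also scale the gradient in a real network; the helper `chained_grad` is hypothetical.

```python
import math

def sigmoid_grad(z):
    s = 1.0 / (1.0 + math.exp(-z))
    return s * (1.0 - s)  # never exceeds 0.25

def tanh_grad(z):
    return 1.0 - math.tanh(z) ** 2  # never exceeds 1, decays quickly away from 0

def leaky_relu_grad(z, alpha=0.01):
    return 1.0 if z > 0 else alpha  # constant on each side, never exactly zero

def gelu_grad(z):
    # GELU(z) = z * Phi(z), so the derivative is Phi(z) + z * phi(z).
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return cdf + z * pdf

def chained_grad(grad_fn, z, depth):
    """Product of activation derivatives over `depth` layers (weights ignored)."""
    return grad_fn(z) ** depth

for name, fn in [("sigmoid", sigmoid_grad), ("tanh", tanh_grad),
                 ("leaky-ReLU", leaky_relu_grad), ("GELU", gelu_grad)]:
    print(name, chained_grad(fn, 2.0, 10))
```

For a moderately large positive pre-activation, the sigmoid and tanh products shrink by many orders of magnitude over ten layers, while leaky-ReLU keeps a derivative of 1 and GELU stays close to 1.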

3) MSE vs MAE

Explain optimization and robustness differences: gradients, influence of outliers, and mean vs median optimality.
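
The mean-versus-median point can be checked numerically on a small sample with one outlier. The data below are made up for illustration, and the sketch considers only the simplest case of fitting a single constant prediction `c`.

```python
import statistics

y = [1.0, 2.0, 3.0, 4.0, 100.0]  # hypothetical targets; 100.0 is the outlier

def mse(c, ys):
    return sum((v - c) ** 2 for v in ys) / len(ys)

def mae(c, ys):
    return sum(abs(v - c) for v in ys) / len(ys)

def mse_grad(c, ys):
    # Each sample contributes 2 * (c - v): proportional to its residual.
    return sum(2.0 * (c - v) for v in ys) / len(ys)

def mae_grad(c, ys):
    # Each sample contributes sign(c - v): magnitude 1 regardless of distance.
    return sum((c > v) - (c < v) for v in ys) / len(ys)

mean, median = statistics.fmean(y), statistics.median(y)
print(mean, median)                    # 22.0 3.0
print(mse(mean, y), mse(median, y))    # MSE is lower at the mean
print(mae(mean, y), mae(median, y))    # MAE is lower at the median
print(mse_grad(mean, y), mae_grad(median, y))  # both gradients vanish
```

The outlier drags the MSE-optimal constant to 22 while the MAE-optimal constant stays at 3. The gradients show why: under MSE the outlier's pull grows with its residual, whereas under MAE every sample pulls with the same unit strength, at the cost of a gradient that is not defined at zero residual and does not shrink near the optimum.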

4) Ensembles

Contrast bagging vs boosting in terms of bias/variance and when you’d choose each for noisy data.
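
The structural difference between the two families can be sketched in a few lines: bagging averages models fit independently on bootstrap resamples (mainly reducing variance), while boosting fits each new model to the residuals of the current ensemble (mainly reducing bias, and able to chase noise if run too long). The code below is an illustrative toy, not a production method: it uses 1-D decision stumps for brevity, whereas bagging is usually paired with deeper, low-bias trees, and all names, the synthetic data, and the `lr=0.5` learning rate are assumptions.

```python
import random

def fit_stump(xs, ys):
    """Least-squares stump on 1-D inputs: (threshold, left_mean, right_mean)."""
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # all x identical: fall back to a constant predictor
        m = sum(ys) / len(ys)
        return (xs[0], m, m)
    return best[1:]

def predict_stump(stump, x):
    t, lm, rm = stump
    return lm if x <= t else rm

def bagging(xs, ys, n_models=25, seed=0):
    """Fit each stump on an independent bootstrap resample."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def predict_bagged(models, x):
    return sum(predict_stump(m, x) for m in models) / len(models)

def boosting(xs, ys, n_rounds=25, lr=0.5):
    """Fit each stump to the current residuals (squared-loss gradient boosting)."""
    preds = [0.0] * len(xs)
    models, train_mse = [], []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        models.append(stump)
        preds = [p + lr * predict_stump(stump, x) for p, x in zip(preds, xs)]
        train_mse.append(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))
    return models, train_mse

# Synthetic noisy data: y = x^2 plus Gaussian noise.
noise = random.Random(1)
xs = [i / 10 for i in range(20)]
ys = [x * x + noise.gauss(0, 0.3) for x in xs]

bag = bagging(xs, ys)
boosted, train_mse = boosting(xs, ys)
print(predict_bagged(bag, 1.0))
print(train_mse[0], train_mse[-1])
```

Boosting's training error keeps falling with more rounds, which is exactly why it needs a learning rate, a round limit, or early stopping on noisy targets; the bagged prediction is only an average, so adding models cannot make it fit the training noise more closely.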

5) Overfitting

Name two concrete, testable diagnostics (with plots/metrics) and two mitigation tactics that won’t leak validation information.
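
Two diagnostics that are easy to make concrete are the train/validation learning curve and the generalization gap (validation loss minus training loss) per epoch. The sketch below computes both from loss histories, along with an early-stopping epoch and a standardizer whose statistics come from the training split only. The loss values are invented for illustration, the function names are hypothetical, and because early stopping uses the validation set for model selection, a separate held-out test set is assumed for the final estimate.

```python
import statistics

def generalization_gap(train_losses, val_losses):
    """Per-epoch validation loss minus training loss; a widening gap suggests overfitting."""
    return [v - t for t, v in zip(train_losses, val_losses)]

def early_stopping_epoch(val_losses, patience=2):
    """Index of the best epoch seen before `patience` epochs pass without improvement."""
    best_epoch, best_loss, waited = 0, float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_epoch, best_loss, waited = epoch, loss, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_epoch

def fit_standardizer(train_values):
    """Compute mean and standard deviation on the training split only, then reuse them."""
    mu = statistics.fmean(train_values)
    sigma = statistics.pstdev(train_values) or 1.0  # guard against zero spread
    return lambda v: (v - mu) / sigma

# Hypothetical loss histories: training keeps improving, validation turns upward.
train_losses = [1.00, 0.70, 0.50, 0.35, 0.25, 0.18]
val_losses = [1.05, 0.80, 0.70, 0.72, 0.78, 0.90]

print(generalization_gap(train_losses, val_losses))
print(early_stopping_epoch(val_losses))  # 2

scale = fit_standardizer([1.0, 2.0, 3.0])  # fit on training values only
print(scale(2.0), scale(10.0))             # apply unchanged to validation values
```

Fitting the standardizer on the training split and only applying it to validation data is the pattern that avoids leakage; the same rule applies to any learned preprocessing step, such as imputation or feature selection.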
