You must deploy a classifier on an imbalanced dataset (50,000 samples; ~5% positives). The product requires test-set precision ≥ 0.95 for the positive class. 1) Train any probabilistic model; on a held-out validation set, compute the precision–recall curve and, among thresholds with precision ≥ 0.95, select the probability threshold τ that maximizes recall (break ties by the lower threshold). 2) Fix τ and evaluate once on the untouched test set; report precision, recall, number of predicted positives, and the expected number of false positives if you flag 1,000 items. 3) If no τ on validation attains 0.95 precision, propose two actionable strategies (e.g., abstain/top-k policy, recalibration, different model or features) and explain their risks. 4) Provide sklearn-style code to search for τ (using precision_recall_curve or make_scorer with a custom threshold) without leaking test data. 5) Explain how class imbalance and calibration affect your ability to meet the precision constraint and why optimizing ROC AUC would be misleading here.

This question evaluates probabilistic classifier calibration, threshold selection to achieve a target precision, handling severe class imbalance, and precision–recall based evaluation metrics.


What difficulty level is this interview question?

This is a medium difficulty Machine Learning question, commonly asked during Take-home Project rounds at Boston Consulting Group.

What role is this question designed for?

This question is commonly asked for Data Scientist candidates at Boston Consulting Group during technical interviews.

Achieve 0.95 precision via thresholding | Boston Consulting Group Interview Question

Deploying a High-Precision Classifier on an Imbalanced Dataset

You are given a binary classification problem with 50,000 samples and ~5% positives. The product requires test-set precision ≥ 0.95 for the positive class. Assume you can train any probabilistic classifier that outputs calibrated scores/probabilities for the positive class.

Tasks

Threshold selection on validation

Train any probabilistic model on a training set.
On a held-out validation set, compute the precision–recall (PR) curve.
Among thresholds whose precision ≥ 0.95, select the threshold τ that maximizes recall; if multiple thresholds yield the same recall, pick the smallest threshold.
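The selection rule above can be sketched in plain Python. This is a minimal illustration, not a reference solution: select_tau is a hypothetical helper name, and the toy precision/recall/threshold values are made up to show the tie-break. The arrays are assumed to be aligned the way precision_recall_curve returns them (thresholds increasing, with one extra trailing precision/recall point that has no threshold; zip drops it here).

```python
from typing import Optional, Sequence


def select_tau(precision: Sequence[float],
               recall: Sequence[float],
               thresholds: Sequence[float],
               target: float = 0.95) -> Optional[float]:
    """Among thresholds whose precision meets the target, return the one
    with the highest recall; ties on recall go to the smaller threshold.
    Returns None when no threshold qualifies."""
    best = None  # (recall, threshold) of the best candidate so far
    # zip stops at the shortest input, so a trailing precision/recall
    # point without a matching threshold is ignored.
    for p, r, t in zip(precision, recall, thresholds):
        if p < target:
            continue
        if best is None or r > best[0] or (r == best[0] and t < best[1]):
            best = (r, t)
    return None if best is None else best[1]


# Toy PR points (hypothetical numbers, not from any real model).
precision = [0.90, 0.95, 0.96, 1.00]
recall = [0.80, 0.60, 0.60, 0.20]
thresholds = [0.30, 0.55, 0.70, 0.90]

tau = select_tau(precision, recall, thresholds)
print(tau)
```

In this toy input, two thresholds (0.55 and 0.70) meet the precision target with the same recall, so the smaller one is chosen.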

Final evaluation on test

Fix τ from validation.
Evaluate once on the untouched test set and report:
- Precision and recall (positive class)
- Number of predicted positives
- Expected number of false positives if you flag 1,000 items
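A minimal sketch of the report, assuming labels are 0/1 and an item is flagged when its score is ≥ τ (the same convention precision_recall_curve uses). The helper name evaluate_at_threshold and the toy data are hypothetical. The expected number of false positives among 1,000 flagged items is taken as 1000 × (1 − precision), which assumes the flagged items come from the same distribution as the test set.

```python
from typing import Dict, Sequence


def evaluate_at_threshold(y_true: Sequence[int],
                          scores: Sequence[float],
                          tau: float) -> Dict[str, float]:
    """Confusion counts and derived metrics at a fixed threshold tau."""
    tp = fp = fn = 0
    for y, s in zip(y_true, scores):
        flagged = s >= tau
        if flagged and y == 1:
            tp += 1
        elif flagged:
            fp += 1
        elif y == 1:
            fn += 1
    n_pred = tp + fp
    precision = tp / n_pred if n_pred else float("nan")
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    return {
        "precision": precision,
        "recall": recall,
        "predicted_positives": n_pred,
        # Expected false positives if 1,000 items are flagged.
        "expected_fp_per_1000": 1000 * (1 - precision),
    }


# Toy test set (hypothetical values).
y_test = [1, 1, 1, 0, 0, 0, 1, 0]
test_scores = [0.9, 0.8, 0.7, 0.75, 0.2, 0.1, 0.4, 0.3]
report = evaluate_at_threshold(y_test, test_scores, tau=0.7)
print(report)
```

At the required precision of 0.95, the same formula gives at most 50 expected false positives per 1,000 flagged items.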

If no τ on validation reaches precision ≥ 0.95

Propose two actionable strategies (e.g., abstain/top-k policy, recalibration, new model/features) and explain risks.
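One possible reading of the abstain option is a three-way decision: auto-flag only above a strict cutoff, route a middle band to manual review, and pass the rest. The sketch below is illustrative only; triage, tau_flag and tau_review are hypothetical names, and the cutoff values are arbitrary.

```python
from typing import List, Sequence


def triage(scores: Sequence[float],
           tau_flag: float,
           tau_review: float) -> List[str]:
    """Three-way decision: 'flag' automatically, send to 'review', or 'pass'."""
    if tau_review > tau_flag:
        raise ValueError("tau_review must not exceed tau_flag")
    decisions = []
    for s in scores:
        if s >= tau_flag:
            decisions.append("flag")
        elif s >= tau_review:
            decisions.append("review")
        else:
            decisions.append("pass")
    return decisions


decisions = triage([0.99, 0.80, 0.40, 0.05], tau_flag=0.97, tau_review=0.50)
print(decisions)
```

The risk with this design is that the review band consumes reviewer time, and the precision of the overall system then depends on reviewer accuracy, which this sketch does not model.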

Provide sklearn-style code

Show how to search τ using precision_recall_curve (or using make_scorer with a custom threshold) without leaking test data.
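A hedged end-to-end sketch of a leak-free search follows. The synthetic data from make_classification, the LogisticRegression model, and the 60/20/20 split are assumptions chosen for illustration; the question does not prescribe them. The test set is touched exactly once, after τ is fixed on validation.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve, precision_score, recall_score
from sklearn.model_selection import train_test_split

TARGET = 0.95

# Synthetic stand-in for the real data: 50,000 samples, ~5% positives.
X, y = make_classification(n_samples=50_000, n_features=20, n_informative=8,
                           weights=[0.95, 0.05], class_sep=2.0, random_state=0)

# Stratified 60/20/20 split into train / validation / test.
X_tv, X_test, y_tv, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X_tv, y_tv, test_size=0.25, stratify=y_tv, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Search tau on validation only.
val_scores = model.predict_proba(X_val)[:, 1]
precision, recall, thresholds = precision_recall_curve(y_val, val_scores)
precision, recall = precision[:-1], recall[:-1]  # drop the point with no threshold

ok = precision >= TARGET
if ok.any():
    best_recall = recall[ok].max()
    # Ties on recall are broken by the smallest threshold.
    tau = float(thresholds[ok & (recall == best_recall)].min())

    # Single evaluation on the untouched test set.
    test_pred = model.predict_proba(X_test)[:, 1] >= tau
    test_precision = precision_score(y_test, test_pred, zero_division=0)
    test_recall = recall_score(y_test, test_pred)
    print(f"tau={tau:.4f} precision={test_precision:.3f} "
          f"recall={test_recall:.3f} predicted_positives={int(test_pred.sum())}")
else:
    tau = None
    print("No validation threshold reaches the precision target.")
```

Test precision can fall below the validation figure because τ was tuned on a finite validation sample, so a margin above 0.95 on validation may be worth considering.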

Explain considerations

Explain how class imbalance and calibration affect your ability to meet the precision constraint, and why optimizing ROC AUC would be misleading here.
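The effect of class imbalance can be made concrete with a short derivation. The symbols are introduced here for illustration: π is the positive prevalence, TPR and FPR are the true and false positive rates at a given threshold, and P is the required precision.

```latex
\[
\text{precision} = \frac{\pi\,\mathrm{TPR}}{\pi\,\mathrm{TPR} + (1-\pi)\,\mathrm{FPR}}
\]
\[
\text{precision} \ge P
\iff
\mathrm{FPR} \le \frac{\pi}{1-\pi}\cdot\frac{1-P}{P}\cdot\mathrm{TPR}
\]
\[
\pi = 0.05,\; P = 0.95:\qquad
\mathrm{FPR} \le \frac{0.05}{0.95}\cdot\frac{0.05}{0.95}\cdot\mathrm{TPR}
\approx 0.0028\,\mathrm{TPR}
\]
```

Under these numbers, only operating points with an FPR below roughly 0.3% can satisfy the constraint. ROC AUC averages performance over the full FPR range from 0 to 1, so two models with similar ROC AUC can differ sharply in that narrow region.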

Assumptions: Use a proper train/validation/test split with stratification; select τ only on validation; test set remains untouched until final evaluation.

