PracHub

Optimize precision–recall under class imbalance

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a data scientist's competency in model evaluation under severe class imbalance, covering precision, recall, F1, precision@k, threshold selection, calibration, and interpretation of PR versus ROC curves in the Machine Learning domain.


Company: Amazon

Role: Data Scientist

Category: Machine Learning

Difficulty: Medium

Interview Round: Technical Screen

You have extreme class imbalance (positive rate ~1%). You score 12 examples as follows:

id  true_label  score
A   1           0.92
B   0           0.90
C   0           0.88
D   0           0.70
E   1           0.62
F   0           0.58
G   0           0.55
H   0           0.54
I   1           0.53
J   0           0.50
K   0           0.20
L   0           0.10

Tasks:

1) Compute precision, recall, and F1 at thresholds of 0.90, 0.60, and 0.50.
2) Which threshold maximizes F1 here, and why might business costs still argue for a different threshold?
3) Explain when PR curves are more informative than ROC curves and what AUPRC vs AUROC would indicate in this setting.
4) If you must deliver exactly top-k alerts (k=2), compute precision@k and recall@k and discuss how calibration affects thresholding.
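The arithmetic behind tasks 1, 2, and 4 can be checked with a short script. The following is a minimal sketch in plain Python, not a reference solution: the helper names (`prf`, `at_threshold`, `at_top_k`, `auroc`, `average_precision`) are hypothetical, and it assumes that a score equal to the threshold counts as a positive prediction, which the question does not specify. The last two helpers give rough ranking summaries relevant to task 3.

```python
# Scored examples from the question: (id, true_label, score).
DATA = [
    ("A", 1, 0.92), ("B", 0, 0.90), ("C", 0, 0.88), ("D", 0, 0.70),
    ("E", 1, 0.62), ("F", 0, 0.58), ("G", 0, 0.55), ("H", 0, 0.54),
    ("I", 1, 0.53), ("J", 0, 0.50), ("K", 0, 0.20), ("L", 0, 0.10),
]


def prf(flagged):
    """Precision, recall, F1 for a list of flagged (id, label, score) rows."""
    total_pos = sum(label for _, label, _ in DATA)
    tp = sum(label for _, label, _ in flagged)
    precision = tp / len(flagged) if flagged else 0.0
    recall = tp / total_pos
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1


def at_threshold(threshold):
    # Assumption: score >= threshold is predicted positive (ties included).
    return prf([row for row in DATA if row[2] >= threshold])


def at_top_k(k):
    # Flag exactly the k highest-scoring examples, regardless of score value.
    ranked = sorted(DATA, key=lambda row: row[2], reverse=True)
    return prf(ranked[:k])


def auroc():
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for _, y, s in DATA if y == 1]
    neg = [s for _, y, s in DATA if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision():
    """Mean of precision at the rank of each positive (one common AUPRC estimate)."""
    ranked = sorted(DATA, key=lambda row: row[2], reverse=True)
    hits, precisions = 0, []
    for rank, (_, label, _) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions)


for t in (0.90, 0.60, 0.50):
    p, r, f = at_threshold(t)
    print(f"threshold={t:.2f}  precision={p:.2f}  recall={r:.2f}  F1={f:.2f}")

p, r, _ = at_top_k(2)
print(f"k=2  precision@k={p:.2f}  recall@k={r:.2f}")
print(f"AUROC={auroc():.3f}  average precision={average_precision():.3f}")
```

Under the tie convention above, precision/recall/F1 come out to 0.50/0.33/0.40 at 0.90, 0.40/0.67/0.50 at 0.60, and 0.30/1.00/0.46 at 0.50, so 0.60 gives the highest F1 of the three. For k=2 the alerts are A and B, giving precision@2 = 0.50 and recall@2 = 0.33. Note that this 12-example sample contains 3 positives (25%), far above the stated ~1% base rate, so precision on the real population would likely be much lower at the same thresholds.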


Oct 13, 2025, 9:49 PM
