This question evaluates understanding of confusion matrix components and the mapping of Type I/Type II errors to false positives/negatives. It also covers the selection and interpretation of classification metrics (accuracy, precision/recall, F1, ROC-AUC, PR-AUC, calibration), threshold choice under asymmetric costs, and practical pitfalls such as class imbalance, data leakage, shifting base rates, and miscalibration. It is commonly asked in Statistics & Math interviews for Data Scientist roles to assess the ability to translate model performance into business impact, testing both conceptual understanding and practical application of evaluation and cost-sensitive decision-making.
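As a refresher for the metrics named above, the core quantities all derive from the four confusion-matrix counts. A minimal sketch, using hypothetical counts (not the matrix given in the question below):

```python
def confusion_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Derive standard classification metrics from confusion-matrix counts.

    tp/fp/fn/tn = true/false positives/negatives; fp is a Type I error,
    fn is a Type II error.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Precision: of everything flagged positive, how much was actually positive?
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall (TPR): of all actual positives, how many did we catch?
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical imbalanced example: 120 actual positives out of 1000 cases.
m = confusion_metrics(tp=80, fp=20, fn=40, tn=860)
print(m)  # accuracy 0.94 looks strong, but recall is only ~0.67
```

Note how the illustrative numbers show the classic imbalance pitfall: high accuracy can coexist with a model that misses a third of the positives, which is why precision/recall and cost-weighted thresholds matter.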
You built a binary classifier (e.g., fraud detection, churn risk, medical screening, spam filtering).
You are given a confusion matrix on a validation set: