What difficulty level is this interview question?

This is a hard-difficulty Machine Learning question, commonly asked during Technical Screen rounds at Microsoft.

What role is this question designed for?

This question is commonly asked of Data Scientist candidates at Microsoft during technical interviews.

Design a model for imbalanced conversions | Microsoft Interview Question

Quick Overview

This question evaluates a data scientist's ability to design and validate end-to-end predictive models for imbalanced binary outcomes, encompassing feature engineering, class imbalance handling, probability calibration, thresholding for budgeted targeting, validation, monitoring, and interpretability.

Predicting Purchase Propensity After a Campaign (5% Positives)

You previously ran a marketing campaign to 10,000 customers and observed 500 purchases (5% positive rate). You now want to build a model to score customers for the next campaign so you can target those most likely to purchase under a fixed budget.

Design an end-to-end approach that includes:

Baseline Model and Features

Start with logistic regression. Describe:
- Feature engineering: numeric handling, categorical encoding, scaling, missing values, interactions.
- Data splitting strategy (temporal/stratified), pipelines, and prevention of leakage.
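
A minimal, dependency-free sketch of the leakage point above: split first (stratified on the label), then learn imputation and scaling parameters from the training rows only and reuse them on the held-out rows. The function names, the `spend` feature, and the toy data are hypothetical; in practice a library pipeline would normally play this role.

```python
import random
import statistics

def stratified_split(labels, test_frac=0.2, seed=0):
    """Return (train_idx, test_idx) with a similar positive rate in both parts."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_frac)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)

def fit_numeric_preprocessor(train_values):
    """Learn the median (imputation) and mean/std (scaling) from TRAIN rows only."""
    observed = [v for v in train_values if v is not None]
    median = statistics.median(observed)
    filled = [median if v is None else v for v in train_values]
    mean = statistics.fmean(filled)
    std = statistics.pstdev(filled) or 1.0  # fall back to 1.0 for a constant column
    return {"median": median, "mean": mean, "std": std}

def transform_numeric(values, params):
    """Apply train-fitted parameters to any split without refitting."""
    return [
        ((params["median"] if v is None else v) - params["mean"]) / params["std"]
        for v in values
    ]

# Hypothetical toy data: 5% positives and one numeric feature with ~10% missing values.
labels = [1] * 50 + [0] * 950
rng = random.Random(42)
spend = [None if rng.random() < 0.1 else rng.uniform(0, 100) for _ in labels]

train_idx, test_idx = stratified_split(labels, test_frac=0.2, seed=0)
params = fit_numeric_preprocessor([spend[i] for i in train_idx])
train_scaled = transform_numeric([spend[i] for i in train_idx], params)
test_scaled = transform_numeric([spend[i] for i in test_idx], params)
```

If campaign dates are available, a temporal split (train on earlier rows, test on later ones) would replace the random shuffle, while the fit-on-train-only rule stays the same.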

Class Imbalance

Compare class_weight, random over/under-sampling, and SMOTE. State which metric(s) you’ll optimize and why (e.g., PR-AUC, recall at fixed precision, cost-sensitive loss).
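
As a rough illustration of two ideas in this part, the sketch below computes "balanced" class weights with the common inverse-frequency formula n / (n_classes * n_c) and a step-wise PR-AUC (average precision). It uses only the standard library, the function names are hypothetical, and tied scores are not given special handling.

```python
def balanced_class_weights(labels):
    """Inverse-frequency weights: n / (n_classes * count_of_class)."""
    n = len(labels)
    classes = sorted(set(labels))
    return {c: n / (len(classes) * labels.count(c)) for c in classes}

def average_precision(labels, scores):
    """Mean of precision@k over the ranks k at which a positive appears."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

# Campaign from the prompt: 500 purchases out of 10,000 customers.
weights = balanced_class_weights([1] * 500 + [0] * 9500)
# weights[1] is 10.0 and weights[0] is about 0.526

# Tiny ranking example: positives land at ranks 1 and 3.
ap = average_precision([1, 0, 1, 0, 0], [0.9, 0.8, 0.7, 0.2, 0.1])
# ap is (1/1 + 2/3) / 2, about 0.833
```

A scorer with no signal has an expected average precision near the positive rate (about 0.05 here). PR-AUC therefore exposes weak models more clearly than accuracy, which already reaches 95% by predicting "no purchase" for everyone.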

Thresholding, Calibration, and Budgeted Targeting

Explain the difference between ranking customers by score and applying a classification threshold, how you would calibrate probabilities (Platt scaling or isotonic regression), and how you would choose the top-N customers to target under a fixed budget.
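
One way to sketch budgeted targeting, assuming the scores are already calibrated probabilities and that a per-contact cost and a per-sale margin are known (all parameter names and numbers below are hypothetical): the budget caps how many customers can be contacted, ranking picks who they are, and calibration is what makes the break-even cutoff and the expected-sales figure meaningful.

```python
def select_targets(probs, budget, cost_per_contact, margin_per_sale):
    """Pick the highest-scoring customers that fit the budget and clear break-even.

    probs maps customer id -> calibrated purchase probability.
    Returns (chosen ids, expected sales, expected profit).
    """
    max_contacts = int(budget // cost_per_contact)   # the budget fixes N
    break_even = cost_per_contact / margin_per_sale  # contact only if p >= cost / margin
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    chosen = [(cid, p) for cid, p in ranked[:max_contacts] if p >= break_even]
    expected_sales = sum(p for _, p in chosen)
    expected_profit = expected_sales * margin_per_sale - len(chosen) * cost_per_contact
    return [cid for cid, _ in chosen], expected_sales, expected_profit

# Hypothetical scores for five customers.
probs = {"a": 0.30, "b": 0.12, "c": 0.08, "d": 0.03, "e": 0.01}

# Budget for 3 contacts: the top three by score are chosen.
ids_small, sales_small, profit_small = select_targets(probs, 6.0, 2.0, 40.0)

# Budget for 5 contacts: "d" and "e" are still skipped because they fall
# below the break-even probability of 2 / 40 = 0.05.
ids_large, sales_large, profit_large = select_targets(probs, 10.0, 2.0, 40.0)
```

Choosing the top N needs only a correct ordering. The break-even filter and the expected-profit estimate depend on the probabilities being well calibrated, which is why calibration matters even for a ranking task.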

Validation and Monitoring

Describe stratified cross-validation, how you will report confidence intervals, and how you will monitor post-deployment drift and business lift.
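
The sketch below illustrates two of the items above under simple assumptions: a percentile bootstrap confidence interval for top-decile lift, and a Population Stability Index (PSI) comparing a baseline score distribution with a later one. The data is synthetic, and the function names and bin edges are hypothetical.

```python
import math
import random

def top_decile_lift(labels, scores):
    """Conversion rate in the top 10% of scores divided by the overall rate."""
    k = max(1, len(labels) // 10)
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    base = sum(labels) / len(labels)
    if base == 0:
        return 0.0  # no positives in this sample
    return (sum(labels[i] for i in order[:k]) / k) / base

def bootstrap_ci(labels, scores, metric, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap: resample rows with replacement and recompute the metric."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([labels[i] for i in idx], [scores[i] for i in idx]))
    stats.sort()
    return stats[int((alpha / 2) * n_boot)], stats[int((1 - alpha / 2) * n_boot) - 1]

def psi(expected, actual, edges):
    """Population Stability Index over shared bin edges: sum((a - e) * ln(a / e))."""
    def shares(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            counts[sum(v > e for e in edges)] += 1
        return [max(c / len(values), 1e-6) for c in counts]  # floor avoids log(0)
    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

# Synthetic validation set: about 5% positives, which tend to score higher.
rng = random.Random(7)
labels = [1 if rng.random() < 0.05 else 0 for _ in range(2000)]
scores = [min(1.0, max(0.0, rng.gauss(0.6 if y else 0.3, 0.15))) for y in labels]
point = top_decile_lift(labels, scores)
lo, hi = bootstrap_ci(labels, scores, top_decile_lift)

# Drift check: the same scores against a copy shifted upward by 0.2.
edges = [0.25, 0.5, 0.75]
shifted = [min(1.0, s + 0.2) for s in scores]
psi_same = psi(scores, scores, edges)
psi_shifted = psi(scores, shifted, edges)
```

Reporting `point` together with `(lo, hi)` shows how much the lift estimate could move with only a few hundred positives. After deployment, the same `psi` check can be run per feature as well as on the scores, alongside a held-out control group to measure actual business lift.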

Feature Selection and Interpretability

List at least two feature selection methods (e.g., L1 penalty, mutual information, recursive feature elimination) and how you would guard against overfitting while preserving interpretability.
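
As a small illustration of one of the listed methods, the sketch below ranks discrete features by their mutual information with the label, using only the standard library. The feature names and toy values are hypothetical, and continuous features would need binning first.

```python
import math
from collections import Counter

def mutual_information(feature, labels):
    """Mutual information (in nats) between a discrete feature and the label."""
    n = len(labels)
    joint = Counter(zip(feature, labels))
    px, py = Counter(feature), Counter(labels)
    # sum over observed (x, y) of p(x, y) * log(p(x, y) / (p(x) * p(y)))
    return sum(
        (c / n) * math.log((c * n) / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )

def rank_features(features, labels):
    """Return feature names sorted from most to least informative."""
    scored = {name: mutual_information(col, labels) for name, col in features.items()}
    return sorted(scored, key=scored.get, reverse=True), scored

# Hypothetical toy data with 2 purchasers out of 8 customers.
labels = [1, 1, 0, 0, 0, 0, 0, 0]
features = {
    "opened_last_email": [1, 1, 0, 0, 0, 0, 0, 0],  # identical to the label
    "even_customer_id":  [0, 1, 0, 1, 0, 1, 0, 1],  # independent of the label
}
ranking, mi = rank_features(features, labels)
```

To avoid an optimistic bias, a ranking like this would be computed inside each training fold rather than on the full dataset, so the held-out fold never influences which features are kept.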
