PracHub

Design a model for imbalanced conversions

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a data scientist's ability to design and validate end-to-end predictive models for imbalanced binary outcomes, encompassing feature engineering, class imbalance handling, probability calibration, thresholding for budgeted targeting, validation, monitoring, and interpretability.

  • hard
  • Microsoft
  • Machine Learning
  • Data Scientist

Company: Microsoft

Role: Data Scientist

Category: Machine Learning

Difficulty: hard

Interview Round: Technical Screen

You ran a campaign to 10,000 customers; 500 purchased (5% positive class). Design an end-to-end approach to identify which customers are most likely to purchase.

Requirements:

  • Start with a logistic regression baseline; detail your feature engineering (handling categoricals, scaling, interactions), data splitting, and prevention of leakage.
  • Address class imbalance: compare class_weight, random over/under-sampling, and SMOTE; specify which metric you’ll optimize and why (e.g., PR-AUC, recall at fixed precision, cost-sensitive loss).
  • Describe threshold selection for ranking vs. classification, probability calibration (Platt/isotonic), and how you would choose the top-N customers to target under a fixed budget.
  • Explain how you would validate with stratified cross-validation, report confidence intervals, and monitor post-deployment drift and lift.
  • List at least two feature selection methods (e.g., L1 penalty, mutual information, recursive feature elimination) and how you’d guard against overfitting while keeping interpretability.

Oct 13, 2025, 9:49 PM

Predicting Purchase Propensity After a Campaign (5% Positives)

You previously ran a marketing campaign to 10,000 customers and observed 500 purchases (5% positive rate). You now want to build a model to score customers for the next campaign so you can target those most likely to purchase under a fixed budget.
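
The counts in the scenario fix a few numbers that the rest of the design leans on. The sketch below works them out in plain Python. It assumes the common "balanced" class-weight heuristic, n_samples / (n_classes * class_count); the variable names and the `lift` helper are illustrative and not part of the original question.

```python
# Counts taken from the scenario: 10,000 customers contacted, 500 purchases.
n_customers = 10_000
n_purchases = 500
n_non_purchases = n_customers - n_purchases

# Base rate: the precision you would get by targeting customers at random.
base_rate = n_purchases / n_customers

# "Balanced" class weights (assumed heuristic): n_samples / (n_classes * class_count).
n_classes = 2
weight_positive = n_customers / (n_classes * n_purchases)
weight_negative = n_customers / (n_classes * n_non_purchases)


def lift(precision_in_targeted_group, base_rate):
    """Lift: how many times better the targeted group converts than random targeting."""
    return precision_in_targeted_group / base_rate


print(f"base rate        = {base_rate:.3f}")
print(f"positive weight  = {weight_positive:.3f}")
print(f"negative weight  = {weight_negative:.3f}")
print(f"weight ratio     = {weight_positive / weight_negative:.1f} : 1")
```

With these counts a positive example is weighted 19 times as heavily as a negative one, and any targeting rule has to beat a 5% conversion rate to show lift above 1.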

Design an end-to-end approach that includes:

  1. Baseline Model and Features
  • Start with logistic regression. Describe:
    • Feature engineering: numeric handling, categorical encoding, scaling, missing values, interactions.
    • Data splitting strategy (temporal/stratified), pipelines, and prevention of leakage.
  2. Class Imbalance
  • Compare class_weight, random over/under-sampling, and SMOTE. State which metric(s) you’ll optimize and why (e.g., PR-AUC, recall at fixed precision, cost-sensitive loss).
  3. Thresholding, Calibration, and Budgeted Targeting
  • Explain ranking vs. classification thresholds, probability calibration (Platt scaling or isotonic), and how to choose top-N customers to target under a fixed budget.
  4. Validation and Monitoring
  • Describe stratified cross-validation, how you will report confidence intervals, and how you will monitor post-deployment drift and business lift.
  5. Feature Selection and Interpretability
  • List at least two feature selection methods (e.g., L1 penalty, mutual information, recursive feature elimination) and how you would guard against overfitting while preserving interpretability.
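
One way to make the budgeted-targeting and confidence-interval parts concrete is the minimal sketch below. It is not a reference solution. It assumes you already have held-out model scores and true labels, a flat cost per contact, and that a percentile bootstrap is an acceptable way to report uncertainty. The function names, the toy scores, and the budget figures are all hypothetical.

```python
import random


def top_n_under_budget(scores, budget, cost_per_contact):
    """Return indices of the highest-scoring customers the budget can cover."""
    n = min(len(scores), int(budget // cost_per_contact))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:n]


def precision_at(selected, labels):
    """Fraction of selected customers who actually purchased."""
    if not selected:
        return 0.0
    return sum(labels[i] for i in selected) / len(selected)


def bootstrap_precision_ci(scores, labels, n_top, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for precision@n_top, resampling customers with replacement."""
    rng = random.Random(seed)
    m = len(scores)
    stats = []
    for _ in range(n_boot):
        sample = [rng.randrange(m) for _ in range(m)]
        top = sorted(sample, key=lambda i: scores[i], reverse=True)[:n_top]
        stats.append(precision_at(top, labels))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy held-out data (hypothetical): model scores and observed purchases.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 0, 1, 0, 0, 1, 0, 0]

selected = top_n_under_budget(scores, budget=30, cost_per_contact=10)
precision = precision_at(selected, labels)
base_rate = sum(labels) / len(labels)
lift = precision / base_rate

# Only meaningful as an expected purchase count if the scores are calibrated probabilities.
expected_purchases = sum(scores[i] for i in selected)

ci_low, ci_high = bootstrap_precision_ci(scores, labels, n_top=len(selected))
print(selected, round(precision, 3), round(lift, 2), (ci_low, ci_high))
```

Ranking only needs the scores to be ordered correctly, so the top-N cut works even with uncalibrated output. The expected-purchases sum is where calibration (Platt or isotonic) starts to matter, because it treats each score as a probability.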
