Data Scientist Machine Learning Interview Questions

Master your tech interview with our curated database of real questions from top companies.

278 Questions 235 Companies

Showing 20 results

TikTok

Medium

Data Scientist

Compare Random Forests and Boosted Trees: Bias, Variance, Speed

Scenario: A product/data science team is deciding between Random Forests and Gradient-Boosted Decision Trees (e.g., XGBoost) for a new predictive task....

Build Model to Predict Customer Contract Renewal

Predicting Enterprise Customer Renewal for Google Meet. You are tasked with designing a model to predict whether an enterprise customer will renew thei...

Build Predictive Model for Product Metric: Steps Explained

Scenario: You are interviewing for a Data Scientist role and are asked to design a predictive model for a key product metric in a consumer app (e.g., p...

Determine Features for Effective Hashtag Recommendations

Hashtag Recommendation System Design. Context: You are designing a hashtag recommendation system for a social-media platform. Given a user u composing a...

How to Analyze and Model Behavioral Data Effectively?

End-to-End Conversion Modeling on a Raw Behavioral Dataset. Scenario: You receive a raw, event-level behavioral dataset (e.g., user actions, sessions, m...

Identify Unsupervised Techniques for Detecting Fraudulent Transactions

Unsupervised Fraud Detection: Modeling and Evaluation Without Labels. Scenario: You receive millions of historical transactions with no fraud labels. Ma...

Develop a Restaurant-Recommendation Engine with Logistic Regression

Restaurant Recommendation Engine: Metrics, Features, Model, and Evaluation. Scenario: You are designing a restaurant recommendation engine for a social ...

Identify Fake Accounts Using Machine Learning Techniques

Scenario: You are a data scientist at a social-commerce platform responsible for trust and safety. You need to design a system to detect and mitigate f...

Compare Logistic Regression and Random Forest in Limited Data Scenarios

Model Selection for Binary Classification with Limited Data and Potential Non-Linearities. Scenario: You are designing a binary classifier with limited ...

Optimize Surge Notifications for Rideshare Drivers

Scenario: A rideshare marketplace experiences airport demand spikes. When demand exceeds supply, the system can send surge-pricing push notifications t...

Optimize Email Strategy for New Prime Video Series Launch

Scenario: Designing, deploying, and evaluating ranking models and marketing emails for Prime Video. Question: How would you approach sending marketing e...

Engineer Features to Enhance Smartphone Battery Life Prediction

Battery Life Prediction with Sparse History. Problem: You are given sparse discharge traces that record battery percentage over elapsed time for prior u...

Optimize Churn Prediction: Feature Engineering and Model Selection

Weekly Churn Prediction (10M users): Feature Engineering, Model Choice, Explainability, and Debugging. Scenario: You own a weekly churn-prediction pipel...

Design a Regression Model for Robust Extrapolation Performance

Scenario: Onsite machine-learning exercise. Your task is to build a regression model using only numerical features that not only fits training data but...

Design an ML Model for Interview Recommendation Pipeline

Scenario: You are designing and deploying an ML model that mirrors a real-world recommendation pipeline serving a large product catalog with strict lat...

How to Architect a Personalized Ads Serving System

Full-Funnel Ads Serving System Design. Scenario: You are asked to architect a full-funnel advertising platform that serves personalized ads to users on ...

Evaluate and Experiment with Harmful Content Detection Model

Evaluating a Harmful-Content Detection Model: Offline and Online. Context: You are given a binary classification model that detects harmful content in a...

Evaluate Ensemble Models for Bias-Variance, Speed, and Interpretability

Large-Scale Recommendation System: Ensembles, Overfitting, Metrics, Architectures, and Optimization. Context: You are designing a large-scale recommenda...

Classify Reviewers Using Bayesian Probability for Accuracy Analysis

Scenario: Classifying reviewers as lazy or careful with limited labels. Context (completed): You are auditing a pool of reviewers who can be either: - La...

Design Framework for Robust House-Price Prediction Model

Model Robustness, Diagnostics, Random Forests, and Large-Scale Regression. Context: You are building and evaluating a supervised model to predict reside...

Machine Learning

Jul 12, 2025

...