Machine Learning Interview Questions
Practice the exact questions companies are asking right now.
How to predict vehicles’ turn directions at an intersection
At an intersection, there are N vehicles stopped or moving slowly. For each vehicle you have historical time-series data up to the current time: - Pos...
Design a lead-scoring model
Context: You are interviewing for a Data Scientist role on a marketing/growth team. The business wants lead scoring: ranking or scoring incoming leads ...
Explain KNN and how to tune it
K-Nearest Neighbors (KNN) fundamentals. You are interviewing for a Data Scientist role. 1. Explain how the KNN algorithm works for both classification ...
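The full question text is truncated here, so the following is only a minimal sketch of the classification case, assuming Euclidean distance and an unweighted majority vote. The function name `knn_predict` and the default `k=3` are illustrative choices, not part of the original prompt.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, X_query, k=3):
    """Hypothetical helper: classify each query point by majority vote
    among its k nearest training points (Euclidean distance)."""
    X_train = np.asarray(X_train, dtype=float)
    X_query = np.asarray(X_query, dtype=float)
    y_train = np.asarray(y_train)

    # Pairwise Euclidean distances, shape (n_query, n_train).
    diffs = X_query[:, None, :] - X_train[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))

    # Indices of the k closest training points for each query row.
    nearest = np.argsort(dists, axis=1, kind="stable")[:, :k]

    preds = []
    for row in nearest:
        votes = Counter(y_train[row].tolist())
        preds.append(votes.most_common(1)[0][0])
    return np.array(preds)
```

For regression, the same neighbor lookup would typically be followed by averaging the neighbors' targets instead of voting. In practice `k` is usually tuned with cross-validation, and features are scaled first because the distance is sensitive to units.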
Compare two rare-event detection models statistically
You are evaluating two models (Model A and Model B) for rare-event detection (e.g., fraud, abuse, medical adverse event). Positives are extremely rare...
Compare preference alignment methods for LLMs
Question: You’re asked to discuss preference alignment approaches for large language models. Task: Compare several alignment methods and explain when yo...
Debug and fix a PyTorch Transformer training loop
Minimal Causal LM Debugging and Optimization. Context: You are given a tiny causal decoder-only language model implemented in PyTorch. It appears to "tr...
Derive correlation bounds and omitted-variable bias
Core Statistics Prompt. Answer the following related statistics questions. Part A: Pairwise correlation constraints. Let \(X, Y, Z\) be random variable...
Implement and Debug Backprop in NumPy
Two-Layer Neural Network: Backpropagation and Gradient Check (NumPy). Context: You are implementing a fully connected two-layer neural network for multi...
Compare Random Forests and Boosted Trees: Bias, Variance, Speed
Scenario: A product/data science team is deciding between Random Forests and Gradient-Boosted Decision Trees (e.g., XGBoost) for a new predictive task....
Explain project details, PCA, and SHAP
Interview prompt (ML project deep dive): You are interviewing for a Data Scientist role. The interviewer asks you to pick one ML project you have perso...
Debug transformer and train classifier
Debug and Fix a Transformer Text Classifier, Then Train and Evaluate It. Context: You inherit a small codebase for a transformer-based text classifier. ...
Design a search relevance prediction approach
Search relevance prediction: You are asked to predict relevance for an e-commerce search engine (given a user query and a product/document). Prompt: 1. ...
Train a classifier and analyze dataset
End-to-End Binary Classifier Workflow (EDA → Modeling → Fairness → Report). You are given a labeled tabular dataset and asked to implement a reproducib...
Handle cold start, dropout, and training stability
Machine Learning deep dive: Answer the following conceptual questions (you may use equations and small examples). A) Recommender systems: cold start 1....
Diagnose Transformer training and inference bugs
Debugging a Transformer That Intermittently Throws Shape/Type Errors and Fails to Converge. You are given a Transformer-based sequence model that: - In...
Find minimum of unknown convex function
You are given access to an unknown univariate convex function \(f(x)\) defined on a closed interval \([L, R]\) on the real line. - You cannot see the ...
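The constraints of this question are truncated above, so the sketch below assumes only that \(f\) can be evaluated at any chosen point in \([L, R]\). Under that assumption, ternary search is one standard approach for a convex function. The name `ternary_search_min` and the parameters `tol` and `max_iter` are hypothetical.

```python
def ternary_search_min(f, lo, hi, tol=1e-9, max_iter=200):
    """Approximate a minimizer of a convex function f on [lo, hi]
    using only point evaluations (assumed query model)."""
    for _ in range(max_iter):
        if hi - lo <= tol:
            break
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) <= f(m2):
            # By convexity, a minimizer lies in [lo, m2].
            hi = m2
        else:
            # By convexity, a minimizer lies in [m1, hi].
            lo = m1
    return (lo + hi) / 2.0
```

Each iteration spends two evaluations to shrink the interval to two thirds of its length. If evaluations are expensive, golden-section search reuses one of the two interior points per step and needs roughly half as many new evaluations per iteration.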
Compute and plot a precision–recall curve
Precision–Recall (PR) curve coding / evaluation. You are given a binary classifier’s outputs on a dataset: - y_true: array of true labels in \(\{0,1\}\...
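The input specification is cut off after `y_true`, so this sketch assumes a second array of real-valued scores, here called `y_score` (a hypothetical name), where a higher score means "more likely positive". It also assumes at least one positive label. The function name `pr_curve` is illustrative.

```python
import numpy as np

def pr_curve(y_true, y_score):
    """Hypothetical helper: precision and recall at every distinct
    score threshold, from highest threshold to lowest."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)

    # Sort examples by descending score.
    order = np.argsort(-y_score, kind="stable")
    y_sorted = y_true[order]
    s_sorted = y_score[order]

    # Cumulative true/false positives when predicting positive
    # for the top-i scored examples.
    tp = np.cumsum(y_sorted)
    fp = np.cumsum(1 - y_sorted)

    # Keep only the last index of each run of tied scores, so tied
    # examples are admitted together at a single threshold.
    last = np.r_[np.where(np.diff(s_sorted) != 0)[0], len(s_sorted) - 1]
    tp, fp = tp[last], fp[last]

    precision = tp / (tp + fp)
    recall = tp / y_true.sum()  # assumes at least one positive label
    return precision, recall, s_sorted[last]
```

The returned arrays can be passed to any plotting library with recall on the x-axis and precision on the y-axis. Each entry corresponds to the rule "predict positive when the score is at least the threshold".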
Debug a transformer training pipeline
Diagnose a Diverging PyTorch Transformer Training Run. You are given a PyTorch Transformer training pipeline whose loss diverges and validation accurac...
Forecast bikes available at a station
Data Analysis / Forecasting Prompt: You are given historical Citi Bike (bike-share) trip and station status data. Each station has a fixed dock capacit...
Explain key ML theory and techniques
Onsite Machine Learning Engineer: Mixed Topics. You are asked to answer concisely but with depth across the following topics: 1) XGBoost Parallel Compu...