Data Scientist Interview Questions
Practice 2,964 real Data Scientist interview questions for 2026, drawn from actual interviews at Meta, Capital One, Amazon, Google, TikTok and similar employers. Each question comes with a detailed solution, so you can prepare faster for product analytics, ML and production data roles. This collection emphasizes the practical skills interviewers test: SQL and data manipulation, experiment design and A/B testing, statistical reasoning, Python coding for data problems, model evaluation and feature engineering, plus machine-learning system tradeoffs and metric design.

What distinguishes modern data-science loops is the blend of product thinking and reproducible ML. Expect hands-on SQL tasks and funnel analysis in screens, deeper experiment-design and causality questions in mid rounds, and coding or modeling challenges plus ML-system discussions in senior loops. Interviewers evaluate problem framing, statistical rigor, and how you communicate decisions to product partners.

To prepare, prioritize daily SQL practice (CTEs, window functions), refresh hypothesis-testing and power calculations, rehearse concise metric-driven narratives, and build a few end-to-end model or experiment stories you can explain clearly under time pressure.

"I got asked a hardcore MCM DP question and I saw it on PracHub as well. Solved that question in 5 minutes. Without PracHub I doubt I could solve it in 5 hours. Though somehow didn't get hired, perhaps I guess I solved it too fast? /s"

"Believe me, I'm a student here in the US. Recently interviewed for MSFT. They asked me the exact question from PracHub. I saw it the night before and ignored it cause why waste time on random sites. I legit wanna go back and redo this whole thing if I had the chance. Not saying it will work for everyone but there is certainly some merit to that website. And I'm gonna use it in future prep from now on like lc tagged"

"10 years of experience but never worked at a top company. PracHub's senior-level questions helped me break into FAANG at 35. Age is just a number."

"I was skeptical about the 'real questions' claim, so I put it to the test. I searched for the exact question I got grilled on at my last Meta onsite... and it was right there. Word for word."

"Got a Google recruiter call on Monday, interview on Friday. Crammed PracHub for 4 days. Passed every round. This platform is a miracle worker."

"I've used LC, Glassdoor, and random Discords. Nothing comes close to the accuracy here. The questions are actually current — that's what got me. Felt like I had a cheat sheet during the interview."

"The solution quality is insane. It covers approach, edge cases, time complexity, follow-ups. Nothing else comes close."

"Legit the only resource you need. TC went from 180k -> 350k. Just memorize the top 50 for your target company and you're golden."

"PracHub Premium for one month cost me the price of two coffees a week. It landed me a $280K+ starting offer."

"Literally just signed a $600k offer. I only had 2 weeks to prep, so I focused entirely on the company-tagged lists here. If you're targeting L5+, don't overthink it."

"Coaches and bootcamp prep courses cost around $200-300 but PracHub Premium is actually less than a Netflix subscription. And it landed me a $178K offer."

"I honestly don't know how you guys gather so many real interview questions. It's almost scary. I walked into my Amazon loop and recognized 3 out of 4 problems from your database."

"Discovered PracHub 10 days before my interview. By day 5, I stopped being nervous. By interview day, I was actually excited to show what I knew."

"I recently cleared Uber interviews (strong hire in the design round) and all the questions were present in PracHub."

"The search is what sold me. I typed in a really niche DP problem I got asked last year and it actually came up, full breakdown and everything. These guys are clearly updating it constantly."

Design and analyze A/B test with interference
You must ship a News Feed ranking change where content produced by treated users can be seen by control users, creating interference and within-user c...
Contrast OLS, DiD, and PSM assumptions
For the shuttle impact problem, contrast OLS, DiD, and PSM rigorously. Do the following: 1) write the OLS and two-way fixed-effects DiD regression equ...
Design video-ads experiment and handle null results
You are launching a new video-ad format. Design an end-to-end A/B test to evaluate it against the current ad format. Be precise: 1) Define exposure an...
Design ML ranking for query suggestions
Re-rank Query Suggestions for Autocomplete. Context: You are building a re-ranking system for search autocomplete. For each keystroke, a candidate gener...
Implement streaming autocomplete under tight memory
Build an autocomplete service: Input: a list of up to 5,000,000 UTF-8 words (may include accents and hyphens), each with an integer popularity score t...
Map stakeholders and influence routes
Behavioral Scenario: Coordinating Multiple Recruiters Across Teams (Data Scientist, Onsite). Context: You're at the onsite stage for a Data Scientist ro...
Compare first-score vs all-scores estimators
You have two candidate estimators for survey quality based on the score column over 2025-08-26 to 2025-09-01: - E_first: For each user×survey pair, ta...
Implement multiplication without using the multiplication operator
Implement int multiply(int a, int b) without using * or /. You may use +, −, bitwise operators, and shifts. Requirements: - Handle negatives, zero, an...
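One common way to approach this is shift-and-add (binary long multiplication). The sketch below is in Python rather than the `int multiply(int a, int b)` signature in the prompt, and it assumes arbitrary-precision integers, so it does not address any fixed-width overflow behavior the full question may ask about.

```python
def multiply(a: int, b: int) -> int:
    """Multiply two integers using only +, unary minus, shifts, and bitwise AND."""
    # The product is negative exactly when the operands have different signs.
    negative = (a < 0) != (b < 0)
    x, y = abs(a), abs(b)

    result = 0
    while y:
        # If the lowest bit of y is set, the current shifted x contributes.
        if y & 1:
            result += x
        x <<= 1   # x doubles at each step
        y >>= 1   # move to the next bit of y
    return -result if negative else result
```

The loop runs once per bit of `abs(b)`, so the running time grows with the logarithm of the second operand rather than with its value, which is the usual argument for preferring it over repeated addition.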
Explain linear regression to non‑technical stakeholders
Explain linear regression to a non-technical executive using a concrete business example (e.g., predicting weekly sales from price, ad spend, and stor...
Influence a senior partner with data
Describe a time you had to influence a senior cross-functional leader to change a launch plan based on ambiguous A/B test results. Be specific: the de...
Design and analyze a banner A/B test
A/B Test Design: Home-Page Banner. You are deciding whether to add a home-page banner in a consumer app. Design and analyze the A/B test end-to-end. As...
Explain Random Forest randomness and implications
Random Forest: Rigor and Practical Choices. Context: You are building a binary classifier with a Random Forest. The dataset has 100,000 rows, 100 feat...
Build and evaluate a conversion prediction model
Predicting 7-Day Purchase After Email Send. Context: You are given a CSV where each row is a user–email send (or scheduled send/control), with columns: ...
Write SQL to analyze response accuracy and speed
You are given response-level data for an online assessment with sections verbal/design/analytics and verbal subtypes grammar/vocab/tense/other. Using ...
Compute counts and pacing for verbal section
Verbal Section Allocation and Time Optimization. You are designing a 15-minute verbal section (900 seconds total) with 19 questions across four subtype...
Demonstrate behavioral problem-solving with STAR
Data Scientist Onsite: Behavioral & Leadership (Use STAR). Answer concisely using the STAR framework (Situation, Task, Action, Result). Prepare brief,...
Design MapReduce and Spark jobs
Big data systems: (a) Explain Hadoop’s fault tolerance (HDFS replication, task re-execution) and why MapReduce includes shuffling and sorting; in a wo...
Implement KNN from scratch
Without using ML libraries, implement k-Nearest Neighbors for classification. Requirements: (a) Support Euclidean and cosine distances; (b) Allow tie-...
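A minimal sketch of the two distance options and a majority vote, using only the Python standard library. The tie-breaking requirement is cut off in the teaser, so the rule used here (prefer the tied label whose neighbor is closest) is an assumption, as are the function names and the choice to treat zero vectors as maximally distant under cosine distance.

```python
import math
from collections import Counter

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine_distance(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # assumption: zero vectors are treated as maximally distant
    return 1.0 - dot / (norm_u * norm_v)

def knn_predict(train_X, train_y, query, k=3, metric=euclidean):
    """Classify `query` by majority vote among its k nearest training points."""
    if not train_X or k < 1:
        raise ValueError("need at least one training point and k >= 1")
    # Sort (distance, label) pairs by distance only; labels need not be orderable.
    ranked = sorted(
        ((metric(x, query), label) for x, label in zip(train_X, train_y)),
        key=lambda pair: pair[0],
    )
    neighbors = ranked[:k]
    votes = Counter(label for _, label in neighbors)
    top = max(votes.values())
    tied = {label for label, count in votes.items() if count == top}
    # Assumed tie-break: among tied labels, pick the one with the nearest neighbor.
    for _, label in neighbors:
        if label in tied:
            return label
```

Sorting all distances costs O(n log n) per query; a bounded heap would reduce that when k is small relative to the training set, which is a natural follow-up to raise in an interview.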
Explain an ML project end-to-end with tradeoffs
Pick one of your production ML projects and walk through it end-to-end. Be specific: 1) Problem framing (prediction vs causal decisioning), target def...
Test two models' proportions for significance
Two search models, A and B, were each used once by 100 distinct users (one query per user). Success is defined per query by your composite metric (suc...
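Because each model was used by a separate group of users, one standard approach is a pooled two-proportion z-test on independent samples. The sketch below illustrates that test only; the success counts in the usage lines are hypothetical placeholders, since the question's actual figures are cut off in the teaser, and the function name is illustrative.

```python
import math

def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided pooled z-test for the difference between two independent proportions."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Under H0 (equal success rates) both groups share one pooled rate.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("pooled rate is 0 or 1; the z-test is undefined")
    z = (p_a - p_b) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts for illustration only (not taken from the question).
z, p = two_proportion_z_test(60, 100, 45, 100)
print(f"z = {z:.3f}, two-sided p = {p:.4f}")
```

With only 100 users per arm the normal approximation is reasonable when success counts are not close to 0 or 100; for very small counts, an exact test such as Fisher's would be the more cautious choice.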