Data Scientist Data Manipulation (SQL/Python) Interview Questions
Practice the exact questions companies are asking right now.

Compute ads revenue by geography in SQL
You have ad delivery logs for a shop-ads system. Tables ad_impressions - impression_id STRING (PK) - ts TIMESTAMP (UTC) - user_id STRING - shop_id STR...
Retrieve First Active and Last Inactive Dates per User
Given a table activity that tracks user activities, write a SQL query to retrieve the first active date and last inactive date for each user. Table Sc...
Compute pirated-theme usage and revenue loss
You work on a theme marketplace. Some shops install pirated themes instead of paying for official themes. Assume all timestamps are in UTC. Tables sho...
Transform DataFrame and compute diff-in-diff
You are given a pandas DataFrame df with the following columns: - unit_id (string): entity identifier (e.g., user, city, driver) - group (string): eit...
Write SQL for video-call recipients and FR activity
Given the schema and samples below, write ANSI-SQL to answer both questions. Assume dates are stored in UTC. Today is 2025-09-01, so "yesterday" is 20...
Compute reply-based user metrics in 7 days
You are analyzing discussions on a social platform. Tables all_post - post_id (BIGINT, PK) - post_author_id (BIGINT, FK → user.user_id) - post_creatio...
Write SQL to compare social-only vs game-only engagement
You are given two tables capturing Oculus app usage. Define an 'active day' as a UTC date on which a user generates at least one event. Consider only ...
Write SQL for influence score and follower growth
You are working on a social product with these tables: Tables / Schemas users - user_id BIGINT (PK) - created_at TIMESTAMP posts - post_id BIGINT (PK)...
Compute percent of active users with 50+ calls
Problem: You work on a Messenger-like app. You want to measure how many active users in Great Britain (GB) today have been heavy callers recently. Tabl...
Compute video-call SQL metrics with edge cases
Use 'today' = 2025-09-01. Assume UTC timestamps. Write SQL to answer both parts below and call out how your queries handle edge cases (duplicates, fai...
Write SQL for top categories and highly active users
You are given three tables: 1) impression Event-level table of user impressions. - impression_id BIGINT (PK) - user_id BIGINT (FK → user.user_id) - pi...
Calculate Response Rate and Compare User Survey Ratings
USERS (user_id | signup_date): 10 | 2024-03-20; 11 | 2024-04-01; 12 | 2024-04-05. SURVEYS (survey_id | user_id | sent_at): 1 | 10 ...
Analyze Spending Patterns and Restaurant Performance Using SQL/Python
orders: delivery_id | user_id | restaurant_id | order_date ...
Compute active ad revenue by creation source
You work on an ads platform and need to report active ad revenue broken down by the ad’s creation source. Tables ads - ad_id BIGINT PK - advertiser_id...
Write SQL for percent and window changes
Use PostgreSQL. Assume today = 2025-09-01. You must use CTEs and multiple window functions. Schema and tiny samples are below. Schema: - exposures(uni...
Write SQL using HAVING and window functions
Context: You work on fraud analytics. Assume the following schema (PostgreSQL-like types): transactions - txn_id BIGINT (PK) - merchant_id BIGINT - use...
Write SQL for late-delivery metrics by window
You are given two tables. Assume PostgreSQL. Define delivery duration as delivered_at − pickup_time (exclude rows with null pickup_time or delivered_a...
Merge overlapping intervals per group in pandas
You are given a pandas DataFrame df containing time intervals for multiple groups. Input df columns: - group_id (string/int): group identifier - start...
Write monthly customer and sales SQL queries
You are analyzing a food-delivery marketplace. Tables Assume the following schema (you may add minor helper CTEs as needed): orders - order_id (BIGINT...
Analyze time-zoned events with pandas
You are given two pandas DataFrames. events columns: user_id:int, ts:str ISO-8601 with timezone (e.g., '2025-08-31T23:58:43-07:00'), event:str in {'si...
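
A common first step for a prompt like the one above is normalizing the offset-bearing ISO-8601 strings to UTC before doing any date logic. The following is only a minimal sketch under stated assumptions: the sample rows, the helper name add_utc_columns, and the output columns ts_utc and utc_date are hypothetical and not part of the original question, and the truncated event column is left out.

```python
import pandas as pd

# Hypothetical sample data: only user_id and ts are modeled here, using
# the timestamp format quoted in the prompt. The other rows are invented.
events = pd.DataFrame({
    "user_id": [1, 1, 2],
    "ts": [
        "2025-08-31T23:58:43-07:00",
        "2025-09-01T00:05:00-07:00",
        "2025-09-01T08:00:00+02:00",
    ],
})

def add_utc_columns(events: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with a tz-aware UTC timestamp and its UTC calendar date."""
    out = events.copy()
    # utc=True converts every offset to UTC, so strings with mixed
    # offsets end up in a single tz-aware datetime column.
    out["ts_utc"] = pd.to_datetime(out["ts"], utc=True)
    # Calendar date as seen in UTC, which can differ from the local date.
    out["utc_date"] = out["ts_utc"].dt.date
    return out

result = add_utc_columns(events)
print(result[["user_id", "ts_utc", "utc_date"]])
```

Note that the first sample row falls on 2025-08-31 in its local offset but on 2025-09-01 in UTC, which is the kind of boundary case such questions tend to probe.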