Data Scientist Interview Questions
Master your tech interview with our curated database of real questions from top companies.
Retrieve First Active and Last Inactive Dates per User
Given a table activity that tracks user activities, write a SQL query to retrieve the first active date and last inactive date for each user. Table Sc...
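One possible approach to this question is conditional aggregation. The teaser cuts off before the table schema, so the sketch below assumes a hypothetical layout of `activity(user_id, activity_date, status)`, where `status` is either 'active' or 'inactive'. The real columns may differ. It uses Python's built-in `sqlite3` module only so that the query can be run end to end.

```python
import sqlite3

# Assumed (hypothetical) schema: activity(user_id, activity_date, status).
# The original schema is truncated in the listing, so adapt names as needed.
QUERY = """
SELECT user_id,
       MIN(CASE WHEN status = 'active'   THEN activity_date END) AS first_active_date,
       MAX(CASE WHEN status = 'inactive' THEN activity_date END) AS last_inactive_date
FROM activity
GROUP BY user_id
ORDER BY user_id
"""


def first_active_last_inactive(rows):
    """Load (user_id, activity_date, status) rows into an in-memory table and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE activity (user_id INTEGER, activity_date TEXT, status TEXT)"
    )
    conn.executemany("INSERT INTO activity VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result
```

The `CASE` expressions return NULL for rows of the other status, and `MIN`/`MAX` ignore NULLs, so a user with no inactive rows gets a NULL last inactive date. Dates are stored as ISO-8601 text so that string comparison matches chronological order.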
Compare Random Forests and Boosted Trees: Bias, Variance, Speed
Scenario: A product/data science team is deciding between Random Forests and Gradient-Boosted Decision Trees (e.g., XGBoost) for a new predictive task....
Fake Accounts [AE]
Detecting and Managing Bad Accounts on a Social Platform. 1) Probability of a Bad Account Sending Friend Requests. Context: 1% of accounts are bad. Bad ...
Identify User Interest in Group Video Calls Using Data
Group Video-Calling Feature Analysis. Context: You are asked to design, launch, and analyze a new group video-calling feature for a large social/messagi...
Evaluating a 15% Reduction in Post-Card Height
Scenario: You own the feed UX for a social app. Designers propose shrinking each post card’s height by 15% to show more content per scroll, aiming to i...
Select Top Customers Using Transaction Data Filters
transactions
+----+---------+------------+--------+
| id | user_id | order_date | amount |
+----+---------+------------+--------+
| 1  | 101     | 202...
Calculate Response Rate and Compare User Survey Ratings
USERS
user_id | signup_date
10 | 2024-03-20
11 | 2024-04-01
12 | 2024-04-05
SURVEYS
survey_id | user_id | sent_at
1 | 10 ...
Analyze Conversation Engagement and Reaction Usage Effectively
messages
+-----------+--------+----------+--------------+---------------------+
| messageid | sender | receiver | has_reaction | timestamp |...
Evaluate Impact of Bicycle Deliveries on Efficiency and Costs
Scenario: A food-delivery marketplace plans to let couriers (dashers) opt in to deliver by bicycle in addition to cars. Question: State the primary busi...
Design A/B Test for Cost-Per-Conversion Efficiency Analysis
Multi-Arm A/B Test: Comparing Cost-Per-Conversion Across Channels. Scenario: You need to compare four new acquisition channels: YouTube ads, Google Searc...
Analyze Group Call Adoption Using SQL Queries
CALL_LOGS
| call_id | user_id | call_start | call_end | is_group_call | participant_cnt |
| 101 | 12 | 2023-08-01 10:00...
Generate Synthetic Clickstream Data with Python Function
Scenario: The analytics team needs to generate synthetic click-stream records to test a new reporting pipeline before real traffic arrives. Question: Wr...
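A minimal sketch of such a generator follows. The question text is truncated in the listing, so everything here is an assumption rather than the expected answer: the function name `generate_clickstream`, the record fields (`user_id`, `timestamp`, `page`, `event`), the page and event vocabularies, and the 1 to 300 second gap between events are all hypothetical choices made for illustration.

```python
import random
from datetime import datetime, timedelta

# Hypothetical vocabularies; the original question does not specify them.
PAGES = ["/home", "/search", "/product", "/cart", "/checkout"]
EVENTS = ["page_view", "click", "scroll"]


def generate_clickstream(n_records, n_users=10, start=datetime(2024, 1, 1), seed=None):
    """Return a list of synthetic click-stream records as dictionaries.

    Timestamps strictly increase so the output resembles an ordered event log.
    Passing a seed makes the output reproducible for pipeline tests.
    """
    rng = random.Random(seed)  # local RNG, so the global random state is untouched
    records = []
    ts = start
    for _ in range(n_records):
        # Advance the clock by a random gap of 1 to 300 seconds (assumed range).
        ts += timedelta(seconds=rng.randint(1, 300))
        records.append(
            {
                "user_id": rng.randint(1, n_users),
                "timestamp": ts.isoformat(),
                "page": rng.choice(PAGES),
                "event": rng.choice(EVENTS),
            }
        )
    return records
```

Calling `generate_clickstream(1000, n_users=50, seed=7)` would produce 1,000 reproducible records. A fixed seed is useful here because a reporting pipeline under test can then be checked against the same input on every run.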
Analyze Causes of Increased Lyft Ride Wait Times
Scenario: A ride-hailing marketplace observes a 20% month-over-month increase in rider wait time (time from request to driver arrival). Tasks: 1) Root-c...
Design an Experiment to Evaluate New Recommendation Model
Experiment Design: New Ads Ranking Model vs. Current System. Context: You are evaluating a newly built ML ranking model for an ads recommendation surfac...
Diagnose Discrepancy in A/B Test Conversion Rate Results
A/B Test Design: Personalized Marketing Emails and Conversion Lift. Scenario: An e-commerce firm wants to send personalized marketing emails to increase...
Analyze User Comment Distribution and Sampling Effects
Scenario: You are analyzing daily comment counts per user. The per-user distribution of counts is right-skewed (many zeros/low counts and a long right ...
Measure Billboard Campaign Impact: Design, Bias, Test Strategy
Measuring Billboard Impact on Brand Awareness. Scenario: A marketing team launched billboard ads in several cities and wants to estimate the campaign's ...
Calculate Total Revenue in USD Using SQL Query
ads_revenue
+---------+------------+---------+----------+
| ad_id   | country    | revenue | currency |
+---------+------------+---------+----------+
...
Identify Top Three Active Users by Event Date
event_log
+------------+---------+-----------+---------------------+
| event_date | user_id | event_type| event_timestamp |
+------------+--------...
Monitor Friend-Request System for Quality and Abuse
Friendship
+--------------+-------------+---------------------+---------------------+
| requester_id | approver_id | request_ts | approval_ts...