What does the Uber Data Scientist interview process look like?

Based on candidate reports compiled in this guide, the Uber Data Scientist loop typically includes two stages: a Technical Screen and an Onsite. Each stage covers a distinct set of topics, walked through in detail in the guide below.

What topics does Uber focus on in Data Scientist interviews?

Uber Data Scientist interviews cover Analytics & Experimentation, Statistics & Math, Data Manipulation (SQL/Python), Machine Learning, and Behavioral & Leadership. The guide below breaks each topic down into core concepts, worked examples, and the real questions candidates were asked.

How many real Uber Data Scientist interview questions are in this guide?

This guide is anchored to 30 real Uber Data Scientist interview questions sourced from candidate reports, each linked to a full practice page with starter code, solution discussion, and community comments.

Uber Data Scientist Interview Prep Guide

Everything Uber actually asks Data Scientist candidates — concept walkthroughs, worked examples, and the real interview questions, drawn from candidate reports. Free to read.


Technical Screen

Analytics & Experimentation

ETA Evaluation And Prediction — covered in depth under Onsite below.
A/B Testing And Experiment Design — covered in depth under Onsite below.
Switchback Experiments And Marketplace Interference — covered in depth under Onsite below.
Product Metrics And Marketplace Diagnostics — covered in depth under Onsite below.

Statistics & Math

Power Analysis And Statistical Inference — covered in depth under Onsite below.
Causal Inference And Identification — covered in depth under Onsite below.

Data Manipulation (SQL/Python)

SQL Window Functions And Analytics

Figure: decision flowchart for choosing SQL windowing patterns (dedupe with ROW_NUMBER, rolling metrics with OVER ... ROWS, top-N ranking choices, cohort/CTR dedupe and join windows), with date spine and timezone notes in the footer.

What's being tested

Tests SQL windowing for product analytics: cohort metrics, rolling time-series summaries, deduped event funnels, top-N segmentation, and experiment readouts. Uber DS interviews probe whether you can turn messy trip/user/event tables into defensible metrics without double-counting users, leaking future data, or mishandling time boundaries.

Patterns & templates

Last/first event per entity: ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY event_ts DESC, event_id DESC) puts the latest event at rn = 1 (flip both sort keys to ASC for the first event); the event_id tiebreaker keeps the dedupe deterministic.
Rolling metrics: AVG(metric) OVER (PARTITION BY city ORDER BY dt ROWS BETWEEN 6 PRECEDING AND CURRENT ROW); ROWS counts rows, not calendar days, so the 7-row frame equals 7 days only when every city has exactly one row per date.
Rolling percentiles: PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY eta) over a 7-day frame; check warehouse support and NULL handling, since some engines do not accept a sliding frame on this function and need a self-join on the date range instead.
Cohort conversion / CTR: define the denominator once, dedupe exposures/clicks with COUNT(DISTINCT user_id), and join within explicit windows bounded on both sides, like click_ts BETWEEN impression_ts AND impression_ts + INTERVAL '48 hours'.
Top-N ranking: RANK, DENSE_RANK, or ROW_NUMBER depending on tie behavior; always state whether tied promos/drivers/users should both appear.
Date spine joins: generate all dates, left join events, and COALESCE missing counts to zero; needed for rolling averages and anomaly detection so that empty days still occupy a row.
Timezone-aware truncation: convert to local market time before DATE_TRUNC; SF January metrics should not use raw UTC day boundaries.
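
Two of the patterns above, deterministic dedupe and a rolling average over a date spine, can be sketched end to end. This is a minimal illustration, not a reference solution: it runs the SQL through Python's built-in sqlite3 module as a stand-in for a warehouse, and the tables (events, daily_trips), their columns, and the sample rows are all hypothetical. SQLite's date() function replaces the INTERVAL syntax a warehouse would use, and the frame is 3 rows rather than 7 to keep the data small.

```python
import sqlite3

# In-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (event_id INTEGER, user_id TEXT, event_ts TEXT, status TEXT);
INSERT INTO events VALUES
  (1, 'u1', '2024-01-01 09:00:00', 'requested'),
  (2, 'u1', '2024-01-01 09:05:00', 'completed'),
  (3, 'u2', '2024-01-02 10:00:00', 'requested'),
  (4, 'u2', '2024-01-02 10:00:00', 'cancelled');  -- same timestamp as event 3

CREATE TABLE daily_trips (city TEXT, dt TEXT, trips INTEGER);
INSERT INTO daily_trips VALUES
  ('SF', '2024-01-01', 10),
  ('SF', '2024-01-02', 20),
  ('SF', '2024-01-04', 40);  -- 2024-01-03 is missing on purpose
""")

# Latest event per user; event_id breaks the timestamp tie deterministically.
latest = conn.execute("""
    SELECT user_id, status
    FROM (
        SELECT user_id, status,
               ROW_NUMBER() OVER (
                   PARTITION BY user_id
                   ORDER BY event_ts DESC, event_id DESC
               ) AS rn
        FROM events
    )
    WHERE rn = 1
    ORDER BY user_id
""").fetchall()
print(latest)  # [('u1', 'completed'), ('u2', 'cancelled')]

# Date spine, left join, COALESCE to zero, then a trailing 3-row average.
rolling = conn.execute("""
    WITH RECURSIVE spine(dt) AS (
        SELECT '2024-01-01'
        UNION ALL
        SELECT date(dt, '+1 day') FROM spine WHERE dt < '2024-01-04'
    ),
    filled AS (
        SELECT s.dt, COALESCE(t.trips, 0) AS trips
        FROM spine AS s
        LEFT JOIN daily_trips AS t ON t.dt = s.dt AND t.city = 'SF'
    )
    SELECT dt,
           AVG(trips) OVER (
               ORDER BY dt
               ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
           ) AS avg_3d
    FROM filled
    ORDER BY dt
""").fetchall()
print(rolling)
# [('2024-01-01', 10.0), ('2024-01-02', 15.0), ('2024-01-03', 10.0), ('2024-01-04', 20.0)]
```

Without the spine, the row for 2024-01-04 would average 10, 20, and 40 and silently span four calendar days; with it, the empty day contributes a zero and the frame really is three days.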

Common pitfalls

Pitfall: Using COUNT(*) after joining impressions to clicks inflates CTR when users click multiple times; dedupe at the user-impression grain first.
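
A small sketch of how this inflation shows up, assuming hypothetical impressions and clicks tables and a 48-hour attribution window. It uses Python's sqlite3 module as a stand-in for a warehouse, so datetime(..., '+48 hours') takes the place of INTERVAL '48 hours'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE impressions (impression_id INTEGER, user_id TEXT, impression_ts TEXT);
CREATE TABLE clicks (click_id INTEGER, user_id TEXT, click_ts TEXT);
INSERT INTO impressions VALUES
  (1, 'u1', '2024-01-01 08:00:00'),
  (2, 'u2', '2024-01-01 08:00:00'),
  (3, 'u3', '2024-01-01 08:00:00'),
  (4, 'u4', '2024-01-01 08:00:00');
INSERT INTO clicks VALUES
  (10, 'u1', '2024-01-01 09:00:00'),
  (11, 'u1', '2024-01-01 10:00:00'),
  (12, 'u1', '2024-01-02 07:00:00'),
  (13, 'u2', '2024-01-05 08:00:00');  -- outside the 48-hour window
""")

# Shared join: clicks count only if they fall within 48 hours after the impression.
WINDOW_JOIN = """
    FROM impressions AS i
    LEFT JOIN clicks AS c
      ON c.user_id = i.user_id
     AND c.click_ts >= i.impression_ts
     AND c.click_ts <= datetime(i.impression_ts, '+48 hours')
"""

# Naive: every joined click row lands in the numerator.
naive_ctr = conn.execute(
    "SELECT 1.0 * COUNT(c.click_id) / COUNT(DISTINCT i.impression_id)" + WINDOW_JOIN
).fetchone()[0]

# Deduped: an impression counts at most once, however many clicks follow it.
deduped_ctr = conn.execute(
    """SELECT 1.0 * COUNT(DISTINCT CASE WHEN c.click_id IS NOT NULL
                                        THEN i.impression_id END)
              / COUNT(DISTINCT i.impression_id)""" + WINDOW_JOIN
).fetchone()[0]

print(naive_ctr)    # 0.75
print(deduped_ctr)  # 0.25
```

One user clicking three times turns a 25% click-through rate into 75% under the naive count, which is the kind of gap an interviewer expects you to catch by stating the grain before writing the aggregate.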

Pitfall: Computing rolling conversion with future rows, e.g. ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING, leaks information into historical metrics.
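
To make the leak concrete, the sketch below compares a trailing frame with a centered one on a hypothetical daily_conversions table in which a spike occurs on 2024-01-05. It runs through Python's sqlite3 module as an illustration; the table name and values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE daily_conversions (dt TEXT, conversions INTEGER);
INSERT INTO daily_conversions VALUES
  ('2024-01-01', 10), ('2024-01-02', 10), ('2024-01-03', 10),
  ('2024-01-04', 10), ('2024-01-05', 50), ('2024-01-06', 10),
  ('2024-01-07', 10);
""")

rows = conn.execute("""
    SELECT dt,
           AVG(conversions) OVER (
               ORDER BY dt ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
           ) AS trailing_avg,
           AVG(conversions) OVER (
               ORDER BY dt ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING
           ) AS centered_avg
    FROM daily_conversions
    ORDER BY dt
""").fetchall()

by_day = {dt: (trailing, centered) for dt, trailing, centered in rows}
trailing_jan4, centered_jan4 = by_day["2024-01-04"]

# The trailing frame only sees data available on 2024-01-04.
print(trailing_jan4)            # 10.0
# The centered frame already includes the spike from 2024-01-05.
print(round(centered_jan4, 2))  # 15.71
```

The value reported for 2024-01-04 changes once later days arrive, so a centered frame cannot be used for any metric that is supposed to reflect what was known on that date.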

Pitfall: Treating RANK and ROW_NUMBER as interchangeable causes silent tie bugs in top-N promotion or marketplace behavior analyses.
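
The tie behavior is easiest to see side by side. The sketch below uses a hypothetical promo_redemptions table in which promos B and C are tied, and runs the three ranking functions through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE promo_redemptions (promo_id TEXT, redemptions INTEGER);
INSERT INTO promo_redemptions VALUES
  ('A', 100), ('B', 90), ('C', 90), ('D', 80);
""")

ranked = conn.execute("""
    SELECT promo_id,
           ROW_NUMBER() OVER (ORDER BY redemptions DESC, promo_id) AS rn,
           RANK()       OVER (ORDER BY redemptions DESC) AS rnk,
           DENSE_RANK() OVER (ORDER BY redemptions DESC) AS dense_rnk
    FROM promo_redemptions
    ORDER BY rn
""").fetchall()
print(ranked)
# [('A', 1, 1, 1), ('B', 2, 2, 2), ('C', 3, 2, 2), ('D', 4, 4, 3)]

# The same "top N" filter returns different sets depending on the function.
top2_row_number = [p for p, rn, _, _ in ranked if rn <= 2]
top2_rank = [p for p, _, rnk, _ in ranked if rnk <= 2]
top3_dense = [p for p, _, _, d in ranked if d <= 3]
print(top2_row_number)  # ['A', 'B']
print(top2_rank)        # ['A', 'B', 'C']
print(top3_dense)       # ['A', 'B', 'C', 'D']
```

ROW_NUMBER drops promo C even though it ties with B, RANK keeps both but skips rank 3, and DENSE_RANK never skips, so a "top 3" filter can return four rows. The promo_id tiebreaker in the ROW_NUMBER ordering is an assumption added here so the result is reproducible; without one, which tied row survives is arbitrary.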

Practice these

The practice cards below cover the canonical variants — solve all of them and time yourself.

Practice questions

Calculate January-2024 SF Promotion Impact Using SQL Queries (Uber, Data Scientist, Medium)

Evaluates proficiency in relational data manipulation techniques—specifically joins, date and region filtering, aggregation, top‑N identification, and...


Write SQL for active counts and YTD top driver

Analyze User Purchase Behavior in Online Marketplace Data
