SQL Window Functions And Analytics
Asked of: Data Scientist
Last updated

What's being tested
Tests SQL windowing for product analytics: cohort metrics, rolling time-series summaries, deduped event funnels, top-N segmentation, and experiment readouts. Uber DS interviews probe whether you can turn messy trip/user/event tables into defensible metrics without double-counting users, leaking future data, or mishandling time boundaries.
Patterns & templates
-
Last/first event per entity —
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY event_ts DESC, event_id DESC); filterrn = 1for deterministic deduping. -
Rolling metrics —
AVG(metric) OVER (PARTITION BY city ORDER BY dt ROWS BETWEEN 6 PRECEDING AND CURRENT ROW); useROWSfor fixed row counts. -
Rolling percentiles —
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY eta)over a 7-day frame; check warehouse support and NULL handling. -
Cohort conversion / CTR — define denominator once, dedupe exposures/clicks with
COUNT(DISTINCT user_id), and join within explicit windows likeclick_ts <= impression_ts + INTERVAL '48 hours'. -
Top-N ranking —
RANK,DENSE_RANK, orROW_NUMBERdepending on tie behavior; always state whether tied promos/drivers/users should both appear. -
Date spine joins — generate all dates, left join events,
COALESCEmissing counts to zero; needed for rolling averages and anomaly detection. -
Timezone-aware truncation — convert to local market time before
DATE_TRUNC;SFJanuary metrics should not use raw UTC day boundaries.
Common pitfalls
Pitfall: Using
COUNT(*)after joining impressions to clicks inflatesCTRwhen users click multiple times; dedupe at the user-impression grain first.
Pitfall: Computing rolling conversion with future rows, e.g.
ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING, leaks information into historical metrics.
Pitfall: Treating
RANKandROW_NUMBERas interchangeable causes silent tie bugs in top-N promotion or marketplace behavior analyses.
Practice these
The practice cards below cover the canonical variants — solve all of them and time yourself.
Featured in interview prep guides
Practice questions
- Write SQL for active counts and YTD top driverUber · Data Scientist · Technical Screen · Medium
- Write SQL and Pandas for Uber TripsUber · Data Scientist · Technical Screen · Medium
- Compute ETA shift and conversion upliftUber · Data Scientist · Technical Screen · Medium
- Write SQL/Python for CTR analyticsUber · Data Scientist · Technical Screen · Medium
- Write SQL for fares and age-band countsUber · Data Scientist · Technical Screen · Medium
- Analyze User Purchase Behavior in Online Marketplace DataUber · Data Scientist · Technical Screen · Medium
- Calculate January-2024 SF Promotion Impact Using SQL QueriesUber · Data Scientist · Technical Screen · Medium
Related concepts
- SQL Window Functions And Analytical QueryingData Manipulation (SQL/Python)
- SQL Analytics Joins, Aggregations, And Windows
- SQL AnalyticsData Manipulation (SQL/Python)
- SQL Window Functions And Temporal JoinsData Manipulation (SQL/Python)
- SQL Analytical QueryingData Manipulation (SQL/Python)
- SQL Analytical Querying And Data ModelingData Manipulation (SQL/Python)