PracHub
QuestionsPremiumLearningGuidesInterview PrepNEWCoaches

Quick Overview

This question evaluates proficiency in SQL and Python for time‑series data manipulation and experimental analysis, covering window functions (7‑day percentile_cont medians), date spines and timezone‑aware timestamp truncation, conversion-rate aggregation, difference‑in‑differences uplift estimation, and CUPED adjustment with cluster‑robust standard errors. It is commonly asked to assess practical data engineering and applied statistics skills in the Data Manipulation (SQL/Python) domain, testing applied implementation and causal inference ability rather than only conceptual understanding.

  • Medium
  • Uber
  • Data Manipulation (SQL/Python)
  • Data Scientist

Compute ETA shift and conversion uplift

Company: Uber

Role: Data Scientist

Category: Data Manipulation (SQL/Python)

Difficulty: Medium

Interview Round: Technical Screen

Use PostgreSQL (SQL) and brief Python pseudocode. Assume 'today' is 2025-09-01. Schema: - trips(trip_id BIGINT, request_ts TIMESTAMP, city_id INT, rider_id BIGINT, driver_id BIGINT, shown_eta_sec INT, pickup_eta_sec INT, is_completed BOOLEAN, is_canceled BOOLEAN, surge_multiplier NUMERIC, is_treatment BOOLEAN, experiment_id INT) - riders(rider_id BIGINT, signup_dt DATE, city_id INT) - city_dim(city_id INT, city_name TEXT, tier TEXT) - incentives(rider_id BIGINT, offer_start_dt DATE, offer_end_dt DATE, percent_off INT, cap_usd NUMERIC) - day_weather(city_id INT, dt DATE, precip_mm NUMERIC, temp_c NUMERIC) Small ASCII samples: trips trip_id | request_ts | city_id | rider_id | driver_id | shown_eta_sec | pickup_eta_sec | is_completed | is_canceled | surge_multiplier | is_treatment | experiment_id 1 | 2025-08-01 08:01:00 | 1 | 101 | 9001 | 420 | 480 | t | f | 1.0 | t | 42 2 | 2025-08-01 08:03:00 | 1 | 102 | 9002 | 360 | 360 | f | t | 1.2 | f | 42 3 | 2025-08-15 18:30:00 | 2 | 103 | 9003 | 300 | 420 | t | f | 1.1 | f | 42 4 | 2025-08-20 09:00:00 | 1 | 101 | 9001 | 330 | 360 | t | f | 1.0 | t | 42 5 | 2025-08-28 22:10:00 | 2 | 104 | 9004 | 600 | 660 | f | t | 1.5 | t | 42 riders rider_id | signup_dt | city_id 101 | 2025-05-01 | 1 102 | 2025-01-15 | 1 103 | 2025-06-20 | 2 104 | 2025-03-10 | 2 city_dim city_id | city_name | tier 1 | Alpha | T1 2 | Beta | T2 incentives rider_id | offer_start_dt | offer_end_dt | percent_off | cap_usd 101 | 2025-08-15 | 2025-08-31 | 20 | 10 Tasks: 1) SQL: For experiment_id=42, compute for each city_id and calendar date the 7-day rolling median of shown_eta_sec and the request→trip conversion rate (completed/requests) from 2025-08-01 to 2025-09-01. Include days with zero requests (show conversion as NULL) by generating a date spine. Restrict to riders with signup_dt < 2025-07-01. Use percentile_cont(0.5) OVER a 7-day window for the median; clearly handle time zones by truncating request_ts at UTC midnight. 2) SQL: At city level, estimate a difference‑in‑differences conversion uplift between treated (is_treatment=true) and control for post=2025-08-15..2025-09-01 vs pre=2025-08-01..2025-08-14. Output: city_id, pre_treat_conv, pre_ctrl_conv, post_treat_conv, post_ctrl_conv, did_uplift = (post_treat_conv - pre_treat_conv) - (post_ctrl_conv - pre_ctrl_conv). Ensure riders with mixed treatment exposure are counted per their trip‑level exposure. 3) Python (pseudocode ok): Compute CUPED‑adjusted conversion at the rider_id level with X = pre‑period conversion and Y = post‑period conversion; estimate theta = cov(Y, X)/var(X), compute Y_cuped = Y - theta*(X - E[X]). Then estimate adjusted uplift between treated and control, with its standard error via cluster‑robust SEs at rider_id or city_id.

Quick Answer: This question evaluates proficiency in SQL and Python for time‑series data manipulation and experimental analysis, covering window functions (7‑day percentile_cont medians), date spines and timezone‑aware timestamp truncation, conversion-rate aggregation, difference‑in‑differences uplift estimation, and CUPED adjustment with cluster‑robust standard errors. It is commonly asked to assess practical data engineering and applied statistics skills in the Data Manipulation (SQL/Python) domain, testing applied implementation and causal inference ability rather than only conceptual understanding.

Last updated: Mar 29, 2026

Loading coding console...

PracHub

Master your tech interviews with 7,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Transform DataFrame and compute diff-in-diff - Uber (easy)
  • Write SQL for active counts and YTD top driver - Uber (Medium)
  • Write SQL and Pandas for Uber Trips - Uber (Medium)
  • Write SQL/Python for CTR analytics - Uber (Medium)
  • Clean, split, merge, and aggregate with pandas - Uber (Medium)