PracHub
QuestionsPremiumLearningGuidesCheatsheetNEWCoaches

Quick Overview

This question evaluates mastery of data manipulation and analytics skills in the Data Manipulation (SQL/Python) domain, including SQL aggregation and window functions, time-based joins and grouping, deduplication of status streams, handling late-arriving events, and implementing rolling metrics using both SQL and pandas.

  • Medium
  • Lyft
  • Data Manipulation (SQL/Python)
  • Data Scientist

Query and transform marketplace data in SQL/Python

Company: Lyft

Role: Data Scientist

Category: Data Manipulation (SQL/Python)

Difficulty: Medium

Interview Round: Onsite

Assume today is 2025-09-01. Use the following schema and sample data to answer the questions with both SQL (preferred) and equivalent Python (pandas) pseudocode. Treat timestamps as UTC. Schema - events(event_time TIMESTAMP, user_id INT, event_type ENUM(impression, click, add_to_cart, purchase), city STRING, session_id STRING) - orders(order_id INT, user_id INT, created_at TIMESTAMP, status ENUM(placed, cancelled, completed), status_updated_at TIMESTAMP, city STRING, supplier_id INT) - inventory(supplier_id INT, date DATE, units_available INT, city STRING) Sample tables (tiny, illustrative) events event_time user_id event_type city session_id 2025-08-26 08:01:00 101 impression Austin s1 2025-08-26 08:02:10 101 click Austin s1 2025-08-26 09:15:00 102 impression Austin s2 2025-08-27 10:00:05 103 impression Boston s3 2025-08-27 10:05:10 103 purchase Boston s3 2025-08-28 12:00:00 104 impression Austin s4 2025-08-28 12:03:00 104 add_to_cart Austin s4 2025-08-29 07:45:00 105 impression Austin s5 2025-08-30 14:20:00 101 purchase Austin s6 2025-09-01 09:00:00 106 impression Boston s7 orders order_id user_id created_at status status_updated_at city supplier_id 5001 103 2025-08-27 10:04:59 completed 2025-08-27 10:05:20 Boston 900 5002 101 2025-08-30 14:19:50 placed 2025-08-30 14:19:50 Austin 901 5002 101 2025-08-30 14:19:50 cancelled 2025-08-30 14:25:00 Austin 901 5003 104 2025-08-28 12:02:30 placed 2025-08-28 12:02:30 Austin 902 5003 104 2025-08-28 12:02:30 completed 2025-08-28 12:10:00 Austin 902 5004 105 2025-08-29 07:50:00 placed 2025-08-29 07:50:00 Austin 901 5004 105 2025-08-29 07:50:00 cancelled 2025-08-29 08:05:00 Austin 901 inventory supplier_id date units_available city 900 2025-08-27 5 Boston 901 2025-08-29 1 Austin 901 2025-08-30 3 Austin 902 2025-08-28 2 Austin Tasks 1) City-day funnel: For each city and each date between 2025-08-26 and 2025-09-01 inclusive, compute: (i) unique users with at least one impression; (ii) completed orders (using the latest status per order_id as of its max status_updated_at); (iii) conversion rate = completed_orders / unique_impression_users; and (iv) a 7-day rolling conversion rate per city (window ending on that date). 2) Supplier-day fill rate: For each supplier and date in the same window, compute demand = count of unique order_id whose created_at falls on that date and whose final status (latest by status_updated_at) is not cancelled. Then compute fill_rate = LEAST(demand, units_available) / NULLIF(demand, 0). Return supplier_id, date, demand, units_available, fill_rate; default units_available to 0 when missing. 3) Late completions: Ensure that if an order is created on 2025-08-28 but completes on 2025-08-29, it contributes to demand on 2025-08-28 and to completed_orders on the date of completion only if you are computing completion-by-day; explain which definition you used for Task 1 and why. 4) Provide SQL that is robust to duplicated rows in orders (multiple status lines per order_id) by selecting only the latest status per order_id via window functions. Then sketch equivalent pandas code, including how you would compute the rolling metric and handle division-by-zero safely.

Quick Answer: This question evaluates mastery of data manipulation and analytics skills in the Data Manipulation (SQL/Python) domain, including SQL aggregation and window functions, time-based joins and grouping, deduplication of status streams, handling late-arriving events, and implementing rolling metrics using both SQL and pandas.

Last updated: Mar 29, 2026

Related Coding Questions

  • Print the K-th non-empty line - Lyft (Medium)

Loading coding console...

PracHub

Master your tech interviews with 7,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.