PracHub

Quick Overview

This question evaluates proficiency with pandas data manipulation and cleaning, including handling missing values, trimming and splitting strings, numeric type conversion, merges, time-aware ordering, and aggregation for per-entity summaries.

Clean, split, merge, and aggregate with pandas

Company: Uber

Role: Data Scientist

Category: Data Manipulation (SQL/Python)

Difficulty: Medium

Interview Round: Technical Screen

Given two CSVs, use pandas to clean, split strings, merge, and aggregate.

drivers.csv

    driver_id,name,signup_city
    D1,Jane Doe,SF
    D2,Mark S,NYC
    D3,Adam L,LA

trips.csv

    trip_id,driver_id,ts_utc,route,fare_usd
    T1,D1,2019-01-01T08:00:00Z,SF-CA|SFO,17.52
    T2,D1,2019-01-02T09:00:00Z,SF-CA|DAL,4.40
    T3,D2,2019-01-02T10:00:00Z,,
    T4,D3,2019-01-03T11:00:00Z,LA-CA|LAX,8.00

Tasks:

1) Load both files into DataFrames; show head(2) and tail(1) of trips to verify ingest.
2) Drop rows in trips with missing fare_usd or missing/empty route (after stripping whitespace). Ensure fare_usd is numeric.
3) Split the route column on '|' into origin and destination columns; trim whitespace. If the split yields fewer than 2 tokens, drop those rows.
4) Merge the cleaned trips with drivers on driver_id (left join from trips) and keep only rows with a matching driver.
5) Produce a per-driver summary with columns: driver_id, name, trips_count, avg_fare_usd (rounded to 2 decimals), last_3_trips_avg (average of the last 3 trips per driver ordered by ts_utc; if <3 trips, average over available).
6) Return the top 2 drivers by avg_fare_usd DESC (break ties by trips_count DESC, then name ASC) and print the final DataFrame schema (dtypes) to confirm transformations.

Explicitly use: head, tail, dropna, str.split, merge, groupby, sort_values, and rolling/agg as appropriate.
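One possible way to work through the six tasks is sketched below. It is not an official solution. To keep it self-contained, the two CSVs are embedded as strings and read through io.StringIO (an assumption; with real files you would pass the paths to pd.read_csv instead). It also assumes pandas 1.4 or newer for the regex=False argument of str.split, and it treats an empty origin or destination token as missing, which goes slightly beyond what the task text requires. Variable names such as clean, merged, summary, and top2 are illustrative.

```python
import io
import pandas as pd

# Assumption: sample data embedded inline instead of read from disk.
DRIVERS_CSV = """driver_id,name,signup_city
D1,Jane Doe,SF
D2,Mark S,NYC
D3,Adam L,LA
"""
TRIPS_CSV = """trip_id,driver_id,ts_utc,route,fare_usd
T1,D1,2019-01-01T08:00:00Z,SF-CA|SFO,17.52
T2,D1,2019-01-02T09:00:00Z,SF-CA|DAL,4.40
T3,D2,2019-01-02T10:00:00Z,,
T4,D3,2019-01-03T11:00:00Z,LA-CA|LAX,8.00
"""

# 1) Load both files and eyeball the ingest.
drivers = pd.read_csv(io.StringIO(DRIVERS_CSV))
trips = pd.read_csv(io.StringIO(TRIPS_CSV))
trips["ts_utc"] = pd.to_datetime(trips["ts_utc"], utc=True)
print(trips.head(2))
print(trips.tail(1))

# 2) Make fare numeric, strip route, turn empty routes into NaN, then drop.
trips["fare_usd"] = pd.to_numeric(trips["fare_usd"], errors="coerce")
route = trips["route"].str.strip()
trips["route"] = route.mask(route == "")
clean = trips.dropna(subset=["fare_usd", "route"]).copy()

# 3) Split route into origin/destination on a literal '|'.
# Assumes at least one route contains the separator, so that
# expand=True yields two columns.
parts = clean["route"].str.split("|", n=1, expand=True, regex=False)
origin = parts[0].str.strip()
destination = parts[1].str.strip()
clean["origin"] = origin.mask(origin == "")
clean["destination"] = destination.mask(destination == "")
clean = clean.dropna(subset=["origin", "destination"])

# 4) Left join from trips, then keep only rows that found a driver.
merged = clean.merge(drivers, on="driver_id", how="left", indicator=True)
merged = merged[merged["_merge"] == "both"].drop(columns="_merge")

# 5) Per-driver summary. Sorting by ts_utc first makes the last value of
# the rolling(3) mean equal to the average of the most recent 3 trips;
# min_periods=1 covers drivers with fewer than 3 trips.
merged = merged.sort_values(["driver_id", "ts_utc"])
summary = merged.groupby(["driver_id", "name"], as_index=False).agg(
    trips_count=("trip_id", "count"),
    avg_fare_usd=("fare_usd", "mean"),
    last_3_trips_avg=(
        "fare_usd",
        lambda s: s.rolling(3, min_periods=1).mean().iloc[-1],
    ),
)
summary["avg_fare_usd"] = summary["avg_fare_usd"].round(2)

# 6) Top 2 by avg fare DESC, ties by trips_count DESC, then name ASC.
top2 = (
    summary.sort_values(
        ["avg_fare_usd", "trips_count", "name"],
        ascending=[False, False, True],
    )
    .head(2)
    .reset_index(drop=True)
)
print(top2)
print(top2.dtypes)
```

On the sample data, trip T3 is dropped in step 2 because both its route and fare are missing, so driver D2 never reaches the summary. The lambda in step 5 could equally be written as s.tail(3).mean(); the rolling form is used here only because the prompt asks for rolling explicitly.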

Last updated: Mar 29, 2026
