PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep
|Home/Data Manipulation (SQL/Python)/PayPal

Write SQL for top drivers and cancellation rates

Last updated: Jun 15, 2026

Quick Overview

This onsite Data Scientist SQL question (data manipulation) uses a rideshare/airport-pickup dataset to test joins with correct ON-vs-WHERE semantics, time-window filtering, conditional aggregation for counts and rates over a shared denominator, null and divide-by-zero handling, and deterministic tie-breaking. Candidates rank top drivers by completed airport trips, compute per-airport cancellation rates split by who canceled, and rank top riders by cancellations.

  • easy
  • PayPal
  • Data Manipulation (SQL/Python)
  • Data Scientist

Write SQL for top drivers and cancellation rates

Company: PayPal

Role: Data Scientist

Category: Data Manipulation (SQL/Python)

Difficulty: easy

Interview Round: Onsite

##### Question You work on a rideshare / ride-hailing product that connects **drivers** and **riders**, with a focus on **airport pickups**. Using SQL, answer the questions below. Assume all timestamps are stored in UTC and the analysis window is the **last 28 days ending at `:as_of_ts` (inclusive)** — older variants of this question use a rolling **last 30 days**, so use whichever window your interviewer specifies. State any additional assumptions you make. ### Tables **drivers** - `driver_id` BIGINT (PK) - `home_city_id` BIGINT - `is_active` BOOLEAN - `activated_at` TIMESTAMPTZ **riders** - `rider_id` BIGINT (PK) - `is_active` BOOLEAN - `is_fraud` BOOLEAN - `first_trip_at` TIMESTAMPTZ **airports** - `airport_id` BIGINT (PK) - `airport_code` TEXT — e.g., `'SFO'` - `country_code` TEXT — e.g., `'US'` **trips** (one row per trip request) - `trip_id` BIGINT (PK) - `driver_id` BIGINT (nullable until dispatch) - `rider_id` BIGINT - `request_at` TIMESTAMPTZ - `airport_id` BIGINT (nullable; non-null means airport pickup) - `status` TEXT — one of `('completed','canceled','expired')` - `canceled_by` TEXT — nullable; if canceled, one of `('driver','rider','system')` - `canceled_at` TIMESTAMPTZ (nullable) ### Q1: Top drivers by airport completions Return the **top 10 drivers** by number of **completed airport-pickup trips** in the window, restricted to: - airports in the US (`airports.country_code = 'US'`) - active drivers only (`drivers.is_active = TRUE`) - exclude fraudulent riders (rows where `riders.is_fraud = TRUE`) Output columns: - `driver_id` - `completed_trips` - `total_airport_requests` — count of all airport trip requests, regardless of status - `completion_rate` = `completed_trips / total_airport_requests` Tie-breakers (in order): higher `completed_trips`, then higher `completion_rate`, then lower `driver_id`. ### Q2: Cancellation rate by airport and canceler For the same window and US-airport restriction, compute cancellation rates **by airport**, split by who canceled. For each `airport_code`, return: - `total_requests` - `driver_cancel_rate` = driver-canceled trips / total_requests - `rider_cancel_rate` = rider-canceled trips / total_requests - `system_cancel_rate` = system-canceled trips / total_requests - `overall_cancel_rate` = all canceled trips / total_requests Include only airports with `total_requests >= 100`. ### Q3: Top riders by cancellations Return the **top 10 riders** with the highest number of **canceled** airport-pickup trips in the window (US airports). Output columns: - `rider_id` - `canceled_trips` > Note: Many candidates make mistakes around filtering joined tables (conditions placed in `WHERE` vs `ON`). Write queries that handle nulls and outer joins correctly, and that preserve airports/drivers with requests even when some joined attributes are missing. Be explicit about which filters belong in `ON` versus `WHERE`.

Quick Answer: This onsite Data Scientist SQL question (data manipulation) uses a rideshare/airport-pickup dataset to test joins with correct ON-vs-WHERE semantics, time-window filtering, conditional aggregation for counts and rates over a shared denominator, null and divide-by-zero handling, and deterministic tie-breaking. Candidates rank top drivers by completed airport trips, compute per-airport cancellation rates split by who canceled, and rank top riders by cancellations.

Related Interview Questions

  • Write SQL using HAVING and window functions - PayPal (easy)
  • Write SQL to flag Venmo ATO - PayPal (Medium)
  • Write conditional aggregation SQL queries - PayPal (Medium)
  • Calculate and Find Average Contacts and Sync Percentage - PayPal (Medium)
  • Analyze Transactions and Classify by Amount in SQL - PayPal (Medium)
PayPal logo
PayPal
Dec 7, 2025, 12:00 AM
Data Scientist
Onsite
Data Manipulation (SQL/Python)
5
0
Question

You work on a rideshare / ride-hailing product that connects drivers and riders, with a focus on airport pickups. Using SQL, answer the questions below.

Assume all timestamps are stored in UTC and the analysis window is the last 28 days ending at :as_of_ts (inclusive) — older variants of this question use a rolling last 30 days, so use whichever window your interviewer specifies. State any additional assumptions you make.

Tables

drivers

  • driver_id BIGINT (PK)
  • home_city_id BIGINT
  • is_active BOOLEAN
  • activated_at TIMESTAMPTZ

riders

  • rider_id BIGINT (PK)
  • is_active BOOLEAN
  • is_fraud BOOLEAN
  • first_trip_at TIMESTAMPTZ

airports

  • airport_id BIGINT (PK)
  • airport_code TEXT — e.g., 'SFO'
  • country_code TEXT — e.g., 'US'

trips (one row per trip request)

  • trip_id BIGINT (PK)
  • driver_id BIGINT (nullable until dispatch)
  • rider_id BIGINT
  • request_at TIMESTAMPTZ
  • airport_id BIGINT (nullable; non-null means airport pickup)
  • status TEXT — one of ('completed','canceled','expired')
  • canceled_by TEXT — nullable; if canceled, one of ('driver','rider','system')
  • canceled_at TIMESTAMPTZ (nullable)

Q1: Top drivers by airport completions

Return the top 10 drivers by number of completed airport-pickup trips in the window, restricted to:

  • airports in the US ( airports.country_code = 'US' )
  • active drivers only ( drivers.is_active = TRUE )
  • exclude fraudulent riders (rows where riders.is_fraud = TRUE )

Output columns:

  • driver_id
  • completed_trips
  • total_airport_requests — count of all airport trip requests, regardless of status
  • completion_rate = completed_trips / total_airport_requests

Tie-breakers (in order): higher completed_trips, then higher completion_rate, then lower driver_id.

Q2: Cancellation rate by airport and canceler

For the same window and US-airport restriction, compute cancellation rates by airport, split by who canceled. For each airport_code, return:

  • total_requests
  • driver_cancel_rate = driver-canceled trips / total_requests
  • rider_cancel_rate = rider-canceled trips / total_requests
  • system_cancel_rate = system-canceled trips / total_requests
  • overall_cancel_rate = all canceled trips / total_requests

Include only airports with total_requests >= 100.

Q3: Top riders by cancellations

Return the top 10 riders with the highest number of canceled airport-pickup trips in the window (US airports).

Output columns:

  • rider_id
  • canceled_trips

Note: Many candidates make mistakes around filtering joined tables (conditions placed in WHERE vs ON). Write queries that handle nulls and outer joins correctly, and that preserve airports/drivers with requests even when some joined attributes are missing. Be explicit about which filters belong in ON versus WHERE.

Solution

Show

Submit Your Answer to Earn 20XP

Sign in to leave a comment

Loading comments...

Browse More Questions

More Data Manipulation (SQL/Python)•More PayPal•More Data Scientist•PayPal Data Scientist•PayPal Data Manipulation (SQL/Python)•Data Scientist Data Manipulation (SQL/Python)
PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.