PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep

Quick Overview

This question evaluates proficiency in SQL for analytical queries, covering aggregation, time-based filtering, joins between transactions and vendor metadata, and the use of ranking/window functions for top-N calculations.

  • medium
  • Zoox
  • Coding & Algorithms
  • Data Engineer

Write Transaction Analytics SQL Queries

Company: Zoox

Role: Data Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Technical Screen

You are given a credit-card transaction dataset with the following tables. `transactions` | column | type | description | |---|---|---| | `transaction_id` | integer | Unique ID for a transaction. | | `user_id` | integer | Unique ID for a customer. | | `vendor_id` | integer | Unique ID for a vendor. | | `transaction_time` | timestamp | Time when the transaction was recorded. | | `transaction_dollars` | numeric | Dollar amount of the transaction. | | `transaction_type` | text | Transaction type, such as `PURCHASE` or `REFUND`. | | `refund_transaction_id` | integer | For a `REFUND`, the `transaction_id` of the original `PURCHASE`. | `vendors` | column | type | description | |---|---|---| | `vendor_id` | integer | Unique ID for a vendor. | | `city` | text | City where the vendor is located. | | `state_province` | text | State or province where the vendor is located. | | `country` | text | Two-letter country code. | Assume that only `PURCHASE` rows count as charged dollars. Write SQL queries for the following tasks: 1. Find the top 3 vendors by total purchase dollars charged in the last 2 years. Return `vendor_id` and `total_dollars_charged`, sorted from highest to lowest total. 2. For each `(state_province, country)` location, find the top 3 vendors by total purchase dollars charged in the last 2 years. Return `state_province`, `country`, `vendor_id`, `total_dollars`, and `vendor_rank`. Use dense ranking so ties receive the same rank.

Quick Answer: This question evaluates proficiency in SQL for analytical queries, covering aggregation, time-based filtering, joins between transactions and vendor metadata, and the use of ranking/window functions for top-N calculations.

Part 1: Top 3 Vendors by Purchase Dollars in the Last 2 Years

You are given transaction rows in a simplified in-memory format. Each transaction is a 4-element list: [vendor_id, transaction_time, transaction_dollars, is_purchase]. - vendor_id: integer vendor identifier - transaction_time: integer day number - transaction_dollars: non-negative integer dollar amount - is_purchase: 1 if this row is a PURCHASE, 0 otherwise (for example, a REFUND) Only PURCHASE rows count as charged dollars. A transaction is considered to be in the last 2 years if it falls in the inclusive window [current_day - 730, current_day]. Return the top 3 vendors by total purchase dollars charged in that window. Each output row should be [vendor_id, total_dollars_charged]. Sort by total_dollars_charged descending. If two vendors have the same total, sort by vendor_id ascending.

Constraints

  • 0 <= len(transactions) <= 200000
  • Each transaction row has exactly 4 integers
  • 0 <= transaction_time, current_day <= 10^9
  • 0 <= transaction_dollars <= 10^9
  • is_purchase is either 0 or 1
  • Use the inclusive 730-day window: current_day - 730 <= transaction_time <= current_day

Examples

Input: ([[10, 900, 100, 1], [20, 850, 200, 1], [10, 880, 50, 1], [30, 100, 999, 1], [20, 920, 30, 0], [40, 950, 120, 1], [50, 960, 110, 1]], 1000)

Expected Output: [[20, 200], [10, 150], [40, 120]]

Explanation: Only purchase rows inside days 270..1000 count. Vendor 20 totals 200, vendor 10 totals 150, vendor 40 totals 120, vendor 50 totals 110.

Input: ([[1, 500, 50, 1], [2, 600, 50, 1], [3, 700, 40, 1], [4, 800, 60, 1]], 1000)

Expected Output: [[4, 60], [1, 50], [2, 50]]

Explanation: Vendors 1 and 2 tie at 50, so the smaller vendor_id comes first.

Input: ([], 1000)

Expected Output: []

Explanation: Edge case: no transactions.

Input: ([[7, 270, 30, 1], [7, 1000, 20, 1], [8, 269, 100, 1]], 1000)

Expected Output: [[7, 50]]

Explanation: Day 270 is included in the window, but day 269 is not.

Hints

  1. Use a hash map keyed by vendor_id to accumulate totals for qualifying purchase rows.
  2. After aggregation, sort vendors by descending total and ascending vendor_id, then take the first 3.

Part 2: Top 3 Vendors by Purchase Dollars for Each Location

You are given transaction rows and vendor locations. Each transaction is a 4-element list: [vendor_id, transaction_time, transaction_dollars, is_purchase]. Vendor locations are given by three parallel arrays: - vendor_ids[i] - states[i] - countries[i] Together, these mean vendor_ids[i] is located in (states[i], countries[i]). Only PURCHASE rows count as charged dollars. A transaction is in the last 2 years if it falls in the inclusive window [current_day - 730, current_day]. For each (state_province, country) location, compute each vendor's total purchase dollars in that window, then assign a dense rank within that location by total dollars descending. Return every vendor whose dense rank is 1, 2, or 3 in its location. Each output row should be: [state_province, country, vendor_id, total_dollars, vendor_rank] Rules: - Dense ranking means equal totals receive the same rank. - If rank 3 is tied, include all vendors with rank 3. - Ignore transactions whose vendor_id does not appear in vendor_ids. - Sort the final result by state_province ascending, then country ascending, then vendor_rank ascending, then vendor_id ascending.

Constraints

  • 0 <= len(transactions) <= 200000
  • 0 <= len(vendor_ids) == len(states) == len(countries) <= 100000
  • vendor_ids are unique
  • Each transaction row has exactly 4 integers
  • 0 <= transaction_time, current_day <= 10^9
  • 0 <= transaction_dollars <= 10^9
  • is_purchase is either 0 or 1
  • Use the inclusive 730-day window: current_day - 730 <= transaction_time <= current_day

Examples

Input: ([[10, 900, 100, 1], [10, 920, 50, 1], [11, 910, 200, 1], [12, 930, 200, 1], [13, 940, 80, 1], [14, 950, 60, 1], [15, 960, 70, 1], [11, 970, 20, 0]], [10, 11, 12, 13, 14, 15], ['CA', 'CA', 'CA', 'NY', 'NY', 'ON'], ['US', 'US', 'US', 'US', 'US', 'CA'], 1000)

Expected Output: [['CA', 'US', 11, 200, 1], ['CA', 'US', 12, 200, 1], ['CA', 'US', 10, 150, 2], ['NY', 'US', 13, 80, 1], ['NY', 'US', 14, 60, 2], ['ON', 'CA', 15, 70, 1]]

Explanation: In CA, US, vendors 11 and 12 tie for rank 1 with 200 each, and vendor 10 is rank 2 with 150.

Input: ([[1, 900, 100, 1], [2, 905, 90, 1], [3, 910, 90, 1], [4, 915, 80, 1], [5, 920, 70, 1]], [1, 2, 3, 4, 5], ['TX', 'TX', 'TX', 'TX', 'TX'], ['US', 'US', 'US', 'US', 'US'], 1000)

Expected Output: [['TX', 'US', 1, 100, 1], ['TX', 'US', 2, 90, 2], ['TX', 'US', 3, 90, 2], ['TX', 'US', 4, 80, 3]]

Explanation: Dense ranks are 1, 2, 2, 3, 4, so the vendor with rank 4 is excluded.

Input: ([], [1, 2], ['CA', 'NY'], ['US', 'US'], 1000)

Expected Output: []

Explanation: Edge case: no transactions.

Input: ([[21, 270, 50, 1], [22, 1000, 60, 1], [24, 900, 100, 1]], [21, 22], ['BC', 'BC'], ['CA', 'US'], 1000)

Expected Output: [['BC', 'CA', 21, 50, 1], ['BC', 'US', 22, 60, 1]]

Explanation: Day 270 is included. Vendor 24 is ignored because its location is unknown.

Hints

  1. First build a lookup from vendor_id to its (state_province, country).
  2. Aggregate totals per vendor, then group vendors by location and assign dense ranks after sorting each group by total descending.
Last updated: May 23, 2026

Loading coding console...

PracHub

Master your tech interviews with 8,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Compute Roller Coaster Scores - Zoox (medium)
  • Write SQL revenue and anomaly queries - Zoox (medium)
  • Can a Car Meet a Truck? - Zoox (hard)
  • Choose fastest way to transfer 2 TB - Zoox (easy)