PracHub
QuestionsPremiumLearningGuidesCheatsheetNEWCoaches

Quick Overview

This question evaluates proficiency in data manipulation and analytics using pandas, specifically the ability to join relational tables and apply groupby aggregations to compute business metrics like total sales, average sales per order, and completion rates.

  • easy
  • Uber
  • Coding & Algorithms
  • Machine Learning Engineer

Analyze sales with groupby

Company: Uber

Role: Machine Learning Engineer

Category: Coding & Algorithms

Difficulty: easy

Interview Round: Technical Screen

You are given three pandas DataFrames representing a food delivery marketplace: - `orders(order_id, product_id, quantity, unit_price, status)` - `products(product_id, category_id)` - `categories(category_id, category_name)` Write code to join the tables and compute category-level business metrics using `groupby` and aggregation functions. For each `category_name`, return: 1. `total_sales`: the sum of `quantity * unit_price` 2. `average_sales_per_order`: the mean sales amount per order line 3. `completion_rate`: the fraction of rows where `status == "completed"` The result should be a DataFrame with one row per category.

Quick Answer: This question evaluates proficiency in data manipulation and analytics using pandas, specifically the ability to join relational tables and apply groupby aggregations to compute business metrics like total sales, average sales per order, and completion rates.

You are given data equivalent to three pandas DataFrames representing a food delivery marketplace. For automated testing, each table is passed as a list of dictionaries with the same columns: - orders(order_id, product_id, quantity, unit_price, status) - products(product_id, category_id) - categories(category_id, category_name) Join the tables using inner-join semantics: - orders.product_id = products.product_id - products.category_id = categories.category_id For each category_name that has at least one matched order row, compute: 1. total_sales: the sum of quantity * unit_price 2. average_sales_per_order: the mean sales amount per order row 3. completion_rate: the fraction of rows where status == 'completed' Return the final result as a list of dictionaries sorted by category_name in ascending order. This list of records is the row-wise equivalent of the requested DataFrame. Important: if the same order_id appears multiple times, treat each row as a separate order line when computing averages and completion rate.

Constraints

  • 0 <= len(orders), len(products), len(categories) <= 100000
  • quantity is a non-negative integer and unit_price is a non-negative number
  • Use inner-join semantics: unmatched orders or products are ignored
  • Only categories with at least one matched order row should appear in the output

Examples

Input: ([{'order_id': 1, 'product_id': 101, 'quantity': 2, 'unit_price': 10.0, 'status': 'completed'}, {'order_id': 2, 'product_id': 102, 'quantity': 1, 'unit_price': 20.0, 'status': 'pending'}, {'order_id': 3, 'product_id': 103, 'quantity': 3, 'unit_price': 5.0, 'status': 'completed'}, {'order_id': 4, 'product_id': 101, 'quantity': 1, 'unit_price': 10.0, 'status': 'cancelled'}], [{'product_id': 101, 'category_id': 1}, {'product_id': 102, 'category_id': 2}, {'product_id': 103, 'category_id': 1}], [{'category_id': 1, 'category_name': 'Fast Food'}, {'category_id': 2, 'category_name': 'Drinks'}])

Expected Output: [{'category_name': 'Drinks', 'total_sales': 20.0, 'average_sales_per_order': 20.0, 'completion_rate': 0.0}, {'category_name': 'Fast Food', 'total_sales': 45.0, 'average_sales_per_order': 15.0, 'completion_rate': 0.6666666666666666}]

Explanation: Fast Food has order-line sales of 20, 15, and 10, so total_sales = 45 and average_sales_per_order = 15. Two of its three rows are completed, so completion_rate = 2/3. Drinks has one non-completed row with sales 20.

Input: ([], [{'product_id': 1, 'category_id': 10}], [{'category_id': 10, 'category_name': 'Pizza'}])

Expected Output: []

Explanation: There are no order rows to aggregate, so the result is empty.

Input: ([{'order_id': 10, 'product_id': 201, 'quantity': 1, 'unit_price': 3.5, 'status': 'completed'}, {'order_id': 11, 'product_id': 202, 'quantity': 2, 'unit_price': 4.0, 'status': 'completed'}, {'order_id': 12, 'product_id': 203, 'quantity': 1, 'unit_price': 10.0, 'status': 'failed'}], [{'product_id': 201, 'category_id': 1}, {'product_id': 202, 'category_id': 1}, {'product_id': 203, 'category_id': 2}], [{'category_id': 1, 'category_name': 'Desserts'}, {'category_id': 2, 'category_name': 'Sushi'}])

Expected Output: [{'category_name': 'Desserts', 'total_sales': 11.5, 'average_sales_per_order': 5.75, 'completion_rate': 1.0}, {'category_name': 'Sushi', 'total_sales': 10.0, 'average_sales_per_order': 10.0, 'completion_rate': 0.0}]

Explanation: Desserts contains two completed rows with sales 3.5 and 8.0, so total is 11.5 and the average is 5.75. Sushi has one failed row with sales 10.0, so its completion rate is 0.0.

Input: ([{'order_id': 1, 'product_id': 301, 'quantity': 1, 'unit_price': 12.0, 'status': 'completed'}, {'order_id': 1, 'product_id': 302, 'quantity': 2, 'unit_price': 3.0, 'status': 'completed'}, {'order_id': 2, 'product_id': 302, 'quantity': 1, 'unit_price': 3.0, 'status': 'pending'}], [{'product_id': 301, 'category_id': 1}, {'product_id': 302, 'category_id': 1}], [{'category_id': 1, 'category_name': 'Snacks'}, {'category_id': 2, 'category_name': 'Unused'}])

Expected Output: [{'category_name': 'Snacks', 'total_sales': 21.0, 'average_sales_per_order': 7.0, 'completion_rate': 0.6666666666666666}]

Explanation: All matched rows belong to Snacks. Even though order_id 1 appears twice, both lines are counted separately. Sales are 12.0, 6.0, and 3.0, so total is 21.0 and average is 7.0. Two of the three rows are completed.

Hints

  1. Build lookup maps such as product_id -> category_id and category_id -> category_name so you can simulate the joins efficiently.
  2. While scanning the orders, keep per-category aggregates: total sales, number of rows, and number of completed rows.
Last updated: Apr 19, 2026

Loading coding console...

PracHub

Master your tech interviews with 7,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Implement Minesweeper and Word Search - Uber (medium)
  • Implement Store Autocomplete - Uber (medium)
  • Implement Cache Eviction And Seat Assignment - Uber (medium)
  • Schedule Non-Overlapping Meetings Efficiently - Uber (hard)
  • Evaluate an Arithmetic Expression - Uber