How do I practice coding and algorithm questions?

Use PracHub's coding console to write, test, and debug your solutions in Python or JavaScript. View hints, test against sample inputs, and compare with official solutions.

What difficulty level is this coding question?

This is a easy difficulty Coding & Algorithms question, commonly asked during Technical Screen rounds at Uber.

What role is this question designed for?

This question is commonly asked for Machine Learning Engineer candidates at Uber during technical interviews.

Analyze sales with groupby | Uber Coding Question

Quick Overview

This question evaluates proficiency in data manipulation and analytics using pandas, specifically the ability to join relational tables and apply groupby aggregations to compute business metrics like total sales, average sales per order, and completion rates.

Analyze sales with groupby

Company: Uber

Role: Machine Learning Engineer

Category: Coding & Algorithms

Difficulty: easy

Interview Round: Technical Screen

You are given three pandas DataFrames representing a food delivery marketplace: - `orders(order_id, product_id, quantity, unit_price, status)` - `products(product_id, category_id)` - `categories(category_id, category_name)` Write code to join the tables and compute category-level business metrics using `groupby` and aggregation functions. For each `category_name`, return: 1. `total_sales`: the sum of `quantity * unit_price` 2. `average_sales_per_order`: the mean sales amount per order line 3. `completion_rate`: the fraction of rows where `status == "completed"` The result should be a DataFrame with one row per category.

Quick Answer: This question evaluates proficiency in data manipulation and analytics using pandas, specifically the ability to join relational tables and apply groupby aggregations to compute business metrics like total sales, average sales per order, and completion rates.

You are given data equivalent to three pandas DataFrames representing a food delivery marketplace. For automated testing, each table is passed as a list of dictionaries with the same columns: - orders(order_id, product_id, quantity, unit_price, status) - products(product_id, category_id) - categories(category_id, category_name) Join the tables using inner-join semantics: - orders.product_id = products.product_id - products.category_id = categories.category_id For each category_name that has at least one matched order row, compute: 1. total_sales: the sum of quantity * unit_price 2. average_sales_per_order: the mean sales amount per order row 3. completion_rate: the fraction of rows where status == 'completed' Return the final result as a list of dictionaries sorted by category_name in ascending order. This list of records is the row-wise equivalent of the requested DataFrame. Important: if the same order_id appears multiple times, treat each row as a separate order line when computing averages and completion rate.

Constraints

0 <= len(orders), len(products), len(categories) <= 100000
quantity is a non-negative integer and unit_price is a non-negative number
Use inner-join semantics: unmatched orders or products are ignored
Only categories with at least one matched order row should appear in the output

Examples

Input: ([{'order_id': 1, 'product_id': 101, 'quantity': 2, 'unit_price': 10.0, 'status': 'completed'}, {'order_id': 2, 'product_id': 102, 'quantity': 1, 'unit_price': 20.0, 'status': 'pending'}, {'order_id': 3, 'product_id': 103, 'quantity': 3, 'unit_price': 5.0, 'status': 'completed'}, {'order_id': 4, 'product_id': 101, 'quantity': 1, 'unit_price': 10.0, 'status': 'cancelled'}], [{'product_id': 101, 'category_id': 1}, {'product_id': 102, 'category_id': 2}, {'product_id': 103, 'category_id': 1}], [{'category_id': 1, 'category_name': 'Fast Food'}, {'category_id': 2, 'category_name': 'Drinks'}])

Expected Output: [{'category_name': 'Drinks', 'total_sales': 20.0, 'average_sales_per_order': 20.0, 'completion_rate': 0.0}, {'category_name': 'Fast Food', 'total_sales': 45.0, 'average_sales_per_order': 15.0, 'completion_rate': 0.6666666666666666}]

Explanation: Fast Food has order-line sales of 20, 15, and 10, so total_sales = 45 and average_sales_per_order = 15. Two of its three rows are completed, so completion_rate = 2/3. Drinks has one non-completed row with sales 20.

Input: ([], [{'product_id': 1, 'category_id': 10}], [{'category_id': 10, 'category_name': 'Pizza'}])

Expected Output: []

Explanation: There are no order rows to aggregate, so the result is empty.

Input: ([{'order_id': 10, 'product_id': 201, 'quantity': 1, 'unit_price': 3.5, 'status': 'completed'}, {'order_id': 11, 'product_id': 202, 'quantity': 2, 'unit_price': 4.0, 'status': 'completed'}, {'order_id': 12, 'product_id': 203, 'quantity': 1, 'unit_price': 10.0, 'status': 'failed'}], [{'product_id': 201, 'category_id': 1}, {'product_id': 202, 'category_id': 1}, {'product_id': 203, 'category_id': 2}], [{'category_id': 1, 'category_name': 'Desserts'}, {'category_id': 2, 'category_name': 'Sushi'}])

Expected Output: [{'category_name': 'Desserts', 'total_sales': 11.5, 'average_sales_per_order': 5.75, 'completion_rate': 1.0}, {'category_name': 'Sushi', 'total_sales': 10.0, 'average_sales_per_order': 10.0, 'completion_rate': 0.0}]

Explanation: Desserts contains two completed rows with sales 3.5 and 8.0, so total is 11.5 and the average is 5.75. Sushi has one failed row with sales 10.0, so its completion rate is 0.0.

Input: ([{'order_id': 1, 'product_id': 301, 'quantity': 1, 'unit_price': 12.0, 'status': 'completed'}, {'order_id': 1, 'product_id': 302, 'quantity': 2, 'unit_price': 3.0, 'status': 'completed'}, {'order_id': 2, 'product_id': 302, 'quantity': 1, 'unit_price': 3.0, 'status': 'pending'}], [{'product_id': 301, 'category_id': 1}, {'product_id': 302, 'category_id': 1}], [{'category_id': 1, 'category_name': 'Snacks'}, {'category_id': 2, 'category_name': 'Unused'}])

Expected Output: [{'category_name': 'Snacks', 'total_sales': 21.0, 'average_sales_per_order': 7.0, 'completion_rate': 0.6666666666666666}]

Explanation: All matched rows belong to Snacks. Even though order_id 1 appears twice, both lines are counted separately. Sales are 12.0, 6.0, and 3.0, so total is 21.0 and average is 7.0. Two of the three rows are completed.

Hints

Build lookup maps such as product_id -> category_id and category_id -> category_name so you can simulate the joins efficiently.
While scanning the orders, keep per-category aggregates: total sales, number of rows, and number of completed rows.

Quick Overview