PracHub
QuestionsPremiumLearningGuidesInterview PrepNEWCoaches

Quick Overview

This question evaluates ability to perform time-based aggregations and data manipulation using SQL and pandas, including per-order revenue calculation, same-day de-duplication, rolling-window sums, and intra-customer ranking.

  • Medium
  • Amazon
  • Data Manipulation (SQL/Python)
  • Data Scientist

Design SQL/Pandas aggregations on retail schema

Company: Amazon

Role: Data Scientist

Category: Data Manipulation (SQL/Python)

Difficulty: Medium

Interview Round: Technical Screen

Using the schema and sample data below, answer both parts. Assume today is 2025-09-01. Use standard SQL (e.g., PostgreSQL) and idiomatic pandas without Python for-loops over rows. Schema and sample rows customers +---------+------+------------+ | cust_id | name | signup_date| +---------+------+------------+ | C1 | Ada | 2025-08-28 | | C2 | Ben | 2025-08-29 | | C3 | Cy | 2025-08-25 | +---------+------+------------+ orders +----------+---------+------------+-----------+ | order_id | cust_id | order_date | status | +----------+---------+------------+-----------+ | 101 | C1 | 2025-08-29 | completed | | 102 | C1 | 2025-08-30 | completed | | 107 | C1 | 2025-08-30 | completed | | 103 | C1 | 2025-09-01 | returned | | 104 | C2 | 2025-08-31 | completed | | 105 | C2 | 2025-09-01 | completed | | 106 | C3 | 2025-08-25 | cancelled | +----------+---------+------------+-----------+ order_items +----------+------------+-----+--------+ | order_id | product_id | qty | price | +----------+------------+-----+--------+ | 101 | P1 | 1 | 10.00 | | 101 | P2 | 2 | 5.00 | | 102 | P1 | 1 | 10.00 | | 107 | P2 | 1 | 5.00 | | 104 | P3 | 1 | 20.00 | | 105 | P3 | 2 | 20.00 | +----------+------------+-----+--------+ products +------------+----------+ | product_id | category | +------------+----------+ | P1 | A | | P2 | B | | P3 | A | +------------+----------+ Definitions - Order revenue = sum(qty * price) over items for that order. - Only orders with status = completed count toward revenue; returned or cancelled do not. - If a customer has multiple completed orders on the same calendar day, keep only one order for that day: pick the order with the greater order revenue; if there is a tie on revenue, keep the higher order_id. Part A (SQL) Write a single SQL query that returns, for each customer with at least two completed orders in the last 7 days (inclusive of today = 2025-09-01), one row per kept order (after the same-day de-duplication rule) with the following columns: - cust_id - order_id (after applying the same-day selection rule) - order_date - order_revenue - rolling_3_day_rev: sum of the customer's order_revenue over the window [order_date - 2 days, order_date], considering only completed orders that survived the same-day selection rule - rank_in_7d_by_revenue: dense rank of this kept order's revenue among the customer's kept orders in the last 7 days, highest revenue gets rank 1 Requirements: implement same-day selection without correlated subqueries; use window functions; do not use temporary tables. Part B (pandas) Using pandas DataFrames with the same content, produce a DataFrame with one row per customer for the last 7 days (inclusive, relative to today = 2025-09-01) containing: - cust_id - total_7d_revenue - top_category_7d: the product category with the highest revenue for that customer in the 7-day window (break ties by alphabetical order of category) - top_category_share_7d: the fraction (0–1) of the customer's 7-day revenue attributable to top_category_7d Constraints: avoid Python loops; show how you enforce the same-day order selection rule prior to aggregation.

Quick Answer: This question evaluates ability to perform time-based aggregations and data manipulation using SQL and pandas, including per-order revenue calculation, same-day de-duplication, rolling-window sums, and intra-customer ranking.

Last updated: Mar 29, 2026

Loading coding console...

PracHub

Master your tech interviews with 7,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Find recommended friend pairs by shared songs - Amazon (medium)
  • Find recommended friend pairs by shared listening - Amazon (easy)
  • Write SQL window functions for D7 retention - Amazon (medium)
  • Find daily first-order merchants with SQL - Amazon (Medium)
  • Design student–course data models and SQL - Amazon (Medium)