Write monthly customer and sales SQL queries
Company: TikTok
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: easy
Interview Round: Technical Screen
You are analyzing a food-delivery marketplace.
## Tables
Assume the following schema (you may add minor helper CTEs as needed):
### `orders`
- `order_id` (BIGINT, PK)
- `customer_id` (BIGINT)
- `restaurant_id` (BIGINT)
- `order_ts` (TIMESTAMP) — time the order was placed
- `order_amount` (NUMERIC) — total GMV for the order
### `customers`
- `customer_id` (BIGINT, PK)
### `restaurants`
- `restaurant_id` (BIGINT, PK)
## Conventions / Definitions
- “Month” means calendar month based on `order_ts` (assume UTC unless otherwise stated).
- “Monthly order count” for a customer is the number of orders they placed in that month.
- A “high-frequency customer” in a given month is a customer with **monthly order count > 30**.
---
## Q1) Percentage of high-frequency customers each month
For each month, compute the percentage of distinct customers who are high-frequency customers.
**Output:**
- `month` (DATE or TIMESTAMP truncated to month)
- `pct_high_frequency_customers`
---
## Q2) Most-ordering customer each month excluding high-frequency customers
For each month, find the customer(s) with the **highest** monthly order count among customers with **monthly order count ≤ 30**.
**Output:**
- `month`
- `customer_id`
- `monthly_order_count`
**Follow-up:** Identify the single customer (or customers, if tied) with the most total orders **across all months** (no exclusion).
---
## Q3) Month-over-month sales change for a restaurant in 2021
Given a specific `restaurant_id`, compute monthly total sales in 2021 and the month-over-month (MoM) change, excluding the first month in the series (i.e., only months where a prior month exists).
**Output:**
- `year_month`
- `restaurant_id`
- `monthly_sales`
- `mom_sales_change` (current month sales − previous month sales)
**Follow-up:** How would you change the query to return the MoM change for **all restaurants**?
---
## Q4) Percentage of customers ordering from bottom-quartile restaurants
For each month, rank restaurants by **that month’s total sales** and label restaurants into quartiles (4 buckets) within the month.
Define “bottom quartile restaurants” as the **lowest 25%** by monthly sales for that month.
Compute, for each month, the percentage of distinct customers who placed **at least one order** from a bottom-quartile restaurant.
**Output:**
- `month`
- `pct_customers_ordered_bottom_quartile`
**Notes:**
- Clarify and handle ties consistently (e.g., using `NTILE(4)` over monthly restaurant sales).
- A customer should be counted at most once per month in numerator/denominator.
Quick Answer: This question evaluates SQL data-manipulation and analytical skills, including temporal aggregation, grouping and counting, percent calculations, ranking/ntile-based quartile assignment, and windowed month-over-month comparisons applied to a marketplace orders dataset.