How do I approach Data Manipulation (SQL/Python) interview questions?

Data Manipulation (SQL/Python) questions require understanding of core concepts and practice. PracHub provides solutions with explanations to help you master data manipulation (sql/python) interviews.

What difficulty level is this interview question?

This is a medium difficulty Data Manipulation (SQL/Python) question, commonly asked during Technical Screen rounds at DoorDash.

What role is this question designed for?

This question is commonly asked for Data Scientist candidates at DoorDash during technical interviews.

Write SQL for monthly spend and ratios | DoorDash Interview Question

Write SQL for monthly spend and ratios

Company: DoorDash

Role: Data Scientist

Category: Data Manipulation (SQL/Python)

Difficulty: medium

Interview Round: Technical Screen

Assume you are working with a food delivery dataset. ### Tables (schemas) **users** - user_id INT PRIMARY KEY - annual_income_usd INT - signup_ts TIMESTAMP **restaurants** - restaurant_id INT PRIMARY KEY - neighborhood_income_usd INT -- e.g., median income of the restaurant’s census tract **orders** - order_id INT PRIMARY KEY - user_id INT REFERENCES users(user_id) - restaurant_id INT REFERENCES restaurants(restaurant_id) - order_ts TIMESTAMP -- UTC - order_amount_usd NUMERIC(10,2) - status STRING -- e.g., 'completed', 'canceled' ### Definitions / assumptions - Only include orders with status = 'completed'. - “Month” means calendar month in UTC: DATE_TRUNC('month', order_ts). - When ties occur for “top spender”, return the smallest user_id. ### Tasks 1) **Monthly spend > $100:** For each user-month, compute total spend. Return rows where the user’s monthly spend is strictly greater than 100. - Output: month, user_id, monthly_spend 2) **Top spender each month among (1):** Considering only the user-months returned by (1), find the top spender for each month. - Output: month, user_id, monthly_spend 3) **Restaurant monthly spend change (MoM):** For each restaurant and month, compute monthly revenue and the month-over-month change (absolute and percent) versus the previous month. - Output: restaurant_id, month, monthly_revenue, prev_month_revenue, mom_change, mom_change_pct - Only output months where a previous month exists for that restaurant. 4) **Share of customers ordering from bottom-income restaurants:** Define “bottom 25% income restaurants” as restaurants whose neighborhood_income_usd is in the bottom quartile across all restaurants (using NTILE/percentile logic). For each month, compute: - numerator = number of distinct users who placed ≥1 completed order in that month at a bottom-quartile restaurant - denominator = number of distinct users who placed ≥1 completed order in that month at any restaurant - ratio = numerator / denominator Output: month, numerator_users, denominator_users, ratio

Quick Answer: This question evaluates SQL skills in time-based aggregation, joins, filtering, window functions, percentile/NTILE segmentation, distinct counts, and calculation of month-over-month revenue changes and user-share ratios.

Assume you are working with a food delivery dataset.

Tables (schemas)

users

user_id INT PRIMARY KEY
annual_income_usd INT
signup_ts TIMESTAMP

restaurants

restaurant_id INT PRIMARY KEY
neighborhood_income_usd INT -- e.g., median income of the restaurant’s census tract

orders

order_id INT PRIMARY KEY
user_id INT REFERENCES users(user_id)
restaurant_id INT REFERENCES restaurants(restaurant_id)
order_ts TIMESTAMP -- UTC
order_amount_usd NUMERIC(10,2)
status STRING -- e.g., 'completed', 'canceled'

Definitions / assumptions

Only include orders with status = 'completed'.
“Month” means calendar month in UTC: DATE_TRUNC('month', order_ts).
When ties occur for “top spender”, return the smallest user_id.

Tasks

Monthly spend > $100: For each user-month, compute total spend. Return rows where the user’s monthly spend is strictly greater than 100.
- Output: month, user_id, monthly_spend
Top spender each month among (1): Considering only the user-months returned by (1), find the top spender for each month.
- Output: month, user_id, monthly_spend
Restaurant monthly spend change (MoM): For each restaurant and month, compute monthly revenue and the month-over-month change (absolute and percent) versus the previous month.
- Output: restaurant_id, month, monthly_revenue, prev_month_revenue, mom_change, mom_change_pct
- Only output months where a previous month exists for that restaurant.
Share of customers ordering from bottom-income restaurants: Define “bottom 25% income restaurants” as restaurants whose neighborhood_income_usd is in the bottom quartile across all restaurants (using NTILE/percentile logic).

For each month, compute:

numerator = number of distinct users who placed ≥1 completed order in that month at a bottom-quartile restaurant
denominator = number of distinct users who placed ≥1 completed order in that month at any restaurant
ratio = numerator / denominator

Output: month, numerator_users, denominator_users, ratio

Write SQL for monthly spend and ratios

Company: DoorDash

Role: Data Scientist

Category: Data Manipulation (SQL/Python)

Difficulty: medium

Interview Round: Technical Screen

Assume you are working with a food delivery dataset.

Tables (schemas)

users

user_id INT PRIMARY KEY
annual_income_usd INT
signup_ts TIMESTAMP

restaurants

restaurant_id INT PRIMARY KEY
neighborhood_income_usd INT -- e.g., median income of the restaurant’s census tract

orders

order_id INT PRIMARY KEY
user_id INT REFERENCES users(user_id)
restaurant_id INT REFERENCES restaurants(restaurant_id)
order_ts TIMESTAMP -- UTC
order_amount_usd NUMERIC(10,2)
status STRING -- e.g., 'completed', 'canceled'

Definitions / assumptions

Only include orders with status = 'completed'.
“Month” means calendar month in UTC: DATE_TRUNC('month', order_ts).
When ties occur for “top spender”, return the smallest user_id.

Tasks

Monthly spend > $100: For each user-month, compute total spend. Return rows where the user’s monthly spend is strictly greater than 100.
- Output: month, user_id, monthly_spend
Top spender each month among (1): Considering only the user-months returned by (1), find the top spender for each month.
- Output: month, user_id, monthly_spend
Restaurant monthly spend change (MoM): For each restaurant and month, compute monthly revenue and the month-over-month change (absolute and percent) versus the previous month.
- Output: restaurant_id, month, monthly_revenue, prev_month_revenue, mom_change, mom_change_pct
- Only output months where a previous month exists for that restaurant.
Share of customers ordering from bottom-income restaurants: Define “bottom 25% income restaurants” as restaurants whose neighborhood_income_usd is in the bottom quartile across all restaurants (using NTILE/percentile logic).

For each month, compute:

numerator = number of distinct users who placed ≥1 completed order in that month at a bottom-quartile restaurant
denominator = number of distinct users who placed ≥1 completed order in that month at any restaurant
ratio = numerator / denominator

Output: month, numerator_users, denominator_users, ratio

Write SQL for monthly spend and ratios

Quick Overview

Tables (schemas)

Definitions / assumptions

Tasks

Comments (0)

Write SQL for monthly spend and ratios

Quick Overview

Tables (schemas)

Definitions / assumptions

Tasks

Comments (0)