
You are given the following BigQuery-style schema and tiny samples (assume timestamps are UTC; assume promotions.discount_amount is the applied discount). Use today = 2025-09-01 when interpreting “last 7 days.” Write precise SQL answers for (a)–(f), explaining any windowing or edge-case assumptions.
customers(customer_id INT, signup_date DATE, city STRING) Sample: customer_id | signup_date | city 1 | 2025-07-10 | SF 2 | 2025-08-01 | SF 3 | 2025-08-20 | NYC
merchants(merchant_id INT, name STRING, city STRING, cuisine STRING) Sample: merchant_id | name | city | cuisine 10 | SushiCo | SF | Japanese 11 | TacoTown | SF | Mexican 12 | PizzaBox | NYC | Italian
dashers(dasher_id INT, signup_date DATE, vehicle_type STRING) Sample: dasher_id | signup_date | vehicle_type 100 | 2025-06-01 | car 101 | 2025-08-15 | bike
promotions(promo_id INT, promo_type STRING, discount_amount NUMERIC, start_date DATE, end_date DATE) Sample: promo_id | promo_type | discount_amount | start_date | end_date 900 | pct10 | 10.00 | 2025-08-01 | 2025-08-31 901 | flat5 | 5.00 | 2025-08-10 | 2025-09-10
orders(order_id INT, customer_id INT, merchant_id INT, dasher_id INT, created_at TIMESTAMP, accepted_at TIMESTAMP, picked_up_at TIMESTAMP, delivered_at TIMESTAMP, canceled_at TIMESTAMP, subtotal NUMERIC, delivery_fee NUMERIC, tip_amount NUMERIC, promo_id INT, status STRING) Sample: order_id | customer_id | merchant_id | dasher_id | created_at | accepted_at | picked_up_at | delivered_at | canceled_at | subtotal | delivery_fee | tip_amount | promo_id | status 5001 | 1 | 10 | 100 | 2025-08-28 12:00 | 2025-08-28 12:02 | 2025-08-28 12:20 | 2025-08-28 12:35 | null | 30.00 | 3.99 | 5.00 | 900 | delivered 5002 | 2 | 11 | null| 2025-08-29 18:05 | null | null | null | 2025-08-29 18:10 | 15.00 | 2.99 | 0.00 | null | canceled 5003 | 1 | 11 | 101 | 2025-08-31 19:00 | 2025-08-31 19:04 | 2025-08-31 19:20 | null | null | 20.00 | 2.99 | 3.00 | 901 | in_progress 5004 | 3 | 12 | 100 | 2025-08-24 11:30 | 2025-08-24 11:31 | null | 2025-08-24 12:10 | null | 22.00 | 4.99 | 0.00 | null | delivered
order_events(order_id INT, event_time TIMESTAMP, event_type STRING) Sample: order_id | event_time | event_type 5001 | 2025-08-28 12:00 | created 5001 | 2025-08-28 12:02 | accepted 5001 | 2025-08-28 12:20 | picked_up 5001 | 2025-08-28 12:35 | delivered 5002 | 2025-08-29 18:05 | created 5002 | 2025-08-29 18:10 | canceled 5003 | 2025-08-31 19:00 | created 5003 | 2025-08-31 19:04 | accepted 5004 | 2025-08-24 11:30 | created 5004 | 2025-08-24 11:31 | accepted 5004 | 2025-08-24 12:10 | delivered
Tasks: (a) Compute daily on-time delivery rate per city for the last 7 days (2025-08-26 to 2025-09-01). Define on-time as delivered_at <= created_at + INTERVAL 45 MINUTE for non-canceled orders; treat null delivered_at as late. Show city, date, on_time_deliveries, total_eligible, on_time_rate. (b) For August 2025, list top 3 merchants per city by GMV where GMV = subtotal + delivery_fee + tip_amount − COALESCE(promotions.discount_amount,0), excluding canceled orders. Break ties by fewer cancellations, then by merchant_id. Output city, merchant_id, rank_in_city, gmv, cancellations. (c) Using order_events, find orders that have delivered event without a preceding picked_up event (strictly earlier). Return order_id and a minimal event timeline proving the anomaly. (d) Compute weekly dasher acceptance rate: among orders created in a given ISO week and assigned to a dasher, the share with accepted_at within 3 minutes of created_at. Exclude orders canceled before accepted_at. Output dasher_id, week_start_date, assigned_orders, accepted_within_3m, acceptance_rate. (e) Find customers who placed exactly two lifetime orders and both were canceled. Return customer_id and the two canceled_at timestamps in ascending order. (f) For each merchant in August 2025, estimate uplift in average order value (AOV) when a promo is applied vs not applied. Output merchant_id, promo_orders, non_promo_orders, aov_promo, aov_nonpromo, diff. Then describe (briefly) how you would compute a 95% CI for diff via a SQL-based bootstrap or a Python snippet using the same tables.