Use PostgreSQL. Assume today = 2025-09-01. You must use CTEs and multiple window functions. Schema and tiny samples are below.
Schema:
-
exposures(unit_id INT, dt DATE, condition_label INT) -- 1=treatment, 0=control
-
orders(order_id INT, unit_id INT, dt DATE, courier_type TEXT, temp_category TEXT, subtotal_cents INT)
Sample rows:
exposures
+---------+------------+----------------+
| unit_id | dt | condition_label|
+---------+------------+----------------+
| 101 | 2025-08-30 | 1 |
| 101 | 2025-08-31 | 1 |
| 101 | 2025-09-01 | 0 |
| 102 | 2025-08-30 | 0 |
| 102 | 2025-08-31 | 1 |
| 102 | 2025-09-01 | 1 |
| 103 | 2025-08-31 | 0 |
| 103 | 2025-09-01 | 0 |
+---------+------------+----------------+
orders
+----------+---------+------------+--------------+---------------+----------------+
| order_id | unit_id | dt | courier_type | temp_category | subtotal_cents |
+----------+---------+------------+--------------+---------------+----------------+
| 1 | 101 | 2025-08-30 | biker | cold | 1800 |
| 2 | 101 | 2025-08-30 | biker | hot | 2400 |
| 3 | 101 | 2025-08-31 | car | cold | 2200 |
| 4 | 101 | 2025-09-01 | biker | cold | 2600 |
| 5 | 102 | 2025-08-30 | biker | cold | 2100 |
| 6 | 102 | 2025-08-31 | biker | hot | 1500 |
| 7 | 102 | 2025-09-01 | car | cold | 2000 |
| 8 | 103 | 2025-08-31 | biker | cold | 900 |
| 9 | 103 | 2025-09-01 | car | hot | 3000 |
| 10 | 103 | 2025-08-28 | biker | cold | 1700 |
+----------+---------+------------+--------------+---------------+----------------+
Tasks:
-
Non-aggregated condition percent (simple condition for numerator): Define is_high_value := (subtotal_cents >= 2000). For the 7-day window [2025-08-26, 2025-09-01], compute the percentage of orders that satisfy (courier_type='biker' AND temp_category='cold' AND is_high_value). Produce two variants of the denominator: V1 includes only orders whose unit_id had condition_label=1 on that same dt; V2 includes orders whose unit_id had condition_label in {0,1} on that dt (treat missing exposure rows as condition_label=0). Output both percentages with clearly labeled columns.
-
Aggregation-required condition percent: For each unit_id×dt in the same 7-day window, define is_power_day := (count of orders in the last 7 days inclusive for that unit with courier_type='biker' AND temp_category='cold' >= 2) AND (average subtotal_cents on those qualifying orders in the same 7-day window >= 2000). Compute the percentage of unit-days where is_power_day=TRUE under the same two denominator conventions as in (1): DV1 counts only unit-days with condition_label=1; DV2 counts unit-days with condition_label in {0,1}. Return a single row with DV1_percent and DV2_percent.
-
Multiple window frames and change: Build a unit-day series from the orders table (days with 0 orders must appear). For each unit_id and dt in [2025-08-26, 2025-09-01], compute: (a) orders_day := total orders that day; (b) cum_orders := SUM(orders_day) OVER (PARTITION BY unit_id ORDER BY dt ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW); (c) rolling7 := SUM(orders_day) OVER (PARTITION BY unit_id ORDER BY dt ROWS BETWEEN 6 PRECEDING AND CURRENT ROW); (d) pct_change_do_d := (orders_day - LAG(orders_day) OVER w1) / NULLIF(LAG(orders_day) OVER w1,0), where w1=(PARTITION BY unit_id ORDER BY dt); (e) non-overlapping_7day_change := (rolling7 - LAG(rolling7) OVER w7) / NULLIF(LAG(rolling7) OVER w7,0), where w7 uses ORDER BY dt with a frame ROWS BETWEEN 7 PRECEDING AND 1 PRECEDING to represent the prior 7 days. Return one row per unit_id×dt with these columns.
-
Within-unit state changes vs partition: For each unit_id, define had_high_value_biker_cold_today := (exists an order on dt with courier_type='biker' AND temp_category='cold' AND subtotal_cents>=2000). For each change point where this boolean flips (0→1 or 1→0), output unit_id, dt_of_change, direction, delta_rolling7 := rolling7_after - rolling7_before (rolling7 defined on orders_day), and the percentile_rank of delta_rolling7 among all change deltas across units in the window using PERCENT_RANK() OVER (ORDER BY delta_rolling7).