You’re analyzing Home Improvement retail transactions to understand sales of Mulch during promotions. After joining multiple tables, your final result has far fewer rows than expected.
Assume the following schemas (all dates are in the same timezone; DATE has no time component):
salessale_id
(BIGINT, PK)
sale_date
(DATE)
store_id
(INT)
customer_id
(BIGINT)
sale_itemssale_id
(BIGINT, FK →
sales.sale_id
)
line_id
(INT) — unique within a sale
product_id
(BIGINT)
qty
(INT)
unit_price
(DECIMAL(10,2))
productsproduct_id
(BIGINT, PK)
category
(VARCHAR)
sub_category
(VARCHAR)
promotionspromo_id
(BIGINT, PK)
product_id
(BIGINT)
start_date
(DATE)
end_date
(DATE)
discount_pct
(DECIMAL(5,2))
One row per day for the last 30 days:
sale_date
mulch_units
= total units of mulch sold (promo + non-promo)
mulch_revenue
= total revenue from mulch
promo_mulch_units
= mulch units sold while an applicable promo was active
SELECT
s.sale_date,
SUM(i.qty) AS mulch_units,
SUM(i.qty * i.unit_price) AS mulch_revenue,
SUM(CASE WHEN p.promo_id IS NOT NULL THEN i.qty ELSE 0 END) AS promo_mulch_units
FROM sales s
JOIN sale_items i
ON s.sale_id = i.sale_id
JOIN products pr
ON pr.product_id = i.product_id
LEFT JOIN promotions p
ON p.product_id = i.product_id
AND s.sale_date BETWEEN p.start_date AND p.end_date
WHERE pr.category = 'Mulch'
AND p.discount_pct > 0
AND s.sale_date >= CURRENT_DATE - INTERVAL '30' DAY
GROUP BY 1;