Evaluates cumulative aggregation and temporal grouping skills in the Data Manipulation (SQL/Python) domain, emphasizing window functions for year-to-date running totals, date/week bucketing, partitioning by year, deduplicating distinct customers, and considerations for aggregation performance.
You have a transactions table:
transactions
Task (can be solved in SQL or in Python/pandas):
For each calendar week, compute the number of distinct customers whose year-to-date (YTD) cumulative spend is at least $1000 as of the end of that week.
Definitions/assumptions:
Required output:
Order results by year, week_start_date.