Manipulate time-series with Pandas groupby
Company: Amazon
Role: Software Engineer
Category: Data Manipulation (SQL/Python)
Difficulty: Medium
Interview Round: Technical Screen
Given a DataFrame events(user_id, event_type, ts_utc, revenue):
1) Parse ts_utc as timezone-aware, convert to America/Los_Angeles, and handle DST transitions.
2) Compute daily active users (DAU) and a 7-day moving average.
3) For each user and event_type, compute a 7-day rolling count.
4) Produce weekly retention: the number and rate of users active in week w who return in week w+1.
5) Resample to fill missing calendar dates with zeros.
Provide idiomatic, vectorized Pandas code (no explicit Python loops).
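One possible approach to tasks 1, 2, 3 and 5 is sketched below. It is not a reference solution. The sample rows, the helper columns (ts, ts_local, local_date, n) and the choice of min_periods=1 are assumptions made for illustration; only the events schema comes from the prompt. Converting from UTC to America/Los_Angeles is well defined on both sides of a DST change, so the ambiguous and nonexistent wall-clock times that tz_localize has to deal with do not arise here. The sketch buckets by local calendar date, which keeps 23-hour and 25-hour transition days intact.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the prompt.
events = pd.DataFrame({
    "user_id": [1, 2, 1, 1, 3],
    "event_type": ["view", "view", "view", "buy", "view"],
    "ts_utc": [
        "2024-03-09 20:00:00",  # Mar 9, 12:00 PST
        "2024-03-10 07:30:00",  # Mar 9, 23:30 PST (still Mar 9 locally)
        "2024-03-10 10:30:00",  # Mar 10, 03:30 PDT (after spring-forward)
        "2024-03-12 18:00:00",  # Mar 12, 11:00 PDT
        "2024-03-12 19:00:00",  # Mar 12, 12:00 PDT
    ],
    "revenue": [0.0, 0.0, 0.0, 9.99, 0.0],
})

# 1) Parse as UTC-aware, then convert to the local zone.
events["ts"] = pd.to_datetime(events["ts_utc"], utc=True)
events["ts_local"] = events["ts"].dt.tz_convert("America/Los_Angeles")

# Local calendar date as a naive midnight timestamp (drop tz, keep wall time).
events["local_date"] = events["ts_local"].dt.tz_localize(None).dt.normalize()

# 2) DAU: distinct users per local calendar date.
dau = events.groupby("local_date")["user_id"].nunique().rename("dau")

# 5) Fill missing calendar dates with zeros before smoothing,
#    so that empty days pull the moving average down.
dau = dau.asfreq("D", fill_value=0)

# 7-day moving average; min_periods=1 is an assumption that lets the
# first six days report a partial-window average instead of NaN.
dau_ma7 = dau.rolling(window=7, min_periods=1).mean().rename("dau_ma7")

# 3) Per (user_id, event_type) rolling count over the trailing 7 days.
#    A constant column is summed so that nulls in other columns cannot
#    affect the count; the window is 7 x 24 elapsed hours on the UTC index.
ev = events.assign(n=1).sort_values("ts").set_index("ts")
rolling_cnt = (
    ev.groupby(["user_id", "event_type"])["n"]
      .rolling("7D")
      .sum()
      .rename("events_7d")
)

print(pd.concat([dau, dau_ma7], axis=1))
print(rolling_cnt)
```

The time-based window "7D" measures elapsed time rather than calendar days. If an interviewer wants calendar-day windows instead, one alternative is to aggregate to daily counts per user and event type first and then roll over seven rows of a gap-filled daily index.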
Quick Answer: This question evaluates proficiency in time-series data manipulation, including timezone-aware datetime parsing and DST handling, groupby and rolling-window aggregations, resampling to fill calendar gaps, and user-level retention and DAU calculations using idiomatic, vectorized Pandas or SQL operations.
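For the weekly retention task (task 4), a loop-free version can be built from a self-join on de-duplicated (user, week) pairs. The following is a minimal sketch under stated assumptions: weeks start on Monday in America/Los_Angeles local time, the sample rows are invented, and the names active, came_back and retention are illustrative rather than part of the prompt.

```python
import pandas as pd

# Hypothetical sample data spanning three Monday-based local weeks.
events = pd.DataFrame({
    "user_id": [1, 2, 3, 1, 2, 4, 1],
    "ts_utc": [
        "2024-03-04 18:00:00", "2024-03-05 18:00:00", "2024-03-06 18:00:00",
        "2024-03-12 18:00:00", "2024-03-13 18:00:00", "2024-03-14 18:00:00",
        "2024-03-19 18:00:00",
    ],
})

# Local calendar day of each event (naive midnight timestamps).
ts_local = pd.to_datetime(events["ts_utc"], utc=True).dt.tz_convert("America/Los_Angeles")
local_day = ts_local.dt.tz_localize(None).dt.normalize()

# Week key: the Monday that starts the event's local week (assumption).
events["week"] = local_day - pd.to_timedelta(local_day.dt.dayofweek, unit="D")

# One row per user per active week.
active = events[["user_id", "week"]].drop_duplicates()

# Shift each activity row back one week: a row (user, w) in came_back
# means the user was active in week w + 1.
came_back = active.assign(week=active["week"] - pd.Timedelta(weeks=1), returned=1)

# Left join keeps every active (user, week) and flags those who returned.
joined = active.merge(came_back, on=["user_id", "week"], how="left")
joined["returned"] = joined["returned"].fillna(0).astype(int)

retention = joined.groupby("week").agg(
    active_users=("user_id", "nunique"),
    retained_users=("returned", "sum"),
)
retention["retention_rate"] = retention["retained_users"] / retention["active_users"]

print(retention)
```

The final week in any dataset has no following week to compare against, so its retention reads as zero here; in practice that row is often dropped or reported as not yet observable.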