Use the schema and sample data below to answer SQL and pandas tasks. Treat 'today' as 2025-09-01.
Schema users(user_id INT, country STRING) pins(pin_id INT, creator_user_id INT, is_shopping_enabled BOOLEAN, created_at DATE) events(event_id INT, user_id INT, pin_id INT, event_type STRING, event_ts TIMESTAMP, stay_time_sec INT, feature STRING, product_category_code STRING) product_categories(code STRING, name STRING)
Sample tables (small) users user_id | country 1 | US 2 | US 3 | CA
pins pin_id | creator_user_id | is_shopping_enabled | created_at 10 | 3 | true | 2025-08-20 11 | 2 | false | 2025-08-22 12 | 1 | true | 2025-08-25
events event_id | user_id | pin_id | event_type | event_ts | stay_time_sec | feature | product_category_code 101 | 1 | 10 | shopping_click | 2025-08-30 10:00:00 | 35 | shopping_module | A 102 | 1 | 10 | shopping_click | 2025-09-01 09:00:00 | 50 | shopping_module | null 103 | 2 | 12 | shopping_click | 2025-08-31 12:00:00 | null | shopping_module | H 104 | 2 | 11 | view | 2025-09-01 08:00:00 | null | feed | null 105 | 3 | 10 | shopping_click | 2025-08-28 07:00:00 | 15 | shopping_module | A 106 | 3 | 10 | shopping_click | 2025-09-01 11:59:00 | -5 | shopping_module | ?
product_categories code | name A | Apparel H | Home ? | Unknown
Tasks SQL
Python (pandas) Given a DataFrame events_df with the same columns as events: (a) Map product_category_code using dict = {'A':'Apparel','H':'Home','?':'Unknown'} so missing/unknown codes become 'Unknown'. (b) Replace negative stay_time_sec with NaN; fill remaining NaN stay_time_sec with 0 for aggregation but exclude zeros from averages where appropriate. (c) Sort events_df by ['user_id' asc, 'event_ts' desc, 'stay_time_sec' desc]. (d) Compute, for the last 7 days ending 2025-09-01, each user's top category by total positive stay_time_sec and return a Series user_id -> top_category (tie-breaker: alphabetical).