Given two pandas DataFrames, write code to: (1) merge and aggregate revenue; (2) produce a 2x2 pivot; (3) compute per-state counts with value_counts, nunique/size; (4) add a binary flag via np.where. Reuse the merged DataFrame across parts (assume it persists between steps).
Data (toy, representative)
users
user_id | is_member | state | age
101 | 1 | CA | 29
102 | 0 | NY | 41
103 | 1 | CA | 35
104 | 0 | TX | 50
orders
order_id | user_id | channel | amount | status
7001 | 101 | SMS | 12.00 | delivered
7002 | 102 | Email | 5.00 | delivered
7003 | 103 | SMS | 7.00 | delivered
7004 | 103 | Email | 4.00 | delivered
7005 | 101 | Organic | 3.50 | delivered
7006 | 104 | SMS | 6.00 | undelivered
Tasks
-
Step 1: Merge orders with users on user_id (left join). Compute two outputs: (a) total delivered revenue by channel; (b) delivered revenue by channel restricted to members (is_member==1). Show groupby(...).sum() results as DataFrames.
-
Step 2: Create a 2x2 pivot of delivered revenue with index=is_member (0/1) and columns=channel in ['SMS','Email'] only, values=amount, aggfunc='sum', fill missing cells with 0. Use pivot_table with aggfunc='sum'.
-
Step 3: From the merged DataFrame, compute per-state: total orders (size) and unique purchasers (nunique of user_id). Return the top-2 states by total orders using sort_values.
-
Step 4: Add column high_value_flag = 1 if (user's lifetime delivered amount >= 15) OR (number of delivered SMS orders per user >= 2), else 0. Use np.where and prior groupby aggregations to avoid SettingWithCopy warnings. Show the final head with relevant columns.