Calculate User Revenue and Session Duration in Python
Company: Upstart
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: Medium
Interview Round: Technical Screen
events
+---------+------------+---------+---------------------+
| user_id | event_type | revenue | timestamp           |
+---------+------------+---------+---------------------+
| 101     | view       | 0.00    | 2023-01-01 10:00:00 |
| 101     | purchase   | 29.99   | 2023-01-01 10:05:12 |
| 202     | view       | 0.00    | 2023-01-01 11:20:33 |
| 202     | purchase   | 15.50   | 2023-01-01 11:22:10 |
| 202     | logout     | 0.00    | 2023-01-01 11:30:45 |
+---------+------------+---------+---------------------+
##### Scenario
An e-commerce company stores clickstream events as a Python list of dictionaries (parsed JSON objects). Analysts need each user's total revenue and average session duration for reporting.
##### Question
Write a Python function that takes a list of dictionaries, each with the keys ['user_id', 'event_type', 'revenue', 'timestamp'], and returns total_revenue and average_session_duration (in seconds) for every user. Explain the time and space complexity of your solution.
##### Hints
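Aggregate in a single pass using collections.defaultdict. Treat a user's session as starting at their first event and ending at their last, and track each user's earliest and latest timestamp as you go rather than sorting the events.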
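One possible sketch of the single-pass approach suggested by the hint is shown below. It rests on several assumptions that the question does not state: the function name `summarize_users` is hypothetical, timestamps are taken to be strings in the `YYYY-MM-DD HH:MM:SS` format shown in the sample table, and each user is treated as having exactly one session (first event to last event), so the "average" session duration is simply that one session's length. If sessions were instead split by an inactivity gap or by logout events, the logic would need to change.

```python
from collections import defaultdict
from datetime import datetime

# Assumed timestamp layout, inferred from the sample table.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def summarize_users(events):
    """Return {user_id: {"total_revenue": ..., "average_session_duration": ...}}.

    Assumes one session per user, spanning that user's earliest and
    latest event, so no sorting is required.
    """
    total_revenue = defaultdict(float)
    first_seen = {}
    last_seen = {}

    for event in events:
        user = event["user_id"]
        ts = datetime.strptime(event["timestamp"], TIMESTAMP_FORMAT)

        # Accumulate revenue for this user.
        total_revenue[user] += float(event["revenue"])

        # Track the earliest and latest timestamps seen so far,
        # which also handles events that arrive out of order.
        if user not in first_seen or ts < first_seen[user]:
            first_seen[user] = ts
        if user not in last_seen or ts > last_seen[user]:
            last_seen[user] = ts

    return {
        user: {
            "total_revenue": total_revenue[user],
            "average_session_duration": (
                last_seen[user] - first_seen[user]
            ).total_seconds(),
        }
        for user in total_revenue
    }


# Usage with the rows from the sample table.
sample_events = [
    {"user_id": 101, "event_type": "view", "revenue": 0.00, "timestamp": "2023-01-01 10:00:00"},
    {"user_id": 101, "event_type": "purchase", "revenue": 29.99, "timestamp": "2023-01-01 10:05:12"},
    {"user_id": 202, "event_type": "view", "revenue": 0.00, "timestamp": "2023-01-01 11:20:33"},
    {"user_id": 202, "event_type": "purchase", "revenue": 15.50, "timestamp": "2023-01-01 11:22:10"},
    {"user_id": 202, "event_type": "logout", "revenue": 0.00, "timestamp": "2023-01-01 11:30:45"},
]
summary = summarize_users(sample_events)
```

Under these assumptions the function visits each event once, so it runs in O(n) time for n events and uses O(u) extra space for u distinct users. Sorting the events first would raise the time cost to O(n log n) without changing the result.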
Quick Answer: This question evaluates proficiency in data manipulation and time-series event processing in Python, focusing on per-user revenue aggregation and sessionization to compute average session duration.