Analyze User Engagement with SQL Queries
Company: Amazon
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: Medium
Interview Round: Onsite
events
+----------+---------+---------------------+
| event_id | user_id | event_time |
+----------+---------+---------------------+
| 1 | 101 | 2023-01-01 09:00:00 |
| 2 | 101 | 2023-01-01 10:30:00 |
| 3 | 102 | 2023-01-02 08:15:00 |
| 4 | 103 | 2023-01-03 12:45:00 |
+----------+---------+---------------------+
sessions
+-----------+---------+---------------------+---------------------+
| session_id| user_id | session_start | session_end |
+-----------+---------+---------------------+---------------------+
| 1001 | 101 | 2023-01-01 08:45:00 | 2023-01-01 10:45:00 |
| 1002 | 102 | 2023-01-02 08:00:00 | 2023-01-02 09:00:00 |
| 1003 | 103 | 2023-01-03 08:30:00 | 2023-01-03 13:30:00 |
+-----------+---------+---------------------+---------------------+
##### Scenario
You are a data analyst exploring product-usage logs and session metadata to understand user engagement.
##### Question
Q1. Using table events, write a SQL query that returns the total number of events generated by each user (user_id). Q2. Using table events, return for every user_id the earliest event_time and the latest event_time observed. Q3. Using tables events and sessions, write a SQL query that finds the user_id whose single session has the longest duration (session_end – session_start); return both the user_id and that duration.
##### Hints
Window functions, GROUP BY, TIMESTAMPDIFF; join events e ON e.user_id = s.user_id when needed.
Quick Answer: This question evaluates proficiency in SQL data manipulation, including aggregation, window functions, joins, and temporal calculations for analyzing event and session logs.