Clean and Analyze User Transactions with Python Functions
Company: PayPal
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: Medium
Interview Round: Onsite
transactions
+---------+---------------------+--------+
| user_id | trans_ts            | amount |
+---------+---------------------+--------+
| 11      | 2024-06-03 10:00:00 | 25.80  |
| 11      | 2024-06-03 10:05:00 | 10.50  |
| 12      | 2024-06-03 12:00:00 | 40.00  |
| 11      | 2024-06-04 09:00:00 | 15.00  |
| 12      | 2024-06-05 13:20:00 | 33.30  |
+---------+---------------------+--------+
##### Scenario
An analyst must clean monthly transaction logs and derive user-level features for downstream modeling.
##### Question
Implement a Python function that removes users with fewer than 100 transactions in a calendar month.
Implement a second function that returns, for each user, the average time in seconds between consecutive transactions.
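A minimal sketch of the first function, using pandas. The question does not say whether a user should be dropped entirely or only for the months that fall short, so this sketch assumes the second reading: a user's rows for a given calendar month are kept only if that user has at least `min_tx` transactions in that month. The function name `filter_active_users` and the parameter `min_tx` are hypothetical, chosen for illustration.

```python
import pandas as pd


def filter_active_users(df: pd.DataFrame, min_tx: int = 100) -> pd.DataFrame:
    """Keep rows whose (user_id, calendar month) group has at least min_tx rows.

    Assumption: the threshold is applied per user per month, not per user overall.
    """
    out = df.copy()
    out["trans_ts"] = pd.to_datetime(out["trans_ts"])
    # Calendar month of each transaction, e.g. Period('2024-06', 'M').
    month = out["trans_ts"].dt.to_period("M")
    # Broadcast each group's row count back onto its rows.
    tx_count = out.groupby(["user_id", month])["amount"].transform("size")
    return out[tx_count >= min_tx].reset_index(drop=True)


# Usage on the sample table, with a lowered threshold so something survives.
sample = pd.DataFrame({
    "user_id": [11, 11, 12, 11, 12],
    "trans_ts": [
        "2024-06-03 10:00:00",
        "2024-06-03 10:05:00",
        "2024-06-03 12:00:00",
        "2024-06-04 09:00:00",
        "2024-06-05 13:20:00",
    ],
    "amount": [25.80, 10.50, 40.00, 15.00, 33.30],
})
kept = filter_active_users(sample, min_tx=3)  # only user 11 has 3 rows in 2024-06
```

Under the stricter reading, where a user is removed entirely if any of their months falls short, the same per-month counts could be reduced to a per-user minimum before filtering.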
##### Hints
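Use pandas groupby with size() or filter() to count transactions per user and month, and shift() on sorted timestamps to get each user's previous transaction; convert the resulting Timedelta values to seconds with .dt.total_seconds().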
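The `shift()` approach from the hint can be sketched as follows for the second function. The function name `avg_seconds_between_tx` and the output name `avg_gap_seconds` are hypothetical. The sketch assumes that a user with a single transaction has no defined gap and should therefore get `NaN`.

```python
import pandas as pd


def avg_seconds_between_tx(df: pd.DataFrame) -> pd.Series:
    """Return the mean gap in seconds between consecutive transactions per user."""
    out = df.copy()
    out["trans_ts"] = pd.to_datetime(out["trans_ts"])
    # Sort so that shift() within each user yields the previous transaction.
    out = out.sort_values(["user_id", "trans_ts"])
    prev_ts = out.groupby("user_id")["trans_ts"].shift()
    # The first transaction of each user has no predecessor, so its gap is NaN.
    gap_seconds = (out["trans_ts"] - prev_ts).dt.total_seconds()
    # mean() skips NaN, so only real gaps contribute to the average.
    return gap_seconds.groupby(out["user_id"]).mean().rename("avg_gap_seconds")


# Usage on the sample table.
sample_tx = pd.DataFrame({
    "user_id": [11, 11, 12, 11, 12],
    "trans_ts": [
        "2024-06-03 10:00:00",
        "2024-06-03 10:05:00",
        "2024-06-03 12:00:00",
        "2024-06-04 09:00:00",
        "2024-06-05 13:20:00",
    ],
    "amount": [25.80, 10.50, 40.00, 15.00, 33.30],
})
avg_gaps = avg_seconds_between_tx(sample_tx)
# user 11: gaps of 300 s and 82500 s, mean 41400.0
# user 12: one gap of 177600 s, mean 177600.0
```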
Quick Answer: This question evaluates proficiency in data manipulation and feature engineering with Python and pandas, specifically cleaning transactional logs and deriving user-level time-based metrics such as inter-event intervals.