Analyze Retention Metrics Using SQL and Python
Company: Netflix
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: Medium
Interview Round: Other
transactions
+---------+--------+------------+--------+----------+
| user_id | txn_id | txn_date   | amount | is_fraud |
+---------+--------+------------+--------+----------+
| 101     | 9001   | 2024-04-01 | 13.99  | 0        |
| 102     | 9002   | 2024-04-02 | 7.99   | 1        |
| 101     | 9003   | 2024-04-03 | 15.49  | 0        |
| 103     | 9004   | 2024-04-05 | 11.99  | 0        |
| 102     | 9005   | 2024-04-06 | 8.49   | 1        |
+---------+--------+------------+--------+----------+
##### Scenario
An analyst is given transaction logs and must write SQL and lightweight Python to build retention metrics.
##### Question
1. Write a SQL query that returns, for each user, the first transaction date and the number of transactions made within 7 days of that first purchase.
2. Write a SQL query that computes the overall Day-7 retention rate.
3. Given a Python list of n integers, write a function that returns the two numbers whose sum is closest to zero (assume at least two numbers).
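One way to approach the per-user query is sketched below, run through Python's built-in `sqlite3` module so it can be checked against the sample rows. Several things here are assumptions and not part of the question: the SQLite dialect (`julianday` is SQLite-specific; other engines would use `DATEDIFF` or date subtraction), the helper name `first_week_activity`, the reading of "within 7 days" as 0 to 7 days after the first purchase inclusive, and the choice to count the first purchase itself. Fraud-flagged rows are not filtered, since the question does not ask for it. Window functions need SQLite 3.25 or newer.

```python
import sqlite3

# Sample rows copied from the transactions table above.
ROWS = [
    (101, 9001, "2024-04-01", 13.99, 0),
    (102, 9002, "2024-04-02", 7.99, 1),
    (101, 9003, "2024-04-03", 15.49, 0),
    (103, 9004, "2024-04-05", 11.99, 0),
    (102, 9005, "2024-04-06", 8.49, 1),
]

FIRST_WEEK_SQL = """
WITH tagged AS (
    SELECT
        user_id,
        txn_date,
        -- Window function: earliest transaction date per user.
        MIN(txn_date) OVER (PARTITION BY user_id) AS first_txn_date
    FROM transactions
)
SELECT
    user_id,
    first_txn_date,
    -- Assumption: the first purchase counts, and the window is
    -- 0 to 7 days after it, inclusive.
    SUM(CASE WHEN (julianday(txn_date) - julianday(first_txn_date)) <= 7
             THEN 1 ELSE 0 END) AS txns_first_7_days
FROM tagged
GROUP BY user_id, first_txn_date
ORDER BY user_id
"""


def first_week_activity(rows):
    """Load rows into an in-memory SQLite table and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE transactions ("
        "user_id INTEGER, txn_id INTEGER, txn_date TEXT, "
        "amount REAL, is_fraud INTEGER)"
    )
    conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(FIRST_WEEK_SQL).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in first_week_activity(ROWS):
        print(row)
```

On the sample data, users 101 and 102 each have two transactions inside the window and user 103 has one. If fraudulent transactions should be excluded, a `WHERE is_fraud = 0` filter in the `tagged` CTE would be the natural place for it.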
##### Hints
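Use a window function such as MIN(txn_date) OVER (PARTITION BY user_id) to get each user's first transaction date, then a date-difference function (DATEDIFF or your dialect's equivalent) to measure days since that date. In Python, aim for O(n log n) or better.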
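For the Python part, a common way to reach O(n log n) is to sort the list and walk two pointers inward from both ends. The sketch below is one such approach, not the only valid one; the function name `closest_to_zero_pair` is made up for illustration, and when several pairs tie it simply returns the first one it encounters.

```python
def closest_to_zero_pair(nums):
    """Return the two numbers in nums whose sum is closest to zero."""
    if len(nums) < 2:
        raise ValueError("need at least two numbers")

    ordered = sorted(nums)  # O(n log n); the scan below is O(n)
    lo, hi = 0, len(ordered) - 1
    best = (ordered[lo], ordered[hi])

    while lo < hi:
        total = ordered[lo] + ordered[hi]
        if abs(total) < abs(best[0] + best[1]):
            best = (ordered[lo], ordered[hi])
        if total == 0:
            break  # cannot do better than an exact zero
        if total < 0:
            lo += 1  # sum too small: move to a larger left value
        else:
            hi -= 1  # sum too large: move to a smaller right value

    return best


if __name__ == "__main__":
    print(closest_to_zero_pair([1, 60, -10, 70, -80, 85]))
```

The pointer moves are safe because the list is sorted: when the sum is negative, pairing the current left value with any smaller right value can only move the sum further from zero, so the left pointer advances, and the reverse holds for a positive sum.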
Quick Answer: This question evaluates proficiency in SQL time-based cohort analysis and practical Python algorithm implementation, testing retention metric computation using windowed date calculations and an efficient approach for finding two numbers whose sum is closest to zero.
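The question does not define "Day-7 retention", so any query has to pick a definition. The sketch below assumes one plausible reading: a user is retained if they make at least one further transaction 1 to 7 days after their first purchase, and the rate is the share of all users who are retained. A stricter reading (activity on exactly day 7) would only change the `BETWEEN` condition. The SQLite dialect, the helper name `day7_retention`, and the lack of fraud filtering are likewise assumptions.

```python
import sqlite3

# Sample rows copied from the transactions table above.
ROWS = [
    (101, 9001, "2024-04-01", 13.99, 0),
    (102, 9002, "2024-04-02", 7.99, 1),
    (101, 9003, "2024-04-03", 15.49, 0),
    (103, 9004, "2024-04-05", 11.99, 0),
    (102, 9005, "2024-04-06", 8.49, 1),
]

DAY7_RETENTION_SQL = """
WITH tagged AS (
    SELECT
        user_id,
        txn_date,
        MIN(txn_date) OVER (PARTITION BY user_id) AS first_txn_date
    FROM transactions
),
per_user AS (
    SELECT
        user_id,
        -- Assumed definition: 1 if the user transacted again
        -- 1 to 7 days after the first purchase, else 0.
        MAX(CASE WHEN (julianday(txn_date) - julianday(first_txn_date))
                      BETWEEN 1 AND 7
                 THEN 1 ELSE 0 END) AS retained
    FROM tagged
    GROUP BY user_id
)
SELECT AVG(retained) AS day7_retention_rate
FROM per_user
"""


def day7_retention(rows):
    """Return the assumed Day-7 retention rate as a float in [0, 1]."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE transactions ("
        "user_id INTEGER, txn_id INTEGER, txn_date TEXT, "
        "amount REAL, is_fraud INTEGER)"
    )
    conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?, ?)", rows)
    (rate,) = conn.execute(DAY7_RETENTION_SQL).fetchone()
    conn.close()
    return rate


if __name__ == "__main__":
    print(f"{day7_retention(ROWS):.3f}")
```

Under this definition, users 101 and 102 in the sample return within the window and user 103 does not, giving a rate of two out of three. In an interview it is worth stating the chosen definition aloud and noting that users whose first purchase is fewer than 7 days before the end of the data have not yet had a full window.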