Merge and Clean Customer Order Data for Analysis
Company: Boston Consulting Group
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: Medium
Interview Round: Take-home Project
customers
+----+---------+---------+
| id | name    | country |
+----+---------+---------+
|  1 | Alice   | US      |
|  2 | Bob     | UK      |
|  3 | Charlie | NULL    |
+----+---------+---------+
orders
+-----+-------------+--------+
| id  | customer_id | amount |
+-----+-------------+--------+
| 101 |           1 | 250.50 |
| 102 |           2 |  99.99 |
| 103 |           2 |   NULL |
+-----+-------------+--------+
##### Scenario
A retail company needs to combine customer and order datasets, clean nulls, and prepare the data for downstream analysis.
##### Question
In Python/pandas, left-join customers to orders by matching customers.id to orders.customer_id, so that every customer is kept even if they have no orders. Add a column total_spent in which NULL amounts are treated as 0, and fill missing country values with the mode of the existing countries. Return the resulting DataFrame sorted by total_spent in descending order.
##### Hints
Use merge (with how="left"), fillna, and Series.mode() for the country fill. groupby/transform is useful if total_spent is meant as a per-customer sum.
Quick Answer: This question evaluates proficiency in data manipulation with pandas and SQL concepts, focusing on dataset merging, missing-value handling, and computing derived aggregate columns such as total_spent.
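
One possible approach is sketched below. It is a minimal sketch under several assumptions that the question does not settle: total_spent is read as the per-customer sum of amount (with NULL counted as 0), the mode of country is taken over the customers table so that customers with many orders are not overweighted, and ties in the mode are broken by taking the first value that `Series.mode()` returns (it returns sorted values). The function name `merge_and_clean` and the `order_id` column are hypothetical names introduced for illustration.

```python
import pandas as pd

# Sample data from the tables above; None stands in for NULL.
customers = pd.DataFrame({
    "id": [1, 2, 3],
    "name": ["Alice", "Bob", "Charlie"],
    "country": ["US", "UK", None],
})
orders = pd.DataFrame({
    "id": [101, 102, 103],
    "customer_id": [1, 2, 2],
    "amount": [250.50, 99.99, None],
})


def merge_and_clean(customers: pd.DataFrame, orders: pd.DataFrame) -> pd.DataFrame:
    """Left-join orders onto customers, fill nulls, and sort by total_spent."""
    # Rename orders.id so it does not collide with customers.id after the merge.
    merged = customers.merge(
        orders.rename(columns={"id": "order_id"}),
        how="left",              # keep customers that have no orders
        left_on="id",
        right_on="customer_id",
    ).drop(columns="customer_id")  # redundant with customers.id

    # Treat missing amounts (NULL orders, or no orders at all) as 0.
    merged["amount"] = merged["amount"].fillna(0)

    # Assumption: total_spent is the per-customer sum, repeated on each order row.
    merged["total_spent"] = merged.groupby("id")["amount"].transform("sum")

    # Mode over known countries; mode() ignores missing values and may return
    # several values on a tie, in which case the first (sorted) one is used.
    country_mode = customers["country"].mode()
    if not country_mode.empty:
        merged["country"] = merged["country"].fillna(country_mode.iloc[0])

    return (
        merged.sort_values("total_spent", ascending=False, kind="mergesort")
        .reset_index(drop=True)
    )


result = merge_and_clean(customers, orders)
print(result)
```

On the sample data this puts Alice first (250.50), then Bob's two order rows (99.99 each), then Charlie (0). Because US and UK each appear once, the mode is a tie, and Charlie's country is filled with UK only because of the tie-breaking assumption above. If total_spent is instead meant as a row-level column, the groupby/transform step can be replaced with `merged["total_spent"] = merged["amount"]` after the fillna.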