Merge and Concatenate Inconsistent Order Files with Pandas
Company: Boston Consulting Group
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: Medium
Interview Round: Take-home Project
orders_2023
+----------+-------------+--------+
| order_id | customer_id | amount |
+----------+-------------+--------+
| 101      | C001        | 120.5  |
| 102      | C002        | 75.0   |
| 103      | C003        | 140.0  |
+----------+-------------+--------+
orders_2024
+----------+-------------+--------+
| orderid  | customer_id | amount |
+----------+-------------+--------+
| 201      | C001        | 110.0  |
| 202      | C004        | 95.0   |
| 203      | C005        | 180.0  |
+----------+-------------+--------+
##### Scenario
BCG CodeSignal notebook – merging annual order files with schema inconsistencies
##### Question
Using Python (pandas), load orders_2023.csv and orders_2024.csv, rename columns so both have ['order_id','customer_id','amount'], cast amount to float, then vertically concatenate them into one DataFrame called orders_all.
##### Hints
read_csv ➜ rename ➜ astype ➜ concat; watch for the orderid header in orders_2024, which is missing the underscore used in orders_2023.
Quick Answer: This question evaluates proficiency with pandas-based data manipulation, including schema alignment, column renaming, type casting, and vertical concatenation when merging CSV files.
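One possible solution is sketched below, following the read_csv ➜ rename ➜ astype ➜ concat sequence from the hints. It assumes the two CSV files contain exactly the rows shown in the tables above; so that the sketch runs without the files on disk, in-memory io.StringIO buffers stand in for orders_2023.csv and orders_2024.csv. The helper name load_orders and the constant EXPECTED_COLUMNS are illustrative choices, not part of the original task.

```python
import io

import pandas as pd

# Assumption: stand-ins for orders_2023.csv and orders_2024.csv, built from
# the sample tables. With real files, pass the file paths to load_orders.
CSV_2023 = io.StringIO(
    "order_id,customer_id,amount\n"
    "101,C001,120.5\n"
    "102,C002,75.0\n"
    "103,C003,140.0\n"
)
CSV_2024 = io.StringIO(
    "orderid,customer_id,amount\n"
    "201,C001,110.0\n"
    "202,C004,95.0\n"
    "203,C005,180.0\n"
)

EXPECTED_COLUMNS = ["order_id", "customer_id", "amount"]


def load_orders(source):
    """Read one yearly order file and normalize it to the shared schema."""
    df = pd.read_csv(source)
    # Fix the inconsistent header; rename ignores keys that are absent,
    # so the same call is safe for the 2023 file.
    df = df.rename(columns={"orderid": "order_id"})
    # Cast explicitly in case a file stores whole-number amounts as integers.
    df["amount"] = df["amount"].astype(float)
    # Select the columns in a fixed order so both frames line up for concat.
    return df[EXPECTED_COLUMNS]


# Stack the two years vertically and rebuild a clean 0..n-1 index.
orders_all = pd.concat(
    [load_orders(CSV_2023), load_orders(CSV_2024)],
    ignore_index=True,
)

print(orders_all)
```

Renaming before concatenating matters here: if the frames were concatenated with mismatched headers, pandas would keep both order_id and orderid as separate columns and fill the gaps with NaN. Passing ignore_index=True avoids duplicate index labels carried over from the two source frames.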