PracHub
QuestionsPremiumLearningGuidesInterview PrepNEWCoaches
|Home/Data Manipulation (SQL/Python)/Capital One

Merge CSVs and build revenue pivot with pandas

Last updated: Mar 29, 2026

Quick Overview

This question evaluates proficiency in data manipulation and ETL using pandas, covering robust file I/O, join/merge semantics, null/coalescing behavior, date parsing, aggregation, and pivoting.

  • Medium
  • Capital One
  • Data Manipulation (SQL/Python)
  • Data Scientist

Merge CSVs and build revenue pivot with pandas

Company: Capital One

Role: Data Scientist

Category: Data Manipulation (SQL/Python)

Difficulty: Medium

Interview Round: Take-home Project

You receive four CSVs and must replicate an Excel VLOOKUP + PivotTable workflow using Python/pandas. CSV samples: customers.csv customer_id,signup_date,channel 1,2025-06-02,Organic 2,2025-06-10,Ads 3,2025-07-05,Referral orders.csv order_id,customer_id,order_date,total_amount 101,1,2025-06-12,120 102,2,2025-06-28,80 103,1,2025-07-15,200 104,3,2025-08-02,50 refunds.csv order_id,refund_amount 102,20 104,50 targets.csv month,channel,revenue_target 2025-06,Organic,100 2025-06,Ads,120 2025-07,Organic,150 2025-08,Referral,60 Task: Write pandas code to (a) robustly load these CSVs (assume one file path may initially fail—gracefully retry or fall back without crashing), (b) compute net_revenue per order = total_amount - COALESCE(refund_amount,0), (c) roll up to monthly net revenue by channel for months 2025-06 through 2025-08 based on order_date, (d) left-join the result to targets.csv on (month, channel) to compute target_gap = net_revenue - revenue_target (treat missing targets as 0), and (e) produce a pivot table with index=month (YYYY-MM), columns=channel, values=net_revenue, plus an additional similarly-shaped table for target_gap. Ensure date parsing is correct, missing refunds are handled, and channels with no orders in a month still appear with 0 in the pivot.

Quick Answer: This question evaluates proficiency in data manipulation and ETL using pandas, covering robust file I/O, join/merge semantics, null/coalescing behavior, date parsing, aggregation, and pivoting.

Related Interview Questions

  • Clean and Merge Housing Data - Capital One (easy)
  • Find Lowest Prices for Highly Rated Categories - Capital One (medium)
  • Write SQL to compute campaign net revenue - Capital One (Medium)
  • Find top category per region in Aug 2025 - Capital One (Medium)
  • Reconcile ledgers with SQL/Python and late events - Capital One (Medium)
Capital One logo
Capital One
Oct 13, 2025, 9:49 PM
Data Scientist
Take-home Project
Data Manipulation (SQL/Python)
3
0

You receive four CSVs and must replicate an Excel VLOOKUP + PivotTable workflow using Python/pandas.

CSV samples: customers.csv customer_id,signup_date,channel 1,2025-06-02,Organic 2,2025-06-10,Ads 3,2025-07-05,Referral

orders.csv order_id,customer_id,order_date,total_amount 101,1,2025-06-12,120 102,2,2025-06-28,80 103,1,2025-07-15,200 104,3,2025-08-02,50

refunds.csv order_id,refund_amount 102,20 104,50

targets.csv month,channel,revenue_target 2025-06,Organic,100 2025-06,Ads,120 2025-07,Organic,150 2025-08,Referral,60

Task: Write pandas code to (a) robustly load these CSVs (assume one file path may initially fail—gracefully retry or fall back without crashing), (b) compute net_revenue per order = total_amount - COALESCE(refund_amount,0), (c) roll up to monthly net revenue by channel for months 2025-06 through 2025-08 based on order_date, (d) left-join the result to targets.csv on (month, channel) to compute target_gap = net_revenue - revenue_target (treat missing targets as 0), and (e) produce a pivot table with index=month (YYYY-MM), columns=channel, values=net_revenue, plus an additional similarly-shaped table for target_gap. Ensure date parsing is correct, missing refunds are handled, and channels with no orders in a month still appear with 0 in the pivot.

Comments (0)

Sign in to leave a comment

Loading comments...

Browse More Questions

More Data Manipulation (SQL/Python)•More Capital One•More Data Scientist•Capital One Data Scientist•Capital One Data Manipulation (SQL/Python)•Data Scientist Data Manipulation (SQL/Python)
PracHub

Master your tech interviews with 7,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.