PracHub

Refactor code and enforce robustness

Last updated: Mar 29, 2026

Quick Overview

This question evaluates proficiency in code refactoring, robustness, input validation, unit testing, environment specification, and complexity analysis for Python data-processing workflows (pandas), within the Coding & Algorithms domain for Data Scientist roles.

Company: Capital One

Role: Data Scientist

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Onsite

Oct 13, 2025, 9:49 PM

Code Review and Refactor: Summing a CSV Column

Context

You are reviewing a short Python script that sums a numeric column from a CSV using pandas. Your tasks are to identify problems, refactor into a small, testable module, add tests, define an environment, and explain design choices and complexity trade-offs.

Given Script

# script.py
import pandas as pd
DATA_PATH = 'data.csv'
result = None

def compute_total(col):
    df = pd.read_csv(DATA_PATH)
    total = 0
    for x in df[col]:
        if x == '':
            total += 0
        else:
            total += float(x)
    print(total)

compute_total('amount')
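
One correctness risk in the script above can be checked directly. With its default settings, pd.read_csv parses empty cells as NaN rather than as '', so the `x == ''` branch does not fire for them and a single missing value turns the running total into NaN. The sketch below reproduces the script's loop in a hypothetical helper, legacy_total, and runs it on made-up sample data (neither is part of the original question).

```python
import io
import math

import pandas as pd

# Hypothetical sample data (not from the original question):
# the row with id 2 has an empty 'amount' cell.
SAMPLE_CSV = "id,amount\n1,1.5\n2,\n3,2.5\n"


def legacy_total(df: pd.DataFrame, col: str) -> float:
    """Same loop as the given script, but returning the total instead of printing it."""
    total = 0
    for x in df[col]:
        if x == '':
            total += 0
        else:
            total += float(x)
    return total


df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# The empty cell arrives as NaN, so the loop adds float(nan) to the total.
loop_total = legacy_total(df, "amount")

# Series.sum() skips NaN by default, which is the behavior the loop was aiming for.
vectorized_total = float(df["amount"].sum())

print(math.isnan(loop_total))  # True
print(vectorized_total)        # 4.0
```

The same loop also raises ValueError on a non-numeric cell such as 'abc', since float() cannot parse it, and it reads every column of the file even though only one is needed.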

Tasks

  1. Identify at least five defects or risks (correctness, performance, readability, resource management, security).
  2. Refactor into a small, testable module with clear interfaces, type hints, and no global state; include input validation and assert-based precondition checks. Explain when assertions vs. exceptions are appropriate.
  3. Write three pytest-style unit tests using assert statements that cover:
    • Normal inputs
    • Missing/NaN inputs
    • Malformed inputs
  4. Provide an environment.yml for a Conda environment (Python 3.11, pinned dependencies) and the exact commands to create/activate it.
  5. Explain the benefits of modularization for maintainability, dependency management, and testability.
  6. State the time/space complexity before and after refactoring and any I/O bottlenecks you would address.
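
As one possible shape for tasks 2 and 3, here is a minimal sketch, not a reference solution. The names sum_column and compute_total, the choice to count missing values as 0, and the choice to reject malformed values with ValueError are all assumptions made for illustration. It uses an assert only for a programmer-error precondition (assert statements are skipped when Python runs with -O) and exceptions for bad data or bad arguments, which callers are expected to handle.

```python
import io
from pathlib import Path
from typing import IO, Union

import pandas as pd

CsvSource = Union[str, Path, IO[str]]


def sum_column(df: pd.DataFrame, column: str) -> float:
    """Sum one column; missing values count as 0, malformed values raise ValueError."""
    # Precondition on the caller's code (programmer error), so an assert is used.
    assert isinstance(df, pd.DataFrame), "df must be a pandas DataFrame"
    # Problems with arguments or data can occur in production, so raise exceptions.
    if not isinstance(column, str) or not column:
        raise ValueError("column must be a non-empty string")
    if column not in df.columns:
        raise KeyError(f"column {column!r} not found")

    # errors="coerce" turns unparseable entries into NaN so they can be detected.
    numeric = pd.to_numeric(df[column], errors="coerce")
    malformed = numeric.isna() & df[column].notna()
    if malformed.any():
        raise ValueError(f"{int(malformed.sum())} malformed value(s) in {column!r}")

    # Series.sum() skips NaN by default, so missing cells contribute 0.
    return float(numeric.sum())


def compute_total(source: CsvSource, column: str) -> float:
    """Thin I/O wrapper: load only the requested column, then delegate."""
    df = pd.read_csv(source, usecols=[column])
    return sum_column(df, column)


# Pytest-style tests; the CSV contents are made-up sample data.
def test_normal_input():
    src = io.StringIO("id,amount\n1,1.5\n2,2.5\n")
    assert compute_total(src, "amount") == 4.0


def test_missing_values_count_as_zero():
    src = io.StringIO("id,amount\n1,1.5\n2,\n3,2.5\n")
    assert compute_total(src, "amount") == 4.0


def test_malformed_input_raises():
    src = io.StringIO("id,amount\n1,1.5\n2,abc\n")
    try:
        compute_total(src, "amount")
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for malformed input")
```

In a real test file, `with pytest.raises(ValueError):` would be the more idiomatic form of the last test; try/except is used here only so the sketch runs without importing pytest. Separating sum_column from the file read lets the tests pass in-memory data and keeps the function free of global state. Both the original loop and this version do O(n) work over n rows, but the vectorized sum avoids per-element Python overhead, and usecols limits parsing and memory to the one column that is needed.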
