PracHub

Build a CSV Query Engine

Last updated: Apr 6, 2026

Quick Overview

This question evaluates proficiency in text parsing, data modeling, and implementing in-memory query semantics, including CSV parsing rules, type inference for numeric fields, and correct numeric versus lexicographic comparisons.


Company: Microsoft

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Technical Screen



Related Interview Questions

  • Return Top K Open Businesses - Microsoft (hard)
  • Implement Memory Allocation and In-Memory Records - Microsoft (medium)
  • Implement K-Means and Detect Divisible Subarrays - Microsoft (medium)
  • Sort Three Categories In Place - Microsoft (medium)
  • Retain Top K Elements - Microsoft (medium)
Jan 31, 2026, 12:00 AM

Implement a small in-memory CSV query engine.

You are given CSV text where the first line contains column headers and each remaining line contains one data row.

Example:

Key,location,weather,temperature,data
1,"Sunnyvale","sunny",100,"datetimestamp"

Your tasks are:

  1. Parse the CSV into rows
    • Convert the CSV into a list of row objects, where each row is a dictionary mapping column_name -> value.
    • The first row defines the headers.
    • A field enclosed in double quotes must remain a string.
    • An unquoted field that contains only digits must be converted to an integer.
    • Other unquoted fields may be treated as strings.
    • Quoted fields may contain commas, so your parser must not split on commas inside double quotes.
  2. Support single-column queries
    • Given a column name, return all values from that column in row order.
    • For example, this should behave like collecting row[col_name] for every row.
  3. Support filtered projection queries
    • Implement a query function that takes:
      • a list of selected columns
      • a list of filters of the form:
        filters = [(column_name, operator, value), ...]
        
    • Supported operators are <, >, and =.
    • A row matches only if it satisfies all filters.
    • Return only the requested columns for each matching row.
    • Comparisons must work correctly for both parsed integers and strings. Use numeric comparison for integers and lexicographic comparison for strings.
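
The parsing rules in task 1 could be sketched roughly as follows. This is a minimal illustration, not a reference solution: the function names (split_line, parse_field, parse_csv) are hypothetical, and it assumes quoted fields contain no escaped quotes or embedded newlines and that whitespace around fields is kept as-is, none of which the problem statement specifies.

```python
def split_line(line):
    """Split one CSV line into (text, was_quoted) pairs.

    Commas inside double quotes do not end a field.
    Assumption: no escaped quotes ("") inside quoted fields.
    """
    fields = []
    buf = []
    in_quotes = False
    was_quoted = False
    for ch in line:
        if ch == '"':
            in_quotes = not in_quotes
            was_quoted = True
        elif ch == "," and not in_quotes:
            fields.append(("".join(buf), was_quoted))
            buf, was_quoted = [], False
        else:
            buf.append(ch)
    fields.append(("".join(buf), was_quoted))
    return fields


def parse_field(raw, quoted):
    """Quoted -> str; unquoted and all ASCII digits -> int; otherwise str."""
    if not quoted and raw.isascii() and raw.isdigit():
        return int(raw)
    return raw


def parse_csv(text):
    """Return a list of dicts mapping column_name -> value."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    headers = [name for name, _ in split_line(lines[0])]
    rows = []
    for line in lines[1:]:
        values = [parse_field(raw, quoted) for raw, quoted in split_line(line)]
        rows.append(dict(zip(headers, values)))
    return rows
```

Tracking whether a field was quoted separately from its text is what lets a quoted "42" stay a string while an unquoted 42 becomes an integer.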

You may assume the CSV input is valid and that referenced columns exist.
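
Tasks 2 and 3 might then look something like the sketch below, which works on already-parsed rows. The names select_column, matches, and query are hypothetical, and the sample rows beyond the first are made up for illustration. The statement does not say what should happen when a filter compares an integer cell with a string value (or the reverse), so this sketch assumes such a filter simply does not match.

```python
import operator

# Map the three supported operator symbols to comparison functions.
OPS = {"<": operator.lt, ">": operator.gt, "=": operator.eq}


def select_column(rows, col_name):
    """Task 2: every value of one column, in row order."""
    return [row[col_name] for row in rows]


def matches(row, filters):
    """True only if the row satisfies all (column, operator, value) filters."""
    for col, op, value in filters:
        cell = row[col]
        # Assumption: an int compared against a str never matches.
        if isinstance(cell, int) != isinstance(value, int):
            return False
        # int vs int compares numerically; str vs str lexicographically.
        if not OPS[op](cell, value):
            return False
    return True


def query(rows, columns, filters):
    """Task 3: project the requested columns from each matching row."""
    return [{c: row[c] for c in columns} for row in rows if matches(row, filters)]


# Illustrative data: the first row follows the example; the rest are invented.
rows = [
    {"Key": 1, "location": "Sunnyvale", "weather": "sunny", "temperature": 100},
    {"Key": 2, "location": "Seattle", "weather": "rainy", "temperature": 55},
    {"Key": 3, "location": "Austin", "weather": "sunny", "temperature": 9},
]

print(select_column(rows, "location"))
print(query(rows, ["location", "temperature"],
            [("weather", "=", "sunny"), ("temperature", ">", 50)]))
```

Because temperature values are integers, a filter such as ("temperature", "<", 55) selects the row with 9, whereas a lexicographic comparison of the strings "9" and "55" would not.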
