Build a CSV Query Engine
Company: Microsoft
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Technical Screen
Implement a small in-memory CSV query engine.
You are given CSV text where the first line contains column headers and each remaining line contains one data row.
Example:
```text
Key,location,weather,temperature,data
1,"Sunnyvale","sunny",100,"datetimestamp"
```
Your tasks are:
1. **Parse the CSV into rows**
- Convert the CSV into a list of row objects, where each row is a dictionary mapping `column_name -> value`.
- The first row defines the headers.
- A field enclosed in double quotes must remain a string, even if it contains only digits (for example, `"100"` stays the string `100`).
- An unquoted field that contains only digits must be converted to an integer.
- Other unquoted fields may be treated as strings.
- Quoted fields may contain commas, so your parser must not split on commas inside double quotes.
2. **Support single-column queries**
- Given a column name, return all values from that column in row order.
- For example, this should behave like collecting `row[col_name]` for every row.
3. **Support filtered projection queries**
- Implement a query function that takes:
  - a list of selected columns
  - a list of filters of the form:
```python
filters = [(column_name, operator, value), ...]
```
- Supported operators are `<`, `>`, and `=`.
- A row matches only if it satisfies **all** filters.
- Return only the requested columns for each matching row.
- Comparisons must work correctly for both parsed integers and strings. Use numeric comparison for integers and lexicographic comparison for strings.
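
The parsing rules in task 1 and the column lookup in task 2 could be sketched as follows. This is a minimal illustration rather than a reference solution: the names `parse_line`, `convert`, `parse_csv`, and `select_column` are hypothetical, and the sketch assumes that no field spans multiple lines and that quoted fields contain no escaped quotes (`""`). It scans character by character instead of using Python's `csv` module, because the type rule depends on knowing whether a field was quoted.

```python
def parse_line(line):
    """Split one CSV line into (text, was_quoted) pairs.

    Assumption: no escaped quotes ("") and no newlines inside fields.
    """
    fields, buf = [], []
    in_quotes = was_quoted = False
    for ch in line:
        if ch == '"':
            in_quotes = not in_quotes   # entering or leaving a quoted field
            was_quoted = True
        elif ch == "," and not in_quotes:
            fields.append(("".join(buf), was_quoted))
            buf, was_quoted = [], False
        else:
            buf.append(ch)              # includes commas inside quotes
    fields.append(("".join(buf), was_quoted))
    return fields


def convert(text, was_quoted):
    """Unquoted all-digit fields become int; everything else stays str."""
    if not was_quoted and text.isascii() and text.isdigit():
        return int(text)
    return text


def parse_csv(text):
    """Return a list of dicts mapping column_name -> value."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    headers = [name for name, _ in parse_line(lines[0])]
    return [
        dict(zip(headers, (convert(t, q) for t, q in parse_line(ln))))
        for ln in lines[1:]
    ]


def select_column(rows, col_name):
    """Task 2: collect row[col_name] for every row, in row order."""
    return [row[col_name] for row in rows]
```

Tracking `was_quoted` per field is what lets `"123"` stay a string while an unquoted `123` becomes an integer.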
You may assume the CSV input is valid and that referenced columns exist.
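
One possible shape for the filtered projection query in task 3 is sketched below. The names `OPS`, `matches`, and `query` are hypothetical, and the rows are written out by hand so the example stands alone. The problem statement does not say what should happen when a filter compares an integer cell with a string value (or the reverse); this sketch assumes such a filter simply does not match, which avoids the `TypeError` Python raises for `<` and `>` across those types.

```python
import operator

# Map each supported operator symbol to its comparison function.
OPS = {"<": operator.lt, ">": operator.gt, "=": operator.eq}


def matches(row, filters):
    """Return True only if the row satisfies every filter."""
    for col, op, value in filters:
        cell = row[col]
        # Assumption: an int cell never matches a str value, and vice versa.
        if type(cell) is not type(value):
            return False
        # ints compare numerically, strs compare lexicographically.
        if not OPS[op](cell, value):
            return False
    return True


def query(rows, columns, filters):
    """Project the selected columns from every row that passes all filters."""
    return [
        {col: row[col] for col in columns}
        for row in rows
        if matches(row, filters)
    ]


rows = [
    {"Key": 1, "location": "Sunnyvale", "weather": "sunny", "temperature": 100},
    {"Key": 2, "location": "San Jose, CA", "weather": "cloudy", "temperature": 72},
]

hot = query(rows, ["location", "temperature"], [("temperature", ">", 80)])
# hot == [{"location": "Sunnyvale", "temperature": 100}]
```

Because the parsed values already carry their type, the same operator table serves both integer and string columns without any per-column schema.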
Quick Answer: This question evaluates proficiency in text parsing, data modeling, and implementing in-memory query semantics, including CSV parsing rules, type inference for numeric fields, and correct numeric versus lexicographic comparisons.