Build a Pivot Table Aggregator
Company: Sigmacomputing
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Technical Screen
Quick Answer: This question evaluates a candidate's ability to implement data aggregation and efficient lookup structures, focusing on grouping, associative aggregations, and maintaining intermediate state for derived metrics.
Part 1: Build a Sum-Based Pivot Table Aggregator
Constraints
- 0 <= len(records) <= 200000
- row_key and column_key are strings
- value is an integer or float
- Only row/column pairs that appear in the input should appear in the output
Examples
Input: ([('A', 'X', 10), ('A', 'X', 20), ('A', 'Y', 5), ('B', 'X', 7)],)
Expected Output: {'A': {'X': 30, 'Y': 5}, 'B': {'X': 7}}
Explanation: The pair ('A', 'X') appears twice, so its values are added: 10 + 20 = 30.
Input: ([],)
Expected Output: {}
Explanation: Edge case: no records means no pivot table entries.
Input: ([('R', 'C', 42)],)
Expected Output: {'R': {'C': 42}}
Explanation: A single record becomes one row with one column.
Input: ([('A', 'X', 5), ('A', 'X', -2), ('A', 'Y', 0), ('B', 'Z', 3), ('B', 'Z', 4)],)
Expected Output: {'A': {'X': 3, 'Y': 0}, 'B': {'Z': 7}}
Explanation: The aggregator should correctly handle repeated cells, zero values, and negative values.
Hints
- A hash map (dictionary) is a natural fit for fast grouping and lookup.
- Think of the result as a dictionary of dictionaries: first by row, then by column.
Part 2: Extend the Pivot Table Aggregator to Support Sum, Average, and Minimum
Constraints
- 0 <= len(records) <= 200000
- row_key and column_key are strings
- value is an integer or float
- agg_func is one of: "sum", "average", "minimum"
- Only row/column pairs that appear in the input should appear in the output
Examples
Input: ([('A', 'X', 10), ('A', 'X', 20), ('A', 'Y', 5), ('B', 'X', 7)], 'average')
Expected Output: {'A': {'X': 15.0, 'Y': 5.0}, 'B': {'X': 7.0}}
Explanation: For ('A', 'X'), average = (10 + 20) / 2 = 15.0.
Input: ([('A', 'X', 10), ('A', 'X', 20), ('A', 'Y', 5), ('B', 'X', 7)], 'minimum')
Expected Output: {'A': {'X': 10, 'Y': 5}, 'B': {'X': 7}}
Explanation: For each cell, keep the smallest value seen.
Input: ([('A', 'X', 10), ('A', 'X', 20), ('A', 'Y', 5), ('B', 'X', 7)], 'sum')
Expected Output: {'A': {'X': 30, 'Y': 5}, 'B': {'X': 7}}
Explanation: This matches the basic pivot table summation behavior.
Input: ([], 'minimum')
Expected Output: {}
Explanation: Edge case: an empty input should return an empty dictionary regardless of the aggregation function.
Input: ([('A', 'X', 2), ('A', 'X', 4), ('A', 'X', 6), ('B', 'Y', -3)], 'average')
Expected Output: {'A': {'X': 4.0}, 'B': {'Y': -3.0}}
Explanation: The average of 2, 4, and 6 is 4.0. A single value still produces a float average.
Hints
- For average, each cell needs more than one number while processing. What two pieces of information are enough?
- You can maintain one nested dictionary for state while scanning the records, then convert that state into the final pivot table.