Implement a Formula Spreadsheet
Company: Harvey
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Technical Screen
Quick Answer: This question evaluates expression parsing, symbolic cell reference resolution, dependency graph construction, cycle detection, and stateful API design for an in-memory spreadsheet.
Part 1: Plain Integer Spreadsheet
Constraints
- 0 <= len(operations) <= 100000
- All labels in the input are valid spreadsheet labels: uppercase letters followed by a positive row number
- Stored values are integers in the 32-bit signed range
- Only SET and GET operations appear
Examples
Input: []
Expected Output: []
Explanation: There are no operations, so there are no GET results to return.
Input: [('SET', 'A1', 10), ('GET', 'A1')]
Expected Output: [10]
Explanation: Cell A1 is set to 10, so the GET returns 10.
Input: [('GET', 'B2'), ('SET', 'B2', 7), ('GET', 'B2'), ('SET', 'B2', -3), ('GET', 'B2')]
Expected Output: ['ERROR', 7, -3]
Explanation: The first GET happens before B2 is set, so it returns 'ERROR'. After setting B2 to 7 and later to -3, the GETs return those updated values.
Input: [('SET', 'A1', 5), ('SET', 'BC12', 8), ('GET', 'BC12'), ('GET', 'A1'), ('SET', 'A1', 9), ('GET', 'A1'), ('GET', 'Z99')]
Expected Output: [8, 5, 9, 'ERROR']
Explanation: BC12 returns 8, A1 first returns 5, then after overwrite returns 9, and Z99 has never been set so it returns 'ERROR'.
Input: [('SET', 'AA10', 0), ('GET', 'AA10'), ('SET', 'AA10', 0), ('GET', 'AA10')]
Expected Output: [0, 0]
Explanation: Zero is a valid stored integer. Both GET operations read the value 0.
Hints
- A hash map from label to integer is enough for this part.
- You only need to add something to the output when you see a GET operation.
Part 2: Spreadsheet with Addition Formulas
Constraints
- 0 <= len(operations) <= 10000
- All labels in the input are valid spreadsheet labels
- Formula strings contain no spaces and use only integer literals, valid cell references, and '+'
- For this part, dependency graphs are acyclic
- A later SET can overwrite any earlier cell content
Examples
Input: [('SET', 'A1', 10), ('GET', 'A1')]
Expected Output: [10]
Explanation: A1 is set to 10, so GET A1 returns 10.
Input: [('SET', 'A1', 10), ('SET', 'A1', 30), ('GET', 'A1')]
Expected Output: [30]
Explanation: The second SET overwrites the first value.
Input: [('SET', 'A1', 7), ('SET', 'B1', 3), ('GET', 'A1'), ('GET', 'B1'), ('GET', 'C1')]
Expected Output: [7, 3, 'ERROR']
Explanation: C1 was never set, so its GET returns 'ERROR'.
Input: []
Expected Output: []
Explanation: Edge case: no operations means no outputs.
Hints
- Store the raw value for each cell. If it is a formula, evaluate it only when needed.
- Split a formula on '+' and process each token as either an integer literal or another cell reference.
Part 3: Spreadsheet with Cycle Detection
Constraints
- 0 <= len(operations) <= 10000
- All labels in the input are valid spreadsheet labels
- Formula strings contain no spaces and use only integer literals, valid cell references, and '+'
- Cycles may exist and must be reported on GET
- A later SET can overwrite any earlier cell content
Examples
Input: [('SET', 'A1', 'B1+1'), ('SET', 'B1', 'A1+1'), ('GET', 'A1')]
Expected Output: ['CYCLE']
Explanation: A1 depends on B1, and B1 depends back on A1.
Input: [('SET', 'A1', 'A1+1'), ('GET', 'A1')]
Expected Output: ['CYCLE']
Explanation: A direct self-reference is also a cycle.
Input: [('SET', 'A1', 10), ('SET', 'A2', 'A1+5'), ('GET', 'A2')]
Expected Output: [15]
Explanation: A normal acyclic dependency should still evaluate correctly.
Input: [('SET', 'A1', 'B1+1'), ('SET', 'B1', 'A1+1'), ('SET', 'C1', 5), ('GET', 'C1'), ('GET', 'A1')]
Expected Output: [5, 'CYCLE']
Explanation: Independent cells should still work even if other cells are cyclic.
Input: [('SET', 'A1', 'B1+2'), ('GET', 'A1')]
Expected Output: ['ERROR']
Explanation: A missing reference is an error, not a cycle.
Hints
- During DFS evaluation, keep a set of cells currently on the recursion stack.
- Use two states: one structure for cells already fully computed in this GET, and one for cells currently being visited.