Design Spreadsheet Formula Cells
Company: Harvey
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Onsite
Quick Answer: This question evaluates ability to design and implement data structures and algorithms for expression parsing, dependency tracking, incremental evaluation, and cycle detection in an in-memory spreadsheet.
Part 1: Evaluate Spreadsheet Cells Without Cycle Detection
Constraints
- 0 <= len(operations) <= 10^4
- cellName consists of uppercase letters followed by digits, for example A1 or BC12
- Integer literals fit in 32-bit signed range
- Formulas contain only cell references, integer constants, and '+'
- For this part, the operations never create a circular dependency
Examples
Input: ([('SET', 'A1', 10), ('GET', 'A1')],)
Expected Output: [10]
Explanation: A1 is assigned a literal value, so GET A1 returns 10.
Input: ([('SET', 'A1', 'B1 + C1 + 5'), ('GET', 'A1'), ('SET', 'B1', 3), ('GET', 'A1'), ('SET', 'C1', 'B1 + 2'), ('GET', 'A1')],)
Expected Output: [5, 8, 13]
Explanation: Unset cells count as 0 at first, then later updates to B1 and C1 change A1's evaluated value.
Input: ([('SET', 'B1', 1), ('SET', 'C1', 'B1 + 2'), ('SET', 'A1', 'B1 + C1'), ('GET', 'A1'), ('SET', 'B1', 5), ('GET', 'A1')],)
Expected Output: [4, 12]
Explanation: A1 depends on both B1 and C1, and C1 also depends on B1, so changing B1 affects both parts of A1.
Input: ([('GET', 'Z99')],)
Expected Output: [0]
Explanation: Edge case: an unset cell evaluates to 0.
Hints
- Store each cell's raw content instead of eagerly updating every dependent cell.
- When evaluating a GET, use DFS recursion with a per-query memo dictionary so repeated references are only computed once.
Part 2: Reject Circular Spreadsheet Formulas
Constraints
- 0 <= len(operations) <= 10^4
- cellName consists of uppercase letters followed by digits
- Integer literals fit in 32-bit signed range
- Formulas contain only cell references, integer constants, and '+'
- If a SET is rejected, the previous content of that cell must remain unchanged
Examples
Input: ([('SET', 'A1', 10), ('GET', 'A1'), ('SET', 'B1', 'A1 + 5'), ('GET', 'B1')],)
Expected Output: [True, 10, True, 15]
Explanation: Both SET operations are valid. B1 evaluates to A1 + 5 = 15.
Input: ([('SET', 'A1', 'B1 + 1'), ('SET', 'C1', 'A1 + 2'), ('SET', 'B1', 'C1'), ('GET', 'A1'), ('GET', 'B1'), ('GET', 'C1')],)
Expected Output: [True, True, False, 1, 0, 3]
Explanation: The third SET would create B1 -> C1 -> A1 -> B1, so it is rejected. B1 stays unset and therefore equals 0.
Input: ([('SET', 'A1', 7), ('SET', 'A1', 'A1 + 1'), ('GET', 'A1')],)
Expected Output: [True, False, 7]
Explanation: A direct self-reference is a cycle, so the second SET is rejected and A1 keeps its old value.
Input: ([('GET', 'Z9')],)
Expected Output: [0]
Explanation: Edge case: reading an unset cell returns 0.
Hints
- Extract the referenced cells from a formula and think of them as directed edges from the assigned cell to its dependencies.
- When setting cell X, the update is invalid exactly when some new dependency can already reach X in the current dependency graph.