Spreadsheet Formula Engine with Cycle Detection
Company: Harvey
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Technical Screen
Quick Answer: This question evaluates a candidate's ability to design a stateful data structure that models dependency relationships and lazily propagates updates, a common theme in coding and algorithms interviews. It tests skills in graph traversal, cycle detection, and choosing between eager versus lazy evaluation to optimize for a given read/write workload. This is a practical, implementation-level problem assessing system design intuition within a single class rather than pure theoretical knowledge.
Constraints
- At most 10^4 distinct cells.
- At most 10^5 total set_cell / get_cell calls.
- Every integer value and every evaluated cell value fits in a signed 64-bit integer.
- Cell ids are one or more uppercase letters followed by one or more digits (A1, B2, AA10).
- Formula terms are separated by '+' with no spaces; each term is a non-negative integer or a cell reference.
- Formula references may point to cells that have not been set yet; treat those as 0.
- Every formula passed in is syntactically valid.
Examples
Input: (['Spreadsheet','set_cell','set_cell','set_cell','get_cell','set_cell','get_cell','set_cell','get_cell','get_cell','set_cell','get_cell'], [[], ['A1',5], ['A2',10], ['A3','=A1+A2'], ['A3'], ['A1',7], ['A3'], ['A4','=A3+3'], ['A4'], ['B9'], ['A1','=A4'], ['A1']])
Expected Output: [None, None, None, None, 15, None, 17, None, 20, 0, 'ValueError', 7]
Explanation: Full walkthrough from the prompt: lazy formula eval reflects the latest A1 (15 then 17 after A1->7); A4=A3+3=20; unset B9=0; the cyclic write A1=A4 raises ValueError and leaves A1 at 7.
Input: (['Spreadsheet','set_cell','get_cell','set_cell','get_cell','set_cell','get_cell'], [[], ['X1','=X1'], ['X1'], ['X1','=5+10'], ['X1'], ['Y2','=B9+5'], ['Y2']])
Expected Output: [None, 'ValueError', 0, None, 15, None, 5]
Explanation: Self-reference X1='=X1' is an immediate cycle (ValueError, X1 stays unset -> 0). Pure-literal formula '=5+10'=15. '=B9+5' references unset B9 (treated as 0) -> 5.
Input: (['Spreadsheet','set_cell','set_cell','get_cell','set_cell','get_cell','set_cell','get_cell'], [[], ['A1',10], ['A2','=A1+1'], ['A2'], ['A2',100], ['A2'], ['A1',50], ['A2']])
Expected Output: [None, None, None, 11, None, 100, None, 100]
Explanation: A2 depends on A1 (=11). Overwriting A2 with the literal 100 drops the dependency edge, so a later A1->50 no longer affects A2 (still 100).
Input: (['Spreadsheet','set_cell','set_cell','set_cell','get_cell','set_cell','get_cell','get_cell'], [[], ['A1',1], ['B1','=A1+1'], ['C1','=B1+1'], ['C1'], ['A1','=C1'], ['A1'], ['C1']])
Expected Output: [None, None, None, None, 3, 'ValueError', 1, 3]
Explanation: Transitive chain A1<-B1<-C1 gives C1=3. Setting A1='=C1' closes the cycle A1->C1->B1->A1, so it raises ValueError and leaves the sheet unchanged (A1=1, C1=3).
Input: (['Spreadsheet','get_cell'], [[], ['Z9']])
Expected Output: [None, 0]
Explanation: Boundary: reading a cell that was never set returns 0.
Input: (['Spreadsheet','set_cell','set_cell','get_cell'], [[], ['AA10',3], ['B2','=AA10+AA10+4'], ['B2']])
Expected Output: [None, None, None, 10]
Explanation: Multi-letter cell id (AA10) and a repeated reference: B2 = AA10 + AA10 + 4 = 3+3+4 = 10.
Hints
- Store each cell as either a literal integer or a parsed formula (a list of terms). Do not evaluate formulas on write — the workload is write-heavy, so defer evaluation to get_cell.
- Maintain a dependency graph: for a formula cell, record the set of cells it references. get_cell then recursively sums literal terms and the current values of referenced cells (unset cells resolve to 0).
- Before committing a formula write to cell C with reference set R, check whether C is reachable from any cell in R through the existing dependency edges (DFS/BFS). If so, the edges C->R would close a cycle: raise ValueError and leave every stored cell untouched.
- A literal write can never create a cycle (it removes C's outgoing edges), so only formula writes need the cycle check.