Find LCA in a N-ary tree via DFS
Company: Scale AI
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Technical Screen
Quick Answer: This question evaluates understanding of tree algorithms and recursive DFS traversal for computing the lowest common ancestor in an N-ary tree, including handling node references, subtree detection, and correctness in a single traversal.
Constraints
- 1 <= number of nodes <= 10^5
- The tree may be highly unbalanced
- Node ids are unique, but node values may repeat
- p_id and q_id are distinct and both exist in the tree
Examples
Input: ((1, 100, [(2, 200, [(5, 500, []), (6, 600, [])]), (3, 300, []), (4, 400, [(7, 700, []), (8, 800, [])])]), 5, 8)
Expected Output: 1
Explanation: Node 5 is in the subtree of 2 and node 8 is in the subtree of 4, so their lowest common ancestor is the root node 1.
Input: ((1, 100, [(2, 200, [(5, 500, []), (6, 600, [])]), (3, 300, []), (4, 400, [(7, 700, []), (8, 800, [])])]), 5, 6)
Expected Output: 2
Explanation: Nodes 5 and 6 are both direct children of node 2, so node 2 is their LCA.
Input: ((1, 100, [(2, 200, [(5, 500, []), (6, 600, [])]), (3, 300, []), (4, 400, [(7, 700, []), (8, 800, [])])]), 2, 6)
Expected Output: 2
Explanation: A node can be an ancestor of itself. Since node 2 is an ancestor of node 6, the LCA is 2.
Input: ((10, 7, [(20, 7, []), (30, 7, [(40, 7, []), (50, 7, [])])]), 40, 50)
Expected Output: 30
Explanation: All node values are duplicated, so ids must be used. Nodes 40 and 50 are both under node 30, making 30 the LCA.
Input: (None, 1, 2)
Expected Output: None
Explanation: An empty tree has no nodes, so no LCA exists.
Input: ((1, 9, [(2, 9, []), (3, 9, [])]), 2, 99)
Expected Output: None
Explanation: Node 99 does not appear in the tree, so there is no valid LCA.
Solution
def solution(tree, p_id, q_id):
if tree is None:
return None
root_id = tree[0]
res = {}
lca_id = None
stack = [(tree, False)] # (node, visited)
while stack:
node, visited = stack.pop()
if node is None:
continue
node_id, _, children = node
if not visited:
stack.append((node, True))
for child in children:
stack.append((child, False))
else:
found_p = (node_id == p_id)
found_q = (node_id == q_id)
for child in children:
cid = child[0]
cp, cq = res.get(cid, (False, False))
if cp:
found_p = True
if cq:
found_q = True
if lca_id is None and found_p and found_q:
lca_id = node_id
res[node_id] = (found_p, found_q)
rp, rq = res.get(root_id, (False, False))
return lca_id if rp and rq else None
Time complexity: O(n). Space complexity: O(h), where h is the height of the tree due to recursion stack.
Hints
- Let each DFS call return two booleans: whether p was found in the current subtree and whether q was found in the current subtree.
- Use postorder traversal. If children are processed before the current node, the first node whose returned pair becomes (True, True) is the deepest common ancestor.