Parse and Reconstruct Stack Trace
Company: Anthropic
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Technical Screen
Quick Answer: This question evaluates parsing and string-processing skills, hierarchical data structure design for call-tree reconstruction, exception trace analysis, and algorithmic complexity reasoning for robust log handling.
Part 1: Extract Frames from the Outermost Stack Trace Section
Constraints
- 0 <= len(trace) <= 200000
- The trace contains at most 10000 lines.
- A valid frame line must start with 'at ' after trimming and must end with '(file:line)' where line is a non-negative integer.
- Only the first exception section should be parsed; later 'Caused by:' sections are ignored.
Examples
Input: 'RuntimeError: boom\nat App::run(app.py:10)\nat Main::main(main.py:20)'
Expected Output: ['App::run(app.py:10)', 'Main::main(main.py:20)']
Explanation: Both frame lines are valid and belong to the outermost section.
Input: 'ValueError\nrandom text\n at Parser::parse(parser.py:7)\nat BadLine(nope)\nCaused by: IOError\nat IO::read(io.py:3)'
Expected Output: ['Parser::parse(parser.py:7)']
Explanation: The malformed line is skipped, and parsing stops before the cause section.
Input: ''
Expected Output: []
Explanation: Empty input has no frames.
Input: 'TopError\nCaused by: Inner\nat X::f(x.py:1)'
Expected Output: []
Explanation: There are no outermost frames before the first cause section.
Input: 'Oops\n at One::only(one.py:1) '
Expected Output: ['One::only(one.py:1)']
Explanation: Leading and trailing spaces should be ignored when validating a frame line.
Solution
def solution(trace):
def parse_frame(line):
stripped = line.strip()
if not stripped.startswith('at '):
return None
body = stripped[3:].strip()
if not body.endswith(')'):
return None
open_paren = body.rfind('(')
colon = body.rfind(':')
if open_paren == -1 or colon == -1 or colon < open_paren or colon >= len(body) - 1:
return None
line_no = body[colon + 1:-1]
if not line_no.isdigit() or not body[:open_paren]:
return None
return body
if not trace:
return []
frames = []
for line in trace.splitlines():
if line.strip().startswith('Caused by:'):
break
frame = parse_frame(line)
if frame is not None:
frames.append(frame)
return framesTime complexity: O(n), where n is the number of characters in the trace.. Space complexity: O(f), where f is the number of extracted frames..
Hints
- Process the trace line by line and stop at the first 'Caused by:' line.
- You do not need a full parser; checking the position of '(', ':', and ')' is enough to validate a frame.
Part 2: Reconstruct a Merged Call Hierarchy Tree
Constraints
- 0 <= len(trace) <= 200000
- The trace contains at most 10000 lines.
- Each section contains at most one abbreviation line of the form '... N more'.
- A valid frame line must start with 'at ' after trimming and must end with '(file:line)' where line is a non-negative integer.
Examples
Input: 'Err\nat B::b(b.py:2)\nat A::a(a.py:1)'
Expected Output: {'A::a(a.py:1)': {'B::b(b.py:2)': {}}}
Explanation: A single stack becomes a single root-to-leaf path from oldest to newest.
Input: 'Outer\nat Service::run(s.py:30)\nat Controller::handle(c.py:20)\nat Main::main(m.py:10)\nCaused by: ParseError\nat Parser::parse(p.py:5)\n... 2 more'
Expected Output: {'Main::main(m.py:10)': {'Controller::handle(c.py:20)': {'Service::run(s.py:30)': {}, 'Parser::parse(p.py:5)': {}}}}
Explanation: The cause expands to [Parser::parse, Controller::handle, Main::main], creating a branch under Controller::handle.
Input: 'Top\nat A1::f(a1.py:4)\nat Root::main(root.py:1)\nCaused by: Mid\nat B1::g(b1.py:3)\n... 1 more\nCaused by: Low\nat C1::h(c1.py:2)\n... 2 more'
Expected Output: {'Root::main(root.py:1)': {'A1::f(a1.py:4)': {}, 'B1::g(b1.py:3)': {'C1::h(c1.py:2)': {}}}}
Explanation: Each deeper cause is expanded from the previous section, producing a shared prefix and a longer branch.
Input: ''
Expected Output: {}
Explanation: An empty trace has no hierarchy.
Input: 'E\nnoise\nat X::x(x.py:2)\nat Main::main(main.py:1)\nCaused by: Y\nbad\n... 5 more'
Expected Output: {'Main::main(main.py:1)': {'X::x(x.py:2)': {}}}
Explanation: The malformed line is ignored, and the oversized abbreviation copies only the available parent suffix.
Solution
def solution(trace):
def parse_frame(line):
stripped = line.strip()
if not stripped.startswith('at '):
return None
body = stripped[3:].strip()
if not body.endswith(')'):
return None
open_paren = body.rfind('(')
colon = body.rfind(':')
if open_paren == -1 or colon == -1 or colon < open_paren or colon >= len(body) - 1:
return None
line_no = body[colon + 1:-1]
if not line_no.isdigit() or not body[:open_paren]:
return None
return body
def parse_more(line):
stripped = line.strip()
if not stripped.startswith('... ') or not stripped.endswith(' more'):
return None
middle = stripped[4:-5].strip()
if not middle.isdigit():
return None
return int(middle)
if not trace:
return {}
sections = []
current = {'frames': [], 'omitted': 0}
for line in trace.splitlines():
if line.strip().startswith('Caused by:'):
sections.append(current)
current = {'frames': [], 'omitted': 0}
continue
frame = parse_frame(line)
if frame is not None:
current['frames'].append(frame)
continue
omitted = parse_more(line)
if omitted is not None:
current['omitted'] = omitted
sections.append(current)
full_sections = []
for i, section in enumerate(sections):
frames = list(section['frames'])
omitted = section['omitted']
if omitted > 0 and i > 0:
parent = full_sections[i - 1]
k = min(omitted, len(parent))
if k > 0:
frames.extend(parent[-k:])
full_sections.append(frames)
forest = {}
for frames in full_sections:
if not frames:
continue
node = forest
for frame in reversed(frames):
if frame not in node:
node[frame] = {}
node = node[frame]
return forestTime complexity: O(n + T), where n is the number of characters and T is the total number of frames across all reconstructed sections.. Space complexity: O(T) for the reconstructed sections and merged tree..
Hints
- First reconstruct each section's full frame list; only then build the hierarchy.
- A trie built from oldest to most recent frames naturally merges shared call prefixes.
Part 3: Find the Deepest Frame of the Thrown Exception
Constraints
- 0 <= len(trace) <= 200000
- The trace contains at most 10000 lines.
- Each section contains at most one abbreviation line of the form '... N more'.
- A valid frame line must start with 'at ' after trimming and must end with '(file:line)' where line is a non-negative integer.
Examples
Input: 'Error\nat B::work(b.py:2)\nat A::main(a.py:1)'
Expected Output: 'B::work(b.py:2)'
Explanation: With no nested cause, the thrown exception is the outermost section, and its deepest node is the most recent frame.
Input: 'Outer\nat Service::run(s.py:3)\nat Main::main(m.py:1)\nCaused by: Inner\nat Parser::parse(p.py:2)\n... 1 more'
Expected Output: 'Parser::parse(p.py:2)'
Explanation: The deepest cause expands to [Parser::parse, Main::main], so the deepest node is Parser::parse.
Input: 'Top\nat A::a(a.py:4)\nat Root::main(root.py:1)\nCaused by: Mid\nat B::b(b.py:3)\n... 1 more\nCaused by: Low\nat C::c(c.py:2)\n... 2 more'
Expected Output: 'C::c(c.py:2)'
Explanation: The final cause section reconstructs to [C::c, B::b, Root::main], and the deepest node is C::c.
Input: ''
Expected Output: None
Explanation: Empty input cannot produce a frame.
Input: 'Outer\nat X::x(x.py:2)\nat Main::main(m.py:1)\nCaused by: Inner\nbad line\n... 2 more'
Expected Output: 'X::x(x.py:2)'
Explanation: The deepest cause has no valid explicit frame, but expansion recovers [X::x, Main::main].
Solution
def solution(trace):
def parse_frame(line):
stripped = line.strip()
if not stripped.startswith('at '):
return None
body = stripped[3:].strip()
if not body.endswith(')'):
return None
open_paren = body.rfind('(')
colon = body.rfind(':')
if open_paren == -1 or colon == -1 or colon < open_paren or colon >= len(body) - 1:
return None
line_no = body[colon + 1:-1]
if not line_no.isdigit() or not body[:open_paren]:
return None
return body
def parse_more(line):
stripped = line.strip()
if not stripped.startswith('... ') or not stripped.endswith(' more'):
return None
middle = stripped[4:-5].strip()
if not middle.isdigit():
return None
return int(middle)
if not trace:
return None
sections = []
current = {'frames': [], 'omitted': 0}
for line in trace.splitlines():
if line.strip().startswith('Caused by:'):
sections.append(current)
current = {'frames': [], 'omitted': 0}
continue
frame = parse_frame(line)
if frame is not None:
current['frames'].append(frame)
continue
omitted = parse_more(line)
if omitted is not None:
current['omitted'] = omitted
sections.append(current)
full_sections = []
for i, section in enumerate(sections):
frames = list(section['frames'])
omitted = section['omitted']
if omitted > 0 and i > 0:
parent = full_sections[i - 1]
k = min(omitted, len(parent))
if k > 0:
frames.extend(parent[-k:])
full_sections.append(frames)
deepest_section = full_sections[-1] if full_sections else []
if not deepest_section:
return None
return deepest_section[0]Time complexity: O(n + T), where n is the number of characters and T is the total number of frames across all reconstructed sections.. Space complexity: O(T)..
Hints
- The thrown exception is the last section, not the first one.
- To expand '... N more', reuse the suffix of the previous section's reconstructed frame list.