Simulate stack traces from logs
Company: Anthropic
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Onsite
Quick Answer: This question evaluates stack-based simulation for computing exclusive execution time in nested function call logs and stream-processing concepts for handling equal timestamps, slight reordering, and detection of N consecutive categorical events.
Part 1: Exclusive execution time from nested logs
Constraints
- 0 <= len(logs) <= 2 * 10^5
- Each log is a tuple `(id, event, timestamp)`
- `event` is either `'START'` or `'END'`
- Timestamps are integers and may be equal
- The given log order is valid and represents properly nested calls
Examples
Input: ([(0, 'START', 0), (1, 'START', 2), (1, 'END', 5), (0, 'END', 6)],)
Expected Output: {0: 3, 1: 3}
Explanation: Function 0 runs from 0 to 2, function 1 runs from 2 to 5, then function 0 resumes from 5 to 6.
Input: ([(2, 'START', 1), (2, 'END', 4), (5, 'START', 4), (5, 'END', 7)],)
Expected Output: {2: 3, 5: 3}
Explanation: Two top-level calls run back-to-back with no nesting.
Input: ([(0, 'START', 0), (0, 'START', 2), (0, 'END', 4), (0, 'END', 5)],)
Expected Output: {0: 5}
Explanation: The same function id appears in nested recursive calls, and times must be aggregated by id.
Input: ([],)
Expected Output: {}
Explanation: Edge case: no logs means no execution time.
Input: ([(7, 'START', 10), (7, 'END', 10)],)
Expected Output: {7: 0}
Explanation: Edge case: a zero-length call contributes 0 time.
Solution
def solution(logs):
from collections import defaultdict
times = defaultdict(int)
stack = []
prev_time = None
for fid, event, timestamp in logs:
times.setdefault(fid, 0)
if prev_time is None:
prev_time = timestamp
if stack:
times[stack[-1]] += timestamp - prev_time
if event == 'START':
stack.append(fid)
elif event == 'END':
if not stack or stack[-1] != fid:
raise ValueError('Invalid log sequence')
stack.pop()
else:
raise ValueError('Unknown event type')
prev_time = timestamp
if stack:
raise ValueError('Unclosed function calls')
return dict(times)
Time complexity: O(n), where n is the number of logs. Space complexity: O(u + d), where u is the number of distinct ids and d is the maximum call-stack depth.
Hints
- Keep a stack of currently active function calls.
- When a new log at time `t` arrives, the function on top of the stack has been running since the previous timestamp.
Part 2: Online exclusive time with equal timestamps and slightly out-of-order logs
Constraints
- 0 <= len(logs) <= 2 * 10^5
- 0 <= max_delay <= len(logs)
- Each log is a tuple `(id, event, timestamp)`
- `event` is either `'START'` or `'END'`
- Each log is at most `max_delay` positions later than its correct position in the stable timestamp order
- After stable sorting by `(timestamp, arrival_index)`, the call sequence is valid and properly nested
Examples
Input: ([(0, 'START', 0), (1, 'END', 5), (1, 'START', 2), (0, 'END', 6)], 1)
Expected Output: {0: 3, 1: 3}
Explanation: The middle two logs are swapped in arrival order. A buffer of size 2 restores the chronological order.
Input: ([(0, 'START', 0), (1, 'START', 2), (1, 'END', 2), (0, 'END', 5)], 0)
Expected Output: {0: 5, 1: 0}
Explanation: Equal timestamps are allowed. Function 1 starts and ends at the same time, so it contributes 0.
Input: ([(3, 'START', 3), (2, 'START', 1), (3, 'END', 3), (2, 'END', 6)], 1)
Expected Output: {2: 5, 3: 0}
Explanation: After stable reordering by timestamp, function 2 encloses a zero-length call to function 3.
Input: ([], 3)
Expected Output: {}
Explanation: Edge case: an empty stream produces an empty result.
Solution
def solution(logs, max_delay):
from collections import defaultdict
import heapq
if max_delay < 0:
raise ValueError('max_delay must be non-negative')
times = defaultdict(int)
stack = []
prev_time = None
heap = []
def process(entry):
nonlocal prev_time
fid, event, timestamp = entry
times.setdefault(fid, 0)
if prev_time is None:
prev_time = timestamp
if stack:
times[stack[-1]] += timestamp - prev_time
if event == 'START':
stack.append(fid)
elif event == 'END':
if not stack or stack[-1] != fid:
raise ValueError('Invalid log sequence after reordering')
stack.pop()
else:
raise ValueError('Unknown event type')
prev_time = timestamp
for arrival_index, entry in enumerate(logs):
timestamp = entry[2]
heapq.heappush(heap, (timestamp, arrival_index, entry))
if len(heap) > max_delay + 1:
_, _, smallest = heapq.heappop(heap)
process(smallest)
while heap:
_, _, smallest = heapq.heappop(heap)
process(smallest)
if stack:
raise ValueError('Unclosed function calls')
return dict(times)
Time complexity: O(n log(max_delay + 1)), where n is the number of logs. Space complexity: O(max_delay + d + u), where d is the maximum corrected call-stack depth and u is the number of distinct ids.
Hints
- If every item is at most `k` positions late, a min-heap of size `k + 1` can emit the next correct item online.
- Once a corrected log is emitted, the exclusive-time accounting is the same stack-based idea as in Part 1.
Part 3: Earliest run of N identical consecutive events
Constraints
- 0 <= len(events) <= 2 * 10^5
- 1 <= N <= 2 * 10^5
- Event values only need to support equality comparison
- Indices are 0-based
Examples
Input: (['INFO', 'WARN', 'WARN', 'WARN', 'ERROR'], 3)
Expected Output: 1
Explanation: The earliest run of length 3 is `'WARN', 'WARN', 'WARN'`, starting at index 1.
Input: ([1, 1, 2, 2, 2, 2], 4)
Expected Output: 2
Explanation: The value 2 appears four times consecutively starting at index 2.
Input: (['A', 'A', 'B', 'B'], 3)
Expected Output: -1
Explanation: No value appears 3 times in a row.
Input: ([], 1)
Expected Output: -1
Explanation: Edge case: an empty stream cannot contain any run.
Input: (['X', 'Y'], 1)
Expected Output: 0
Explanation: Edge case: when `N = 1`, the first event already forms a valid window.
Solution
def solution(events, N):
if N <= 0:
raise ValueError('N must be positive')
run_value = None
run_length = 0
for i, value in enumerate(events):
if value == run_value:
run_length += 1
else:
run_value = value
run_length = 1
if run_length == N:
return i - N + 1
return -1
Time complexity: O(n), where n is the number of events. Space complexity: O(1).
Hints
- You do not need a sliding-window frequency map; only the current run matters.
- As soon as the current run length reaches `N`, you have found the earliest valid start.