Find the Slowest Function from Profiler Events
Company: Anthropic
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Onsite
Quick Answer: This question evaluates the ability to interpret time-ordered profiler event logs, compute inclusive and exclusive function runtimes, reason about call-stack semantics, and recognize anomalies such as recursion, cycles, missing end events, or non-returning functions.
Part 1: Compute Total Inclusive Running Time for Each Function
Constraints
- 0 <= len(events) <= 200000
- 0 <= timestamp <= 10^9
- The trace is properly nested and every 'START' has a matching 'END'
Examples
Input: [('START', 'a', 0), ('START', 'b', 2), ('END', 'b', 5), ('START', 'c', 6), ('END', 'c', 10), ('END', 'a', 12)]
Expected Output: {'a': 12, 'b': 3, 'c': 4}
Explanation: Function 'a' runs from 0 to 12 inclusive of child calls, so its inclusive time is 12. Functions 'b' and 'c' contribute 3 and 4.
Input: [('START', 'x', 1), ('END', 'x', 4), ('START', 'x', 5), ('END', 'x', 9)]
Expected Output: {'x': 7}
Explanation: The same function is called twice. Total inclusive time is (4 - 1) + (9 - 5) = 7.
Input: []
Expected Output: {}
Explanation: An empty log produces an empty timing table.
Input: [('START', 'm', 7), ('END', 'm', 7)]
Expected Output: {'m': 0}
Explanation: A call that starts and ends at the same timestamp has duration 0 under the end - start convention.
Hints
- Use a stack to remember the active calls and their start times.
- When you see an 'END', pop the matching 'START' and add end - start to that function's total.
Part 2: Compute Total Exclusive Running Time for Each Function
Constraints
- 0 <= len(events) <= 200000
- 0 <= timestamp <= 10^9
- The trace is properly nested and every 'START' has a matching 'END'
Examples
Input: [('START', 'a', 0), ('START', 'b', 2), ('END', 'b', 5), ('START', 'c', 6), ('END', 'c', 10), ('END', 'a', 12)]
Expected Output: {'a': 5, 'b': 3, 'c': 4}
Explanation: Function 'a' gets time [0,2), [5,6), and [10,12), for a total of 5. Child functions contribute their own exclusive times.
Input: [('START', 'main', 1), ('END', 'main', 9)]
Expected Output: {'main': 8}
Explanation: With no child calls, inclusive and exclusive times are the same.
Input: [('START', 'a', 0), ('END', 'a', 2), ('START', 'b', 2), ('END', 'b', 5), ('START', 'a', 5), ('END', 'a', 8)]
Expected Output: {'a': 5, 'b': 3}
Explanation: Function 'a' runs in two separate calls: 2 + 3 = 5. Function 'b' runs for 3.
Input: []
Expected Output: {}
Explanation: No events means no exclusive time for any function.
Input: [('START', 'a', 0), ('START', 'b', 2), ('END', 'b', 2), ('END', 'a', 5)]
Expected Output: {'a': 5, 'b': 0}
Explanation: Child 'b' has zero duration. Function 'a' still gets all 5 time units outside that zero-length child.
Hints
- Track the currently running function with a stack.
- Keep a prev_time pointer: when a new event happens, the function currently on top of the stack gets the time since prev_time.
Part 3: Return the Slowest Function by Exclusive Time
Constraints
- 0 <= len(events) <= 200000
- 0 <= timestamp <= 10^9
- The trace is properly nested and every 'START' has a matching 'END'
Examples
Input: [('START', 'a', 0), ('START', 'b', 2), ('END', 'b', 5), ('START', 'c', 6), ('END', 'c', 10), ('END', 'a', 12)]
Expected Output: {'max_exclusive_time': 5, 'functions': ['a']}
Explanation: Exclusive times are a=5, b=3, c=4, so 'a' is the unique slowest function.
Input: [('START', 'a', 0), ('END', 'a', 3), ('START', 'b', 5), ('END', 'b', 8)]
Expected Output: {'max_exclusive_time': 3, 'functions': ['a', 'b']}
Explanation: Both functions have exclusive time 3, so both are returned in sorted order.
Input: [('START', 'main', 0), ('START', 'io', 1), ('END', 'io', 6), ('END', 'main', 8)]
Expected Output: {'max_exclusive_time': 5, 'functions': ['io']}
Explanation: Function 'main' has exclusive time 3, while 'io' has 5.
Input: []
Expected Output: {'max_exclusive_time': 0, 'functions': []}
Explanation: With no events, there is no slowest function.
Input: [('START', 'f', 0), ('END', 'f', 2), ('START', 'g', 2), ('END', 'g', 6), ('START', 'f', 6), ('END', 'f', 10)]
Expected Output: {'max_exclusive_time': 6, 'functions': ['f']}
Explanation: Function 'f' runs for 2 and then 4, totaling 6 exclusive time, which exceeds 'g'.
Hints
- You can reuse the exclusive-time stack algorithm from the previous part.
- After building the exclusive-time table, scan it once to find the maximum and collect all tied names.
Part 4: Detect and Handle Malformed Profiler Traces
Constraints
- 0 <= len(events) <= 200000
- 0 <= timestamp <= observation_end <= 10^9
- The trace may contain recursion, indirect call-graph cycles, mismatched END events, and unfinished calls
Examples
Input: ([('START', 'a', 0), ('START', 'b', 1), ('END', 'b', 3), ('END', 'a', 4)], 4)
Expected Output: {'recursive_functions': [], 'cycle_functions': [], 'missing_end_functions': [], 'mismatched_end_events': 0, 'suspected_non_returning': None}
Explanation: This is a clean trace with no malformed behavior.
Input: ([('START', 'a', 0), ('START', 'a', 1), ('END', 'a', 2), ('END', 'a', 3)], 3)
Expected Output: {'recursive_functions': ['a'], 'cycle_functions': [], 'missing_end_functions': [], 'mismatched_end_events': 0, 'suspected_non_returning': None}
Explanation: Function 'a' starts while already active, so recursion is detected. This problem reports direct recursion separately from multi-function call-graph cycles.
Input: ([('START', 'a', 0), ('START', 'b', 1), ('END', 'b', 2), ('END', 'a', 3), ('START', 'b', 4), ('START', 'a', 5), ('END', 'a', 6), ('END', 'b', 7)], 7)
Expected Output: {'recursive_functions': [], 'cycle_functions': ['a', 'b'], 'missing_end_functions': [], 'mismatched_end_events': 0, 'suspected_non_returning': None}
Explanation: The observed call graph contains both a -> b and b -> a, so 'a' and 'b' belong to a cycle.
Input: ([('START', 'main', 0), ('START', 'worker', 2)], 10)
Expected Output: {'recursive_functions': [], 'cycle_functions': [], 'missing_end_functions': ['main', 'worker'], 'mismatched_end_events': 0, 'suspected_non_returning': 'worker'}
Explanation: Both calls are still open when observation ends, and the deepest active function is 'worker'.
Input: ([('START', 'a', 0), ('START', 'b', 1), ('END', 'a', 2), ('END', 'b', 3), ('END', 'a', 4)], 4)
Expected Output: {'recursive_functions': [], 'cycle_functions': [], 'missing_end_functions': [], 'mismatched_end_events': 1, 'suspected_non_returning': None}
Explanation: The first END for 'a' is mismatched because 'b' is on top of the stack, so it is counted and ignored. The later valid END events then close the stack correctly.
Hints
- Use one stack for the active call frames, and a graph of caller -> callee edges collected from 'START' events.
- Recursion is about the current stack, while call-graph cycles are about the whole observed graph. Strongly connected components are a clean way to find graph cycles.