PracHub
QuestionsCoachesLearningGuidesInterview Prep

Quick Overview

This question evaluates understanding of sampled stack traces, event reconstruction, handling ambiguous or incomplete profiling data, and algorithmic reasoning about correctness and time/space complexity.

  • Medium
  • Anthropic
  • Coding & Algorithms
  • Machine Learning Engineer

Convert stack samples to execution trace

Company: Anthropic

Role: Machine Learning Engineer

Category: Coding & Algorithms

Difficulty: Medium

Interview Round: Technical Screen

You are given sampling-profiler output: a list of Sample objects ordered by timestamp ascending. Each Sample has (t: float, stack: list[str]) where stack[0] is the outermost currently executing function and stack[-1] is the innermost. An arbitrary number of calls/returns may occur between samples. Convert this into a trace of start/end events suitable for visualization. Define precise rules to infer events between consecutive samples using the longest common prefix of stacks: emit end events for frames present in the earlier stack but not in the later one (from deepest up to the divergence point), then emit start events for frames present in the later stack but not in the earlier one (from the divergence point down to deepest). Also handle the initial sample (emit starts for its whole stack) and the finalization at the end of the last sample if required. Implement a function samples_to_trace(samples) -> list[tuple[float, str, str]] that returns events as (timestamp, 's'|'e', function_name) in the correct temporal and nesting order. Discuss how you would: 1) treat duplicate function names appearing at different depths, 2) compress consecutive identical samples, 3) handle empty stacks, invalid inputs, or equal timestamps, and 4) optionally add a synthetic root to simplify logic. Provide time and space complexity and justify correctness. Example to ground the rules: for samples [Sample(0.0, ['a','b','a','c']), Sample(1.0, ['a','a','b','c'])], the emitted sequence should reflect at t=0.0: start a, start b, start a, start c; at t=1.0: end c, end a, end b; then start a, start b, start c (or an equivalent ordering consistent with your rules).

Quick Answer: This question evaluates understanding of sampled stack traces, event reconstruction, handling ambiguous or incomplete profiling data, and algorithmic reasoning about correctness and time/space complexity.

Convert sampling-profiler stack snapshots into a flat, properly nested trace of function **start** and **end** events suitable for a flame-graph–style visualization. ## What to implement Implement `solution(samples)` (equivalent to `samples_to_trace(samples)`). - **Input** — `samples` is a list (or tuple) of snapshots ordered by timestamp ascending. Each snapshot is a 2-item pair `(t, stack)`: - `t` is a numeric timestamp (`int` or `float`). - `stack` is a list of function names (strings). `stack[0]` is the **outermost** active function and `stack[-1]` is the **innermost**. - **Output** — a list of events. Each event is a tuple `(timestamp, kind, function_name)` where: - `timestamp` is the event's time as a **float** (each input `t` is coerced to `float` in the output, so an integer input timestamp appears as a float). - `kind` is `'s'` (the function starts) or `'e'` (the function ends). ## Rules Process the samples in input order and emit events using **exactly** these rules: 1. **First sample.** Emit a start event (`'s'`) for every frame in its stack, from **outermost to innermost**, all at that sample's timestamp. 2. **Each transition.** For each consecutive pair of samples `(prev_t, prev_stack)` → `(curr_t, curr_stack)`: - Compute the **longest common prefix** length `k`: scan the two stacks position by position and stop at the first index where the frames differ (or where one stack runs out). Frames are matched **only by position**. - Emit end events (`'e'`) at `curr_t` for the frames in `prev_stack[k:]`, ordered **deepest frame first**, up to the divergence point. - Then emit start events (`'s'`) at `curr_t` for the frames in `curr_stack[k:]`, ordered from the **divergence point down to the deepest frame**. 3. **End of capture.** The capture is assumed to end at the last sample's timestamp. After all transitions, close every frame still open in the **last** stack by emitting end events, **deepest first**, all at the last sample's timestamp. ## Important details - **Duplicate names are distinct frames.** The same function name at different depths is a different frame; match frames only by position when computing the longest common prefix. - **Empty stacks are allowed** and mean no function is active at that sample. - **Identical consecutive stacks produce no events.** When two adjacent stacks are equal, `k` equals their length, so both the end and start loops are empty (an implicit compression step). - **Equal timestamps are allowed** and must be processed in the given input order. - **Empty input** (`samples` is empty) returns an empty list. ## Validation Raise `ValueError` for malformed input, specifically when: - `samples` is not a list or tuple. - A sample is not a 2-item sequence `(timestamp, stack)`. - A timestamp is not a number (note: `bool` is **not** accepted as a number). - A stack is not a `list`. - Any frame name is not a string. - Timestamps decrease (each timestamp must be **≥** the previous one; equal timestamps are fine). ## Why this works The longest common prefix is exactly the portion of the call stack still active across both samples. Everything below that prefix in the earlier sample must have ended, and everything below that prefix in the later sample must have started. Emitting ends bottom-up and starts top-down keeps every frame properly nested. > You may imagine a synthetic root frame that is always present, but do **not** emit any events for it. ## Constraints - `0 <= len(samples) <= 200000` - `0 <= len(stack_i) <= 100000`, and the sum of all stack lengths is at most `400000` - Timestamps must be non-decreasing; equal timestamps are allowed - Each stack element must be a string; malformed input must raise `ValueError`

Constraints

  • 0 <= len(samples) <= 200000
  • 0 <= len(stack_i) <= 100000, and the sum of all stack lengths is at most 400000
  • Timestamps must be non-decreasing; equal timestamps are allowed
  • Each stack element must be a string; malformed input should raise ValueError

Examples

Input: [(0.0, ['a', 'b', 'a', 'c']), (1.0, ['a', 'a', 'b', 'c'])]

Expected Output: [(0.0, 's', 'a'), (0.0, 's', 'b'), (0.0, 's', 'a'), (0.0, 's', 'c'), (1.0, 'e', 'c'), (1.0, 'e', 'a'), (1.0, 'e', 'b'), (1.0, 's', 'a'), (1.0, 's', 'b'), (1.0, 's', 'c'), (1.0, 'e', 'c'), (1.0, 'e', 'b'), (1.0, 'e', 'a'), (1.0, 'e', 'a')]

Explanation: The common prefix is only the first 'a'. So at t=1.0, end c, a, b; then start a, b, c. After the last sample, close the remaining stack c, b, a, a at t=1.0.

Input: []

Expected Output: []

Explanation: No samples means no events.

Input: [(0.0, []), (1.0, ['x']), (2.0, [])]

Expected Output: [(1.0, 's', 'x'), (2.0, 'e', 'x')]

Explanation: The initial empty stack emits nothing. At t=1.0, x starts. At t=2.0, x ends. The final stack is empty, so there is nothing left to close.

Input: [(0.0, ['main']), (0.0, ['main', 'worker']), (0.0, ['main'])]

Expected Output: [(0.0, 's', 'main'), (0.0, 's', 'worker'), (0.0, 'e', 'worker'), (0.0, 'e', 'main')]

Explanation: Equal timestamps are allowed. Process transitions in the given order: start main, then start worker, then end worker, then finalize by ending main.

Input: [(0.0, ['a']), (1.0, ['a']), (2.0, ['a', 'b'])]

Expected Output: [(0.0, 's', 'a'), (2.0, 's', 'b'), (2.0, 'e', 'b'), (2.0, 'e', 'a')]

Explanation: The middle sample is identical to the first, so that transition emits nothing. At t=2.0, b starts. Then the capture ends, so b and a are closed at t=2.0.

Input: [(5.0, ['root', 'f', 'f'])]

Expected Output: [(5.0, 's', 'root'), (5.0, 's', 'f'), (5.0, 's', 'f'), (5.0, 'e', 'f'), (5.0, 'e', 'f'), (5.0, 'e', 'root')]

Explanation: Duplicate names at different depths are separate frames. The single sample opens the whole stack, and finalization closes it deepest-first at the same timestamp.

Hints

  1. For two adjacent samples, the frames before the first differing depth are the only ones guaranteed to still be running.
  2. Emit ends before starts at each transition, and treat duplicate function names as different frames if they occur at different depths.
Last updated: May 8, 2026

Loading coding console...

PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • AI Coding Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Implement a Banking System - Anthropic (medium)
  • Implement Persistent Memoization LRU Cache - Anthropic (hard)
  • Fix a Corrupted Bootloader Instruction - Anthropic (medium)
  • Implement a Time-Aware Task Manager - Anthropic (hard)
  • Implement Task Management and Duplicate Detection - Anthropic (medium)