PracHub
QuestionsCoachesLearningGuidesInterview Prep

Quick Overview

This question evaluates a candidate's ability to design an efficient hashing and frequency-counting solution over sequences, along with careful handling of custom tie-breaking rules. It falls under coding and algorithms, testing practical application of data structures for grouping composite keys (ordered sequences) at scale, plus precise attention to edge cases and multi-part follow-up requirements.

  • medium
  • Roblox
  • Coding & Algorithms
  • Software Engineer

Most Frequent Call Stack from Profiler Samples

Company: Roblox

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Technical Screen

A sampling profiler periodically takes a snapshot of a running program's **call stack**. Each snapshot is the chain of active function calls at that instant, listed from the **outermost** caller (for example `main`) down to the **innermost** function that is currently executing. The same stack can be captured many times across the run. Given a list of such snapshots, return the **call stack that was captured most often**. A call stack is represented as a list of function-name strings, ordered outermost → innermost. Two stacks are considered identical only if they have the same frames in the same order (so `["main", "a", "b"]` and `["main", "b", "a"]` are different stacks). Implement: ```python def most_frequent_call_stack(samples: list[list[str]]) -> list[str]: ... ``` `samples[i]` is one snapshot. Return the full stack (the list of frames) that appears the most times. **Tie-breaking.** If two or more distinct stacks are tied for the highest count, break ties in this order: 1. Prefer the **deeper** stack (the one with more frames). 2. If still tied, prefer the stack that is **lexicographically smaller** when its frames are compared element by element (compare frame 0 first, then frame 1, and so on). If `samples` is empty, return `[]`. ### Constraints - `0 <= len(samples) <= 10^5` - `1 <= len(samples[i]) <= 500` for every non-empty snapshot. - The total number of frames across all snapshots is at most `10^6`. - Each frame is a non-empty string of up to 64 ASCII characters (letters, digits, `_`, `.`, `:`). ### Examples - `samples = [["main","handleRequest","parseJson"], ["main","handleRequest","parseJson"], ["main","handleRequest"], ["main","render","drawText"]]` → `["main","handleRequest","parseJson"]` (the `parseJson` stack occurs twice; every other stack occurs once.) - `samples = [["a"], ["a","b"], ["c","d"]]` → `["a","b"]` (every stack occurs once → tie; the deepest stack `["a","b"]` wins.) - `samples = [["x","y"], ["x","z"]]` → `["x","y"]` (both occur once and both have depth 2 → tie broken lexicographically: `"y" < "z"`.) - `samples = []` → `[]` ### Follow-up (extension) Now each snapshot is tagged with the **thread id** it came from: the input is a list of `(thread_id, stack)` pairs. Return a mapping from each `thread_id` to that thread's own most-frequent stack (using the same tie-breaking rules independently per thread). Reuse the single-thread logic above as the per-thread building block.

Quick Answer: This question evaluates a candidate's ability to design an efficient hashing and frequency-counting solution over sequences, along with careful handling of custom tie-breaking rules. It falls under coding and algorithms, testing practical application of data structures for grouping composite keys (ordered sequences) at scale, plus precise attention to edge cases and multi-part follow-up requirements.

Most Frequent Call Stack from Profiler Samples

A sampling profiler periodically snapshots a running program's **call stack** — the chain of active function calls at that instant, listed **outermost caller** (e.g. `main`) down to the **innermost** executing function. The same stack can be captured many times. Given a list of such snapshots, return the **call stack captured most often**. A call stack is a list of frame strings ordered outermost -> innermost. Two stacks are identical only if they have the same frames in the same order (so `["main","a","b"]` != `["main","b","a"]`). ```python def most_frequent_call_stack(samples: list[list[str]]) -> list[str]: ... ``` **Tie-breaking** (when >=2 distinct stacks share the highest count): 1. Prefer the **deeper** stack (more frames). 2. If still tied, prefer the stack that is **lexicographically smaller** comparing frames element by element. If `samples` is empty, return `[]`.

Constraints

  • 0 <= len(samples) <= 10^5
  • 1 <= len(samples[i]) <= 500 for every non-empty snapshot
  • Total number of frames across all snapshots is at most 10^6
  • Each frame is a non-empty string of up to 64 ASCII characters (letters, digits, '_', '.', ':')

Examples

Input: ([['main','handleRequest','parseJson'],['main','handleRequest','parseJson'],['main','handleRequest'],['main','render','drawText']],)

Expected Output: ['main', 'handleRequest', 'parseJson']

Explanation: The parseJson stack occurs twice; every other stack occurs once, so it wins outright.

Input: ([['a'],['a','b'],['c','d']],)

Expected Output: ['a', 'b']

Explanation: All three stacks occur once (a tie). The deepest stacks are ['a','b'] and ['c','d'] (depth 2); between them ['a','b'] is lexicographically smaller.

Input: ([['x','y'],['x','z']],)

Expected Output: ['x', 'y']

Explanation: Both occur once and both have depth 2, so the tie is broken lexicographically: 'y' < 'z'.

Input: ([],)

Expected Output: []

Explanation: Empty input returns an empty stack.

Input: ([['solo']],)

Expected Output: ['solo']

Explanation: A single snapshot is trivially the most frequent.

Input: ([['a','b','c'],['a','b','c'],['d'],['d']],)

Expected Output: ['a', 'b', 'c']

Explanation: Both stacks occur twice (count tie); the deeper ['a','b','c'] beats ['d'].

Input: ([['b','a'],['b','a'],['a','z'],['a','z']],)

Expected Output: ['a', 'z']

Explanation: Both occur twice with depth 2; lexicographic comparison of frame 0 gives 'a' < 'b', so ['a','z'] wins.

Hints

  1. A whole stack (an ordered list of frames) is the key you are counting. Convert each list to a hashable tuple and tally counts with a hash map.
  2. You need three ranking criteria at once: highest count, then greatest depth, then lexicographically smallest frames. Encode them in a single sort key.
  3. In Python, `min(counts, key=lambda st: (-counts[st], -len(st), st))` returns the winner: negating count and depth turns 'maximize' into 'minimize', and the tuple itself breaks the final lexicographic tie.

Most Frequent Call Stack per Thread (Follow-up)

Extend the profiler analysis: now each snapshot is tagged with the **thread id** it came from. The input is a list of `(thread_id, stack)` pairs. ```python def most_frequent_call_stack_per_thread(tagged_samples): ... ``` Return a **mapping** from each `thread_id` to that thread's own most-frequent call stack, applying the **same tie-breaking rules independently per thread**: 1. Highest count. 2. Then the deeper stack. 3. Then the lexicographically smaller stack. Reuse the single-thread logic from the previous part as the per-thread building block. If `tagged_samples` is empty, return an empty mapping.

Constraints

  • 0 <= number of (thread_id, stack) pairs <= 10^5
  • 1 <= len(stack) <= 500 for every snapshot
  • Total number of frames across all snapshots is at most 10^6
  • thread_id is any hashable value; each frame is a non-empty ASCII string of up to 64 characters

Examples

Input: ([('t1', ['main','a']), ('t1', ['main','a']), ('t2', ['x']), ('t2', ['x','y'])],)

Expected Output: {'t1': ['main', 'a'], 't2': ['x', 'y']}

Explanation: Thread t1 sees ['main','a'] twice (clear winner). Thread t2 sees two distinct stacks each once, so the deeper ['x','y'] wins.

Input: ([],)

Expected Output: {}

Explanation: No samples means an empty mapping.

Input: ([('t', ['f'])],)

Expected Output: {'t': ['f']}

Explanation: A single tagged snapshot maps its thread to that lone stack.

Input: ([('A', ['m','p']), ('A', ['m','p']), ('A', ['m','q']), ('B', ['b','a']), ('B', ['a','z'])],)

Expected Output: {'A': ['m', 'p'], 'B': ['a', 'z']}

Explanation: Thread A: ['m','p'] occurs twice vs ['m','q'] once. Thread B: two stacks tie at count 1 and depth 2, broken lexicographically ('a' < 'b') to ['a','z'].

Input: ([(1, ['x']), (1, ['y','z']), (1, ['x'])],)

Expected Output: {1: ['x']}

Explanation: Non-string thread ids work too; thread 1 sees ['x'] twice, beating ['y','z'].

Hints

  1. First partition the pairs by thread id into buckets, each bucket holding just that thread's list of stacks.
  2. Run the exact single-thread routine from Part 1 on each bucket independently — do not share counts across threads.
  3. Return a dict/map from thread id to the winning stack. An empty input yields an empty map.
Last updated: Jul 1, 2026

Loading coding console...

PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • AI Coding Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Level-Ordered Dependency Build Order - Roblox (medium)
  • Find Windows Containing a Target - Roblox (medium)
  • Implement Sliding-Window Rate Limiter - Roblox (medium)
  • Find target-heavy sliding windows - Roblox (medium)
  • Find most frequent call path in logs - Roblox (medium)