PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep

Quick Overview

This question evaluates understanding of array and sequence processing, run-length encoding concepts, and interval representation for time-ordered traces, including handling boundary conditions and filtering runs by a minimum length k.

  • medium
  • Anthropic
  • Coding & Algorithms
  • Machine Learning Engineer

Convert Samples into Event Intervals

Company: Anthropic

Role: Machine Learning Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Technical Screen

You are given a time-ordered array `samples`, where `samples[i]` is the function name observed at integer timestamp `i`. Convert this trace into a list of events in the form `(function_name, start_time, end_time)`, where each event represents one maximal contiguous run of the same function. Use half-open intervals `[start_time, end_time)`. For example, if `samples = ["A", "A", "B", "B", "B", "A"]`, the output should be `[("A", 0, 2), ("B", 2, 5), ("A", 5, 6)]`. Follow-up: given an integer `k`, only treat a run as a valid event if the same function appears for at least `k` consecutive timestamps. Runs shorter than `k` should be discarded. Do not merge runs that were separated in the original trace, even if they have the same function name. Implement the transformation and discuss the time and space complexity.

Quick Answer: This question evaluates understanding of array and sequence processing, run-length encoding concepts, and interval representation for time-ordered traces, including handling boundary conditions and filtering runs by a minimum length k.

Part 1: Convert Samples into Maximal Contiguous Event Intervals

You are given a time-ordered list `samples`, where `samples[i]` is the function name observed at integer timestamp `i`. Convert this trace into a list of events in the form `(function_name, start_time, end_time)`, where each event represents one maximal contiguous run of the same function. Use half-open intervals `[start_time, end_time)`. That means an event starting at `s` and ending at `e` covers timestamps `s, s+1, ..., e-1`. For example, if `samples = ["A", "A", "B", "B", "B", "A"]`, the output is `[("A", 0, 2), ("B", 2, 5), ("A", 5, 6)]`. Two runs with the same function name should remain separate if they were separated anywhere in the original trace.

Constraints

  • 0 <= len(samples) <= 200000
  • Each `samples[i]` is a string
  • Timestamps are the indices `0` through `len(samples) - 1`
  • The output must use half-open intervals `[start_time, end_time)`

Examples

Input: (["A", "A", "B", "B", "B", "A"],)

Expected Output: [("A", 0, 2), ("B", 2, 5), ("A", 5, 6)]

Explanation: There are three maximal runs: A from 0 to 2, B from 2 to 5, and A from 5 to 6.

Input: ([],)

Expected Output: []

Explanation: Empty input produces no events.

Input: (["X"],)

Expected Output: [("X", 0, 1)]

Explanation: A single sample forms one run covering exactly one timestamp.

Input: (["A", "B", "A"],)

Expected Output: [("A", 0, 1), ("B", 1, 2), ("A", 2, 3)]

Explanation: The two A runs are separated by B, so they stay as separate events.

Input: (["Q", "Q", "Q", "Q"],)

Expected Output: [("Q", 0, 4)]

Explanation: All samples belong to one contiguous run.

Hints

  1. Scan from left to right and keep track of where the current run started.
  2. Whenever the function name changes, close the previous run and start a new one.

Part 2: Convert Samples into Event Intervals with Minimum Run Length k

You are given a time-ordered list `samples`, where `samples[i]` is the function name observed at integer timestamp `i`, and an integer `k`. Convert the trace into a list of events `(function_name, start_time, end_time)`, but only keep a run if the same function appears for at least `k` consecutive timestamps. Use half-open intervals `[start_time, end_time)`. Runs shorter than `k` must be discarded. Important: do not merge runs that were separated in the original trace, even if the separated runs have the same function name and the run between them was discarded. Example: if `samples = ["A", "A", "B", "B", "B", "A"]` and `k = 2`, the output is `[("A", 0, 2), ("B", 2, 5)]` because the final `A` run has length 1 and is discarded.

Constraints

  • 0 <= len(samples) <= 200000
  • Each `samples[i]` is a string
  • 1 <= k <= 1000000000
  • Do not merge separate runs, even if short runs between them are discarded
  • The output must use half-open intervals `[start_time, end_time)`

Examples

Input: (["A", "A", "B", "B", "B", "A"], 2)

Expected Output: [("A", 0, 2), ("B", 2, 5)]

Explanation: The last A run has length 1, so it is discarded.

Input: (["A", "A", "B", "A", "A"], 2)

Expected Output: [("A", 0, 2), ("A", 3, 5)]

Explanation: The middle B run is discarded, but the two A runs must remain separate because they were not contiguous in the original trace.

Input: (["A", "B", "A", "A", "C", "C", "C", "A"], 2)

Expected Output: [("A", 2, 4), ("C", 4, 7)]

Explanation: Only the A run from 2 to 4 and the C run from 4 to 7 meet the minimum length.

Input: ([], 3)

Expected Output: []

Explanation: Empty input produces no events.

Input: (["Z", "Z", "Z"], 1)

Expected Output: [("Z", 0, 3)]

Explanation: With k = 1, every run is valid, so the whole array becomes one event.

Hints

  1. First identify each maximal contiguous run exactly as in Part 1.
  2. When a run ends, compute its length as `end - start` and append it only if that length is at least `k`.
Last updated: Apr 28, 2026

Loading coding console...

PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Implement a Banking System - Anthropic (medium)
  • Implement Persistent Memoization LRU Cache - Anthropic (hard)
  • Fix a Corrupted Bootloader Instruction - Anthropic (medium)
  • Implement a Time-Aware Task Manager - Anthropic (hard)
  • Implement a Simplified DNS Resolver - Anthropic (hard)