Count super-streak segments in an event stream
Company: Google
Role: Data Scientist
Category: Coding & Algorithms
Difficulty: easy
Interview Round: Technical Screen
You are given a time-ordered sequence of events. Each event has:
- `type`, a string or integer event type.
- `ts`, an integer timestamp in milliseconds or seconds.
A **streak segment** is a maximal contiguous subsequence of the input where:
1. All events have the same `type`.
2. For every adjacent pair in the segment, the time gap satisfies `ts[i] - ts[i-1] <= T`.
A streak segment is a **super streak** if:
- The number of events in the segment is at least `N`.
- The duration `ts[last] - ts[first]` is at least `X`.
Implement a function that returns the number of super streak segments.
### Constraints & Assumptions
- Events are already sorted by timestamp ascending.
- `T`, `N`, and `X` are non-negative.
- The gap threshold is inclusive: a gap exactly equal to `T` continues the segment.
- A single event has duration 0.
- Count maximal streak segments, not every subarray inside a streak.
### Clarifying Questions to Ask
- Can timestamps be equal?
- Can `N` be 0 or 1, and how should that be interpreted?
- Is the input too large to store, or can it be processed as a stream?
- Should the function return only the count or also the segment boundaries?
### Part 1 - Define Segment Boundaries
When does a current streak segment end?
#### What This Part Should Cover
- A segment ends when the next event has a different type or the time gap exceeds `T`.
- The current segment must be evaluated once at the boundary.
- Maximality means you should not count subsegments separately.
### Part 2 - Design The Algorithm
How would you count super streak segments efficiently?
#### What This Part Should Cover
- One pass through the events.
- Maintain current segment type, first timestamp, last timestamp, and count.
- Evaluate the segment at each break and after the loop.
### Part 3 - Analyze Complexity And Edge Cases
What is the time and space complexity, and what edge cases matter?
#### What This Part Should Cover
- `O(n)` time and `O(1)` extra space.
- Empty input, single event, exact threshold gap, timestamp duplicates, `X = 0`, `N = 1`, and type changes.
### Part 4 - Handle Changing Parameters Over Time
How would you adapt the solution if `N`, `T`, or `X` can change over time?
#### What This Part Should Cover
- If thresholds are attached to each event or time window, segment-breaking rules must use the active `T` at the relevant adjacent gap.
- For changing `N` or `X`, evaluate completed segments against the thresholds active for that segment or query.
- For many offline threshold queries, precompute segment summaries and answer queries from those summaries rather than rescanning raw events.
### What a Strong Answer Covers
- Correctly counts maximal segments only.
- Handles the final open segment.
- Explains inclusive gap logic and duration.
- Provides a clean implementation and discusses streaming and changing-threshold follow-ups.
### Follow-up Questions
- How would you return the actual segment ranges?
- What if events are not sorted?
- What if the data arrives as an unbounded stream?
- How would you answer many different `(N, T, X)` queries efficiently?
Quick Answer: Count super-streak segments in a time-ordered event stream with an O(n) scan. The solution explains maximal segment boundaries, inclusive gap handling, duration and count checks, edge cases, streaming behavior, and changing-threshold follow-ups.
Count maximal same-type event segments whose adjacent gaps are <= T, length >= N, and duration >= X.
Constraints
- Events are sorted by timestamp
- Gap threshold is inclusive
Examples
Input: ([('a', 1), ('a', 2), ('a', 5), ('b', 6), ('b', 7)], 3, 2, 1)
Expected Output: 2
Explanation: Two qualifying segments.
Input: ([], 1, 1, 0)
Expected Output: 0
Explanation: No events.
Input: ([('a', 1)], 0, 1, 0)
Expected Output: 1
Explanation: Single event can qualify when N=1 and X=0.
Input: ([('a', 1), ('a', 4)], 3, 2, 3)
Expected Output: 1
Explanation: Gap exactly T continues segment.
Hints
- Evaluate each maximal segment exactly once when it breaks.