Process time intervals and detect overlaps/gaps
Company: Sierra
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Take-home Project
You are given `n` time intervals representing debug log segments. Each interval has a start time, an end time, and an associated text message.
- Each interval is represented as `(start, end, text)`.
- `start` and `end` are integers representing timestamps in seconds.
- Assume all intervals are half-open: they cover `[start, end)`, i.e., they include `start` and exclude `end`.
- The input intervals are not guaranteed to be in order and may overlap or leave gaps.
### Task 1: Output ordered, merged intervals
Write a function that:
1. Takes as input a list of intervals `[(start_i, end_i, text_i)]`.
2. Sorts them by time and merges overlapping intervals.
- When two or more intervals overlap in time, merge them into a single interval `[min_start, max_end)` where:
- `min_start` is the smallest start time among those intervals.
- `max_end` is the largest end time among those intervals.
- For the merged interval's text, concatenate the texts of the merged intervals in chronological order, separated by a single space.
3. Returns the merged, non-overlapping intervals in ascending order of start time as a list of `(start, end, merged_text)`.
Example (one possible style of input/output):
- Input:
- `[(0, 5, "A"), (3, 10, "B"), (12, 15, "C")]`
- Expected output:
- `[(0, 10, "A B"), (12, 15, "C")]`
(You can assume that the concatenation order for texts is determined by the start time of each interval; if start times are equal, you may break ties arbitrarily.)
### Task 2 (Follow-up): Detect overlaps and gaps
Using the same input model (original, unmerged intervals):
1. Determine whether **any** pair of intervals overlaps in time.
- Two intervals `[s1, e1)` and `[s2, e2)` overlap if their intersection is non-empty, i.e., `max(s1, s2) < min(e1, e2)`.
2. Determine whether there exists **any gap** between the earliest start time and the latest end time.
- A gap exists if, after ordering the intervals by start time, there is at least one pair of consecutive intervals where `next.start > current.end`.
Define and implement appropriate functions/methods to:
- Produce the ordered, merged list of intervals with concatenated texts (Task 1).
- Return booleans or equivalent indicators for:
- "there is at least one overlap" and
- "there is at least one gap" (Task 2).
State the expected time and space complexity of your approach in terms of `n` (the number of intervals).
Quick Answer: This question evaluates the ability to process and manipulate time intervals: sorting and merging overlapping ranges, concatenating associated textual metadata, detecting overlaps and gaps, and reasoning about time and space complexity.
Part 1: Merge Overlapping Log Segments
You are given an unsorted list of debug log intervals. Each interval is a tuple (start, end, text), representing the half-open time range [start, end) and its message. Sort the intervals by time and merge every set of overlapping intervals into one interval. Two intervals overlap only if their intersection is non-empty, so intervals that only touch at an endpoint, such as [1, 4) and [4, 6), must not be merged. For each merged block, concatenate the texts of all intervals inside it in chronological order, separated by a single space. For deterministic output, if two intervals have the same start time, order them by smaller end time first; if both start and end are equal, keep their original input order.
Constraints
- 0 <= n <= 200000, where n is the number of intervals
- 0 <= start < end <= 10^9
- Each text is a non-empty string
- Intervals are half-open: [start, end)
Examples
Input: [(0, 5, 'A'), (3, 10, 'B'), (12, 15, 'C')]
Expected Output: [(0, 10, 'A B'), (12, 15, 'C')]
Explanation: The first two intervals overlap, so they merge into [0, 10). The third interval is separate.
Input: [(1, 4, 'X'), (4, 6, 'Y'), (5, 8, 'Z')]
Expected Output: [(1, 4, 'X'), (4, 8, 'Y Z')]
Explanation: The first and second intervals only touch, so they do not merge. The second and third do overlap and merge.
Input: [(5, 10, 'A'), (1, 7, 'B'), (2, 3, 'C')]
Expected Output: [(1, 10, 'B C A')]
Explanation: After sorting, all three intervals are connected by overlap, so they become one merged interval.
Input: []
Expected Output: []
Explanation: An empty input produces an empty output.
Input: [(2, 6, 'Only')]
Expected Output: [(2, 6, 'Only')]
Explanation: A single interval is already merged.
Hints
- Sorting by start time makes it possible to process intervals from left to right.
- Keep track of the current merged interval's end time; a new interval overlaps only if its start is strictly less than that end.
Part 2: Detect Overlaps and Coverage Gaps
You are given an unsorted list of debug log intervals. Each interval is a tuple (start, end, text), representing the half-open time range [start, end). Ignore the text content for this task. Return two booleans: whether any pair of intervals overlaps, and whether there is any uncovered gap between the earliest start time and the latest end time. Because intervals are half-open, [1, 4) and [4, 6) are adjacent but neither overlap nor create a gap. For an empty list, return (False, False).
Constraints
- 0 <= n <= 200000, where n is the number of intervals
- 0 <= start < end <= 10^9
- Each interval is half-open: [start, end)
- The text field is present in the input but does not affect the answer
Examples
Input: [(0, 5, 'A'), (3, 10, 'B'), (12, 15, 'C')]
Expected Output: (True, True)
Explanation: There is overlap between [0, 5) and [3, 10), and there is a gap between 10 and 12.
Input: [(0, 2, 'A'), (2, 5, 'B')]
Expected Output: (False, False)
Explanation: The intervals are adjacent. They do not overlap, and there is no uncovered gap.
Input: [(0, 10, 'A'), (2, 3, 'B'), (9, 12, 'C')]
Expected Output: (True, False)
Explanation: There is overlap, but no gap because the union covers the whole span from 0 to 12.
Input: [(5, 7, 'A'), (1, 3, 'B')]
Expected Output: (False, True)
Explanation: The intervals are disjoint, so there is a gap between 3 and 5.
Input: []
Expected Output: (False, False)
Explanation: With no intervals, there are no overlaps and no gaps to report.
Hints
- Sort by start time, then scan from left to right while tracking the furthest end seen so far.
- A new interval causes an overlap if it starts before the current covered end, and causes a gap if it starts after the current covered end.