Merge overlapping corrections
Company: Grammarly
Role: Software Engineer
Category: Coding & Algorithms
Interview Round: Onsite
Quick Answer: This question evaluates a candidate's ability to manage and manipulate overlapping intervals with multi-criteria tie-breaking, preserve original metadata across splits, and design efficient data structures and algorithms for producing non-overlapping correction spans.
Constraints
- 0 <= len(corrections) <= 2 * 10^5
- 0 <= start <= end <= 10^9
- -10^9 <= priority <= 10^9
- A solution that scans every position in every interval will time out
Examples
Input: ([(1, 10, 1, "UPPERCASE"), (3, 6, 1, "LOWERCASE")],)
Expected Output: [(1, 2, 1, "UPPERCASE"), (3, 6, 1, "LOWERCASE"), (7, 10, 1, "UPPERCASE")]
Explanation: Both corrections have priority 1, but `(3, 6)` has the smaller original span length, so it wins on the overlap. The larger interval survives on both sides.
Input: ([(0, 4, 1, "DELETE"), (2, 3, 5, "UPDATE")],)
Expected Output: [(0, 1, 1, "DELETE"), (2, 3, 5, "UPDATE"), (4, 4, 1, "DELETE")]
Explanation: The `UPDATE` correction has higher priority on positions 2 and 3, so the `DELETE` correction is split into the remaining left and right pieces.
Input: ([(1, 10, 1, "UPPERCASE"), (3, 6, 1, "LOWERCASE"), (5, 8, 1, "DELETE")],)
Expected Output: [(1, 2, 1, "UPPERCASE"), (3, 6, 1, "LOWERCASE"), (7, 8, 1, "DELETE"), (9, 10, 1, "UPPERCASE")]
Explanation: On 5..6, `LOWERCASE` and `DELETE` tie on priority and original length, so `LOWERCASE` wins because it appeared earlier. On 7..8, `DELETE` beats the surviving `UPPERCASE` piece because `UPPERCASE` must still compare using its original length from `(1, 10)`, not the shorter fragment length.
Input: ([(4, 7, 2, "LOWERCASE"), (4, 7, 2, "UPPERCASE"), (5, 5, 3, "DELETE")],)
Expected Output: [(4, 4, 2, "LOWERCASE"), (5, 5, 3, "DELETE"), (6, 7, 2, "LOWERCASE")]
Explanation: At position 5, `DELETE` wins because it has higher priority. Outside that point, the first interval wins the tie against the second because both spans and priorities are identical and input order breaks the tie.
Input: ([],)
Expected Output: []
Explanation: No corrections means there are no output segments.
Input: ([(1, 5, 3, "UPDATE"), (2, 2, 1, "DELETE"), (4, 4, 1, "LOWERCASE")],)
Expected Output: [(1, 5, 3, "UPDATE")]
Explanation: The lower-priority single-point corrections never win, so the stronger `UPDATE` span remains one merged segment rather than being split at internal event boundaries.
Input: ([(1, 2, 1, "UPDATE"), (3, 5, 1, "UPDATE")],)
Expected Output: [(1, 5, 1, "UPDATE")]
Explanation: These intervals do not overlap, but they are adjacent and have identical returned fields, so the minimal output merges them.
Hints
- Convert each inclusive interval `[start, end]` into a half-open interval `[start, end + 1)` and sweep only across event coordinates where something starts or stops.
- Maintain the currently active corrections in a priority queue ordered by `(-priority, original_length, input_index)`. Use lazy removal for intervals that have already ended.