Implement Interval Overrides and Top-K Strings
Company: Crusoe
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: easy
Interview Round: Onsite
Quick Answer: This question evaluates interval manipulation and merging for half-open time ranges with overrides, plus frequency-counting and top-k string selection with lexicographic tie-breaking, testing competence in handling ranges, ordering, and aggregated counts.
Part 1: Apply Override Intervals
Constraints
- 0 <= len(usage), len(overrides) <= 2 * 10^5
- Each interval has integer endpoints with start < end
- Usage intervals are pairwise non-overlapping, but may be unsorted
- Override intervals are pairwise non-overlapping, but may be unsorted
- Adjacent output intervals with the same value must be merged
Examples
Input: ([[1, 3, 'normal'], [6, 12, 'normal'], [16, 28, 'normal']], [[2, 8, 'overrideA'], [20, 24, 'overrideB']])
Expected Output: [[1, 2, 'normal'], [2, 3, 'overrideA'], [6, 8, 'overrideA'], [8, 12, 'normal'], [16, 20, 'normal'], [20, 24, 'overrideB'], [24, 28, 'normal']]
Explanation: Each overlapping portion is replaced by the override value, while untouched sections keep the original value.
Input: ([[5, 7, 'a'], [1, 3, 'a'], [3, 5, 'a']], [])
Expected Output: [[1, 7, 'a']]
Explanation: There are no overrides. After sorting, adjacent usage intervals with the same value are merged.
Input: ([[1, 5, 'base']], [[0, 10, 'x']])
Expected Output: [[1, 5, 'x']]
Explanation: The override fully covers the usage interval, so the entire interval takes the override value.
Input: ([[10, 15, 'n'], [1, 4, 'n']], [[12, 14, 'y'], [3, 10, 'x']])
Expected Output: [[1, 3, 'n'], [3, 4, 'x'], [10, 12, 'n'], [12, 14, 'y'], [14, 15, 'n']]
Explanation: The override [3, 10) affects [1, 4) but does not affect time 10 because the intervals are half-open.
Input: ([], [[1, 3, 'x']])
Expected Output: []
Explanation: If there are no usage intervals, the result is empty.
Hints
- Sort both lists by start time, then sweep through them with pointers instead of checking every pair.
- For each usage interval, split it only where an override starts or ends, and merge intervals as you append them.
Part 2: Return the Most Frequent Strings
Constraints
- 0 <= len(words) <= 2 * 10^5
- 0 <= k <= number of unique words
- Each word is a non-empty lowercase English string
- The target time complexity is O(n log k)
Examples
Input: (['dog', 'cat', 'dog', 'bird', 'cat', 'dog', 'ant'], 2)
Expected Output: ['dog', 'cat']
Explanation: dog appears 3 times, cat appears 2 times, and all others appear once.
Input: (['b', 'a', 'c', 'b', 'a'], 2)
Expected Output: ['a', 'b']
Explanation: a and b both appear twice, so the lexicographically smaller word a comes first.
Input: (['solo'], 1)
Expected Output: ['solo']
Explanation: A single word with k = 1 should return that word.
Input: ([], 0)
Expected Output: []
Explanation: An empty input with k = 0 returns an empty list.
Hints
- Count frequencies first, then keep only the best k candidates in a min-heap.
- For heap ordering, the worst item should be removed first: lower frequency is worse, and for equal frequency, lexicographically larger is worse.