Solve String Arrays and Row Deduplication
Company: Netflix
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Technical Screen
Quick Answer: This multi-part question evaluates array and string manipulation skills, deduplication logic and pairwise character comparison, as well as the ability to reason about algorithmic complexity and data-structure choices; it belongs to the Coding & Algorithms domain.
Part 1: Longest Run of Equal Strings
Constraints
- 0 <= len(arr) <= 200000
- Each element of `arr` is a string and may be empty
- Total number of characters across all strings is at most 1000000
Examples
Input: ['a', 'a', 'b', 'b', 'b', 'a']
Expected Output: ['b', 'b', 'b']
Explanation: The longest contiguous block of equal strings is the run of three `'b'` values.
Input: ['x', 'y', 'y', 'z', 'z']
Expected Output: ['y', 'y']
Explanation: There are two longest runs of length 2: `['y', 'y']` and `['z', 'z']`. Return the earliest one.
Input: ['solo']
Expected Output: ['solo']
Explanation: A single-element array has a longest run of length 1.
Input: []
Expected Output: []
Explanation: Empty input should return an empty list.
Hints
- Scan from left to right and keep track of where the current run started.
- Whenever the value changes, compare the finished run against the best run seen so far.
Part 2: Longest Subarray Without Duplicate Strings
Constraints
- 0 <= len(arr) <= 200000
- Each element of `arr` is a string and may be empty
- Total number of characters across all strings is at most 1000000
Examples
Input: ['a', 'b', 'c', 'a', 'd']
Expected Output: ['b', 'c', 'a', 'd']
Explanation: The longest window with all unique strings is from index 1 to 4.
Input: ['x', 'y', 'z']
Expected Output: ['x', 'y', 'z']
Explanation: All strings are already distinct, so the whole array is the answer.
Input: ['x', 'x', 'x']
Expected Output: ['x']
Explanation: Every valid subarray has length 1. Return the earliest one.
Input: []
Expected Output: []
Explanation: Empty input has no subarray.
Hints
- A sliding window works well when you need a longest contiguous segment with a constraint.
- Store the last seen index of each string so you can move the left side of the window efficiently.
Part 3: Find String Pairs With No Shared Characters
Constraints
- 0 <= len(words) <= 2000
- 0 <= len(words[i]) <= 1000
- The sum of all word lengths is at most 200000
Examples
Input: ['a', 'ab', 'b']
Expected Output: [[0, 2]]
Explanation: `'a'` and `'b'` share no character, but `'ab'` overlaps with both.
Input: ['abc', 'de', 'fa']
Expected Output: [[0, 1], [1, 2]]
Explanation: `'abc'` and `'de'` are disjoint, `'de'` and `'fa'` are disjoint, but `'abc'` and `'fa'` both contain `'a'`.
Input: ['', 'ab', 'c']
Expected Output: [[0, 1], [0, 2], [1, 2]]
Explanation: The empty string shares no characters with any word, and `'ab'` and `'c'` are also disjoint.
Input: []
Expected Output: []
Explanation: No words means no valid pairs.
Hints
- Precompute a character set for each word so you do not rebuild it for every pair.
- Two words are valid together exactly when their character sets are disjoint.
Part 4: Deduplicate Items in Content Rows
Constraints
- 0 <= len(rows) <= 10000
- len(rows) == len(thresholds) == len(exempt_rows)
- The total number of IDs across all rows is at most 200000
Examples
Input: ([[1, 2, 1, 3, 2, 4], [5, 5, 6], [7, 8, 7]], [3, -1, 2], 2, [False, False, False])
Expected Output: [[1, 2, 3, 2, 4], [5, 6], [7, 8, 7]]
Explanation: Row 0 deduplicates only until 3 unique items are produced, row 1 is fully deduplicated because its threshold is negative, and row 2 is unchanged because it is beyond the first 2 rows.
Input: ([[1, 1, 2], [3, 3, 4, 4]], [10, 10], 2, [True, False])
Expected Output: [[1, 1, 2], [3, 4]]
Explanation: The first row is exempt, so it stays unchanged. The second row is fully deduplicated.
Input: ([[1, 2, 1, 2]], [0], 1, [False])
Expected Output: [[1, 2, 1, 2]]
Explanation: A threshold of 0 means the row may be left entirely as-is.
Input: ([[], [9, 9, 9]], [-1, -1], -1, [False, False])
Expected Output: [[], [9]]
Explanation: A negative `first_n` means all rows are eligible. The empty row stays empty, and the second row is fully deduplicated.
Hints
- Process each row independently with a `seen` set while deduplication is active.
- Before touching a row, first decide whether it is eligible based on `first_n` and the exemption flag.