Find longest substring with at most k distinct
Company: Snowflake
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Onsite
Quick Answer: This question evaluates a candidate's ability to design efficient string-processing algorithms using sliding-window techniques, reason about time and space complexity, and correctly handle Unicode grapheme clusters when returning substrings.
Constraints
- 0 <= k <= length of s
- 0 <= length of s <= 10^5
- s consists of arbitrary characters (treat each code unit as one character for the core problem)
- If k <= 0, the answer is 0
Examples
Input: ("eceba", 2)
Expected Output: 3
Explanation: The longest substring with at most 2 distinct characters is "ece" (length 3).
Input: ("aa", 1)
Expected Output: 2
Explanation: With k=1, the whole string "aa" qualifies (one distinct character).
Input: ("abcabcabc", 3)
Expected Output: 9
Explanation: All 3 distinct characters are allowed, so the entire string qualifies.
Input: ("abcabcabc", 2)
Expected Output: 2
Explanation: Any window of length 3 here has 3 distinct characters, so the max valid window is length 2.
Input: ("", 5)
Expected Output: 0
Explanation: An empty string has no substring; the answer is 0.
Input: ("a", 0)
Expected Output: 0
Explanation: With k=0 no character is allowed, so the longest valid substring has length 0.
Input: ("aabbcc", 1)
Expected Output: 2
Explanation: With k=1 the best is a run of two identical characters, e.g. "aa" (length 2).
Input: ("aaabbbcccddd", 2)
Expected Output: 6
Explanation: A window spanning two adjacent runs, e.g. "aaabbb", has 2 distinct characters and length 6.
Input: ("x", 3)
Expected Output: 1
Explanation: A single-character string yields length 1 when k allows at least 1 distinct character.
Hints
- Use two pointers (left and right) to define a sliding window, and a hash map to count occurrences of each character inside the window.
- Expand by moving `right` one step at a time. Whenever the count of distinct characters exceeds k, move `left` forward — decrementing counts — until the window again has at most k distinct characters.
- Track the best window length after each expansion. Because every character is added and removed at most once, the total work is O(n).