Solve two-pointer, sliding-window, and string tasks
Company: Microsoft
Role: Data Scientist
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Technical Screen
Quick Answer: This multi-part question evaluates algorithmic problem-solving skills across two-pointer in-place array manipulation, sliding-window substring optimization, and single-pass string sanitization with attention to Unicode, grapheme clusters, and space-efficient in-place transformations.
Part 1: Two-Pointer In-Place De-duplication
Constraints
- 0 <= len(nums) <= 200000
- -10^9 <= nums[i] <= 10^9
- nums is sorted in non-decreasing order
- 1 <= k <= 200000
Examples
Input: ([], 1)
Expected Output: (0, [])
Explanation: Edge case: empty input stays empty.
Input: ([2, 2, 2], 1)
Expected Output: (1, [2])
Explanation: Only one copy of 2 may remain.
Input: ([1, 1, 1, 2, 2, 3], 2)
Expected Output: (5, [1, 1, 2, 2, 3])
Explanation: Each distinct value appears at most twice.
Input: ([1, 2, 3], 5)
Expected Output: (3, [1, 2, 3])
Explanation: If k is at least the array length, nothing is removed.
Input: ([0, 0, 0, 0, 1, 1, 1, 2], 2)
Expected Output: (5, [0, 0, 1, 1, 2])
Explanation: Runs longer than k are trimmed in-place.
Solution
def solution(nums, k):
if k <= 0:
return 0, []
write = 0
for read in range(len(nums)):
x = nums[read]
if write < k or nums[write - k] != x:
nums[write] = x
write += 1
return write, nums[:write]
Time complexity: O(n). Space complexity: O(1) auxiliary space, excluding the returned output slice used for testing.
Hints
- Use one pointer to read the original array and another pointer to write the kept values.
- Once you have already written k elements, compare the current value with the element k positions behind the write pointer.
Part 2: Sliding-Window Minimum Cover Substring
Constraints
- 0 <= len(s), len(t) <= 200000
- Characters may repeat in t and must be matched with multiplicity
- Expected solution should run in O(len(s) + len(t)) time
Examples
Input: ("ADOBECODEBANC", "ABC")
Expected Output: "BANC"
Explanation: Classic example: BANC is the shortest substring containing A, B, and C.
Input: ("AAABBC", "AABC")
Expected Output: "AABBC"
Explanation: t requires two As, one B, and one C.
Input: ("a", "aa")
Expected Output: ""
Explanation: No window can contain two copies of a.
Input: ("", "A")
Expected Output: ""
Explanation: Edge case: empty source string.
Input: ("é漢字é", "字é")
Expected Output: "字é"
Explanation: Works with Unicode code points as well.
Solution
from collections import Counter
def solution(s, t):
if not t or not s or len(t) > len(s):
return ""
need = Counter(t)
missing = len(t)
left = 0
best_start = 0
best_len = float("inf")
for right, ch in enumerate(s):
if ch in need:
if need[ch] > 0:
missing -= 1
need[ch] -= 1
while missing == 0:
window_len = right - left + 1
if window_len < best_len:
best_len = window_len
best_start = left
left_ch = s[left]
if left_ch in need:
need[left_ch] += 1
if need[left_ch] > 0:
missing += 1
left += 1
if best_len == float("inf"):
return ""
return s[best_start:best_start + best_len]
Time complexity: O(len(s) + len(t)). Space complexity: O(u), where u is the number of distinct characters required by t.
Hints
- Maintain character requirements from t and expand a right pointer until the window becomes valid.
- Once the window is valid, move the left pointer rightward while the window still covers t.
Part 3: String Sanitization with Whitespace Collapsing and Digit-Run Replacement
Constraints
- 0 <= len(s) <= 200000
- Digits to replace are ASCII only: '0' through '9'
- The algorithm should process the string in one left-to-right pass
Examples
Input: ("",)
Expected Output: ""
Explanation: Edge case: empty string stays empty.
Input: ("12345",)
Expected Output: "#"
Explanation: A full digit run becomes a single #.
Input: (" \tfoo 123\n\nbar45 baz\t",)
Expected Output: "foo # bar# baz"
Explanation: Leading/trailing whitespace is trimmed, inner whitespace collapses, and digit runs are replaced.
Input: (" \n\t ",)
Expected Output: ""
Explanation: Only whitespace becomes an empty string after trimming.
Input: ("😀 123\t\t👍42\n",)
Expected Output: "😀 # 👍#"
Explanation: Unicode characters are preserved; only ASCII digit runs and whitespace are transformed.
Solution
def solution(s):
out = []
pending_space = False
in_digit_run = False
for ch in s:
if ch.isspace():
if out:
pending_space = True
in_digit_run = False
continue
if pending_space and out:
out.append(' ')
pending_space = False
if '0' <= ch <= '9':
if not in_digit_run:
out.append('#')
in_digit_run = True
else:
out.append(ch)
in_digit_run = False
return ''.join(out)
Time complexity: O(n). Space complexity: O(n) for the returned string in Python; O(1) auxiliary working space excluding output.
Hints
- Do not emit a space immediately when you see whitespace; delay it until you know another non-whitespace character is coming.
- Track whether you are currently inside a digit run so you only append one '#'.