PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep

Quick Overview

This multi-part question evaluates algorithmic problem-solving skills across two-pointer in-place array manipulation, sliding-window substring optimization, and single-pass string sanitization with attention to Unicode, grapheme clusters, and space-efficient in-place transformations.

  • Medium
  • Microsoft
  • Coding & Algorithms
  • Data Scientist

Solve two-pointer, sliding-window, and string tasks

Company: Microsoft

Role: Data Scientist

Category: Coding & Algorithms

Difficulty: Medium

Interview Round: Technical Screen

Solve the following three coding tasks: 1) Two-pointer in-place de-duplication: Given a non-decreasing integer array nums and an integer k >= 1, modify nums in-place so that each distinct value appears at most k times. Return the new length L and ensure nums[0..L-1] holds the resulting sequence. Requirements: O(n) time, O(1) extra space. Prove correctness (loop invariants) and discuss off-by-one pitfalls. Test cases: [], k=1 -> L=0; [2,2,2], k=1 -> [2]; [1,1,1,2,2,3], k=2 -> prefix [1,1,2,2,3]; k >= n; array with many distinct values. 2) Sliding-window minimum cover: Given strings s and t, return the shortest substring of s that contains every character of t with multiplicity. If multiple, return the leftmost. Achieve O(|s| + |t|) time and O(Σ alphabet) space. Be Unicode-aware by code points; explain how you’d adapt if the alphabet is very large or if you must treat grapheme clusters. Example: s = "ADOBECODEBANC", t = "ABC" -> "BANC". Provide adversarial tests (e.g., repeated chars in t, no solution). 3) Non-LC style string processing: Implement sanitize(s) that (a) trims leading/trailing whitespace, (b) collapses any run of whitespace (spaces/tabs/newlines) into a single space, and (c) replaces every maximal run of ASCII digits [0-9]+ with a single token "#" while leaving all other characters unchanged. Do it in one pass O(n) and O(1) extra space (in-place if the language allows). Discuss pitfalls with multi-byte Unicode, combining marks, and surrogate pairs. Include unit tests that cover edge cases (empty string, only digits, mixed emoji, long whitespace runs).

Quick Answer: This multi-part question evaluates algorithmic problem-solving skills across two-pointer in-place array manipulation, sliding-window substring optimization, and single-pass string sanitization with attention to Unicode, grapheme clusters, and space-efficient in-place transformations.

Part 1: Two-Pointer In-Place De-duplication

You are given a non-decreasing integer array nums and an integer k >= 1. Rewrite the array in-place so that each distinct value appears at most k times. Keep the relative order of the remaining elements the same. For easier automated testing, return a tuple (L, nums[:L]) after performing the in-place rewrite, where L is the new logical length.

Constraints

  • 0 <= len(nums) <= 200000
  • -10^9 <= nums[i] <= 10^9
  • nums is sorted in non-decreasing order
  • 1 <= k <= 200000

Examples

Input: ([], 1)

Expected Output: (0, [])

Explanation: Edge case: empty input stays empty.

Input: ([2, 2, 2], 1)

Expected Output: (1, [2])

Explanation: Only one copy of 2 may remain.

Input: ([1, 1, 1, 2, 2, 3], 2)

Expected Output: (5, [1, 1, 2, 2, 3])

Explanation: Each distinct value appears at most twice.

Input: ([1, 2, 3], 5)

Expected Output: (3, [1, 2, 3])

Explanation: If k is at least the array length, nothing is removed.

Input: ([0, 0, 0, 0, 1, 1, 1, 2], 2)

Expected Output: (5, [0, 0, 1, 1, 2])

Explanation: Runs longer than k are trimmed in-place.

Solution

def solution(nums, k):
    if k <= 0:
        return 0, []

    write = 0
    for read in range(len(nums)):
        x = nums[read]
        if write < k or nums[write - k] != x:
            nums[write] = x
            write += 1

    return write, nums[:write]

Time complexity: O(n). Space complexity: O(1) auxiliary space, excluding the returned output slice used for testing.

Hints

  1. Use one pointer to read the original array and another pointer to write the kept values.
  2. Once you have already written k elements, compare the current value with the element k positions behind the write pointer.

Part 2: Sliding-Window Minimum Cover Substring

Given strings s and t, return the shortest substring of s that contains every character of t with multiplicity. If multiple substrings have the same minimum length, return the leftmost one. If no such substring exists, return an empty string. Assume Python string iteration is by Unicode code point.

Constraints

  • 0 <= len(s), len(t) <= 200000
  • Characters may repeat in t and must be matched with multiplicity
  • Expected solution should run in O(len(s) + len(t)) time

Examples

Input: ("ADOBECODEBANC", "ABC")

Expected Output: "BANC"

Explanation: Classic example: BANC is the shortest substring containing A, B, and C.

Input: ("AAABBC", "AABC")

Expected Output: "AABBC"

Explanation: t requires two As, one B, and one C.

Input: ("a", "aa")

Expected Output: ""

Explanation: No window can contain two copies of a.

Input: ("", "A")

Expected Output: ""

Explanation: Edge case: empty source string.

Input: ("é漢字é", "字é")

Expected Output: "字é"

Explanation: Works with Unicode code points as well.

Solution

from collections import Counter

def solution(s, t):
    if not t or not s or len(t) > len(s):
        return ""

    need = Counter(t)
    missing = len(t)
    left = 0
    best_start = 0
    best_len = float("inf")

    for right, ch in enumerate(s):
        if ch in need:
            if need[ch] > 0:
                missing -= 1
            need[ch] -= 1

        while missing == 0:
            window_len = right - left + 1
            if window_len < best_len:
                best_len = window_len
                best_start = left

            left_ch = s[left]
            if left_ch in need:
                need[left_ch] += 1
                if need[left_ch] > 0:
                    missing += 1
            left += 1

    if best_len == float("inf"):
        return ""
    return s[best_start:best_start + best_len]

Time complexity: O(len(s) + len(t)). Space complexity: O(u), where u is the number of distinct characters required by t.

Hints

  1. Maintain character requirements from t and expand a right pointer until the window becomes valid.
  2. Once the window is valid, move the left pointer rightward while the window still covers t.

Part 3: String Sanitization with Whitespace Collapsing and Digit-Run Replacement

Implement sanitize(s) in one pass. It must: (1) trim leading and trailing whitespace, (2) collapse every maximal run of whitespace into a single space, and (3) replace every maximal run of ASCII digits [0-9]+ with a single '#'. All other characters must remain unchanged. For this problem, whitespace means any character c such that c.isspace() is True in Python.

Constraints

  • 0 <= len(s) <= 200000
  • Digits to replace are ASCII only: '0' through '9'
  • The algorithm should process the string in one left-to-right pass

Examples

Input: ("",)

Expected Output: ""

Explanation: Edge case: empty string stays empty.

Input: ("12345",)

Expected Output: "#"

Explanation: A full digit run becomes a single #.

Input: (" \tfoo 123\n\nbar45 baz\t",)

Expected Output: "foo # bar# baz"

Explanation: Leading/trailing whitespace is trimmed, inner whitespace collapses, and digit runs are replaced.

Input: (" \n\t ",)

Expected Output: ""

Explanation: Only whitespace becomes an empty string after trimming.

Input: ("😀 123\t\t👍42\n",)

Expected Output: "😀 # 👍#"

Explanation: Unicode characters are preserved; only ASCII digit runs and whitespace are transformed.

Solution

def solution(s):
    out = []
    pending_space = False
    in_digit_run = False

    for ch in s:
        if ch.isspace():
            if out:
                pending_space = True
            in_digit_run = False
            continue

        if pending_space and out:
            out.append(' ')
            pending_space = False

        if '0' <= ch <= '9':
            if not in_digit_run:
                out.append('#')
                in_digit_run = True
        else:
            out.append(ch)
            in_digit_run = False

    return ''.join(out)

Time complexity: O(n). Space complexity: O(n) for the returned string in Python; O(1) auxiliary working space excluding output.

Hints

  1. Do not emit a space immediately when you see whitespace; delay it until you know another non-whitespace character is coming.
  2. Track whether you are currently inside a digit run so you only append one '#'.
Last updated: Apr 28, 2026

Loading coding console...

PracHub

Master your tech interviews with 8,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Return Top K Open Businesses - Microsoft (hard)
  • Implement Memory Allocation and In-Memory Records - Microsoft (medium)
  • Implement K-Means and Detect Divisible Subarrays - Microsoft (medium)
  • Sort Three Categories In Place - Microsoft (medium)
  • Retain Top K Elements - Microsoft (medium)