Determine minimal substitutions and removals
Company: Salesforce
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Take-home Project
##### Question
Given a list of words, determine the minimum number of character substitutions required so that, in every word, no two adjacent characters are identical.
Given an array of n unique integers, determine the minimum number of elements that must be removed so the array becomes "almost sorted"—meaning that after the removals, deleting at most one additional element would make the remaining array strictly ascending.
Quick Answer: This question evaluates string manipulation and adjacency-constraint handling in the first task and sequence analysis with minimal-removal and subsequence reasoning in the second, targeting competencies such as local transformation logic, subsequence recognition, and optimization under input constraints.
Given an array of n unique integers, return the minimum number of elements to remove so that the remaining array is almost sorted. An array is almost sorted if deleting at most one additional element from it would make it strictly increasing. Removals must preserve the original relative order (i.e., the remaining elements form a subsequence of the original).
Constraints
- 0 <= n <= 200000
- All nums[i] are pairwise distinct
- -10^9 <= nums[i] <= 10^9
- Aim for O(n log n) time and O(n) extra space
Solution
from bisect import bisect_left
from typing import List
def min_removals_almost_sorted(nums: List[int]) -> int:
# Compute length of LIS (strictly increasing) in O(n log n)
tails: List[int] = []
for x in nums:
i = bisect_left(tails, x)
if i == len(tails):
tails.append(x)
else:
tails[i] = x
L = len(tails)
n = len(nums)
# Max length of an almost-sorted subsequence is min(n, L + 1)
keep = min(n, L + 1)
return n - keep
Explanation
Let L be the LIS length of the original array. If a subsequence can be made strictly increasing by removing at most one element, then after that removal it is a strictly increasing subsequence of the original and thus has length at most L. Therefore the longest almost-sorted subsequence has length at most L+1. Conversely, you can take any LIS (length L) and add one extra element from the array (maintaining order) to obtain an almost-sorted subsequence of length L+1, since removing that extra element restores the LIS. Hence the minimum removals is n - min(n, L+1). We compute L using the patience sorting technique with binary search in O(n log n).
Time complexity: O(n log n). Space complexity: O(n).