Validate normalized palindromes with variants
Company: xAI
Role: Machine Learning Engineer
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Technical Screen
Quick Answer: This question evaluates string-processing skills including Unicode normalization, case-folding and diacritics handling, streaming input processing, error-tolerant palindrome logic, and analysis of time/space trade-offs.
Part 1: Basic normalized palindrome
Constraints
- 0 <= len(s) <= 200000
- s may contain letters, digits, spaces, punctuation, and Unicode characters
- Use Unicode-aware alphanumeric checks and case folding
Examples
Input: ("A man, a plan, a canal: Panama!",)
Expected Output: True
Explanation: Normalization gives 'amanaplanacanalpanama', which is a palindrome.
Input: ("race a car",)
Expected Output: False
Explanation: Normalization gives 'raceacar', which is not a palindrome.
Input: ("",)
Expected Output: True
Explanation: The empty string stays empty and counts as a palindrome.
Input: ("0P",)
Expected Output: False
Explanation: Normalization gives '0p', which is not the same forward and backward.
Solution
def solution(s):
normalized = []
for ch in s:
if ch.isalnum():
for fch in ch.casefold():
if fch.isalnum():
normalized.append(fch)
return normalized == normalized[::-1]
Time complexity: O(n). Space complexity: O(n).
Hints
- Build the normalized sequence first: keep only alphanumeric characters and apply case folding.
- After normalization, palindrome checking is just a compare against the reversed sequence.
Part 2: Unicode-normalized palindrome with diacritics stripping and locale rules
Constraints
- 0 <= len(s) <= 100000
- locale_code is one of 'default', 'tr', or 'az'
- Use NFKD normalization and strip combining marks with Unicode-aware logic
Examples
Input: ("Noël, Léon", "default")
Expected Output: True
Explanation: After case folding, NFKD normalization, and diacritics stripping, the string becomes 'noelleon', which is a palindrome.
Input: ("I, ı", "tr")
Expected Output: True
Explanation: In Turkish, 'I' maps to 'ı', so the normalized form is 'ıı'.
Input: ("I, ı", "default")
Expected Output: False
Explanation: Without Turkish rules, the normalized form is effectively 'iı', which is not a palindrome.
Input: ("Résumé", "default")
Expected Output: False
Explanation: The transformed form is 'resume', which is not a palindrome.
Input: ("!!!", "default")
Expected Output: True
Explanation: No alphanumeric characters remain, so the normalized string is empty.
Solution
import unicodedata
def solution(s, locale_code):
if locale_code in ('tr', 'az'):
mapping = {'I': 'ı', 'İ': 'i'}
s = ''.join(mapping.get(ch, ch) for ch in s)
folded = s.casefold()
decomposed = unicodedata.normalize('NFKD', folded)
normalized = []
for ch in decomposed:
if unicodedata.combining(ch):
continue
if ch.isalnum():
normalized.append(ch)
return normalized == normalized[::-1]
Time complexity: O(n). Space complexity: O(n).
Hints
- The Turkish/Azeri dotted-I rule must be handled before generic case folding, otherwise 'I' and 'İ' lose their distinction.
- Use unicodedata.normalize('NFKD', ...) and skip characters where unicodedata.combining(ch) is nonzero.
Part 3: Streaming normalized palindrome with one pass and O(1) extra space
Constraints
- The total number of characters across all chunks can be very large
- Each chunk must be processed in order and should not require storing the full normalized string
- Use O(1) extra space beyond a constant number of integers
Examples
Input: (["A man, ", "a plan,", " a canal: Panama"],)
Expected Output: True
Explanation: The normalized stream is 'amanaplanacanalpanama'.
Input: (["race", " a car"],)
Expected Output: False
Explanation: The normalized stream is 'raceacar', which is not a palindrome.
Input: (["!!!", "###"],)
Expected Output: True
Explanation: No alphanumeric characters remain after normalization.
Input: (["0", "P"],)
Expected Output: False
Explanation: The normalized stream is '0p'.
Input: (["ab", "", "ba"],)
Expected Output: True
Explanation: Empty chunks should be handled naturally; the normalized stream is 'abba'.
Solution
def solution(chunks):
base = 911382323
mod1 = 1000000007
mod2 = 1000000009
forward1 = 0
forward2 = 0
reverse1 = 0
reverse2 = 0
power1 = 1
power2 = 1
for chunk in chunks:
for ch in chunk:
if not ch.isalnum():
continue
for fch in ch.casefold():
if not fch.isalnum():
continue
x = ord(fch) + 1
forward1 = (forward1 * base + x) % mod1
forward2 = (forward2 * base + x) % mod2
reverse1 = (reverse1 + x * power1) % mod1
reverse2 = (reverse2 + x * power2) % mod2
power1 = (power1 * base) % mod1
power2 = (power2 * base) % mod2
return forward1 == reverse1 and forward2 == reverse2
Time complexity: O(n). Space complexity: O(1).
Hints
- Maintain both a forward polynomial hash and a reverse-position polynomial hash as characters arrive.
- Use two different moduli to make collisions extremely unlikely.
Part 4: K-deletion normalized palindrome
Constraints
- 0 <= len(s) <= 2000 after normalization
- 0 <= k <= 2000
- An O(n^2) dynamic programming solution is expected
Examples
Input: ("abca", 1)
Expected Output: True
Explanation: Deleting 'b' or 'c' produces a palindrome.
Input: ("abc", 1)
Expected Output: False
Explanation: At least two deletions are needed.
Input: ("A man, a plan, a canal: Panama", 0)
Expected Output: True
Explanation: It is already a palindrome after normalization.
Input: ("deeee", 1)
Expected Output: True
Explanation: Deleting 'd' leaves 'eeee', which is a palindrome.
Input: ("!!!", 2)
Expected Output: True
Explanation: The normalized string is empty.
Solution
def solution(s, k):
normalized = []
for ch in s:
if ch.isalnum():
for fch in ch.casefold():
if fch.isalnum():
normalized.append(fch)
n = len(normalized)
if n <= 1:
return True
if k >= n:
return True
dp = [0] * n
for i in range(n - 2, -1, -1):
prev = 0
for j in range(i + 1, n):
temp = dp[j]
if normalized[i] == normalized[j]:
dp[j] = prev
else:
dp[j] = 1 + min(dp[j], dp[j - 1])
prev = temp
return dp[n - 1] <= k
Time complexity: O(n^2). Space complexity: O(n).
Hints
- Think in terms of the minimum number of deletions needed to make a substring a palindrome.
- Equivalent viewpoint: minimum deletions = length - longest palindromic subsequence.
Part 5: Return the first mismatch index pair
Constraints
- 0 <= len(s) <= 200000
- Indices in the answer refer to positions in the original input string
- Use Unicode-aware alphanumeric checks and case folding
Examples
Input: ("ab!ca",)
Expected Output: (1, 3)
Explanation: Normalized form is 'abca'; the first mismatch is 'b' vs 'c', from original indices 1 and 3.
Input: ("A man, a plan, a canal: Panama!",)
Expected Output: (-1, -1)
Explanation: The normalized string is a palindrome.
Input: ("0P",)
Expected Output: (0, 1)
Explanation: The normalized form is '0p', so the first comparison already mismatches.
Input: ("!!!",)
Expected Output: (-1, -1)
Explanation: There are no alphanumeric characters after normalization.
Input: ("abc!da",)
Expected Output: (1, 4)
Explanation: Normalized form is 'abcda'; after matching 'a' with 'a', the first mismatch is 'b' vs 'd'.
Solution
def solution(s):
normalized = []
original_indices = []
for idx, ch in enumerate(s):
if ch.isalnum():
for fch in ch.casefold():
if fch.isalnum():
normalized.append(fch)
original_indices.append(idx)
left = 0
right = len(normalized) - 1
while left < right:
if normalized[left] != normalized[right]:
return (original_indices[left], original_indices[right])
left += 1
right -= 1
return (-1, -1)
Time complexity: O(n). Space complexity: O(n).
Hints
- Store both the normalized characters and the original index each normalized character came from.
- Then run a normal two-pointer palindrome check on the normalized list and return the mapped original indices at the first mismatch.