Find normalized list difference efficiently
Company: Palo Alto Networks
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Onsite
Quick Answer: This question evaluates competency in computing normalized list differences, including string normalization, collection comparison, duplicate handling, and time and space complexity reasoning.
Constraints
- 0 <= len(list1), len(list2) <= 100000
- 0 <= len(s) <= 1000 for every string s in list1 and list2
- Normalization is exactly: s.strip().lower()
Examples
Input: ([' Apple ', 'banana', 'BANANA', 'Cherry'], ['apple', ' Durian '])
Expected Output: ['banana', 'cherry']
Explanation: After normalization, list1 becomes ['apple', 'banana', 'banana', 'cherry'] and list2 becomes {'apple', 'durian'}. The values in list1 but not list2 are 'banana' and 'cherry', with duplicates removed.
Input: ([' one', 'Two', ' two ', 'THREE', 'three ', 'Four'], ['TWO'])
Expected Output: ['one', 'three', 'four']
Explanation: The normalized value 'two' is excluded because it appears in list2. 'three' appears twice in list1 after normalization but is included only once.
Input: ([], ['a', 'b'])
Expected Output: []
Explanation: An empty list1 produces an empty result.
Input: (['x', ' X ', 'y', 'Z'], ['z', ' z ', 'Q'])
Expected Output: ['x', 'y']
Explanation: After normalization, list2 contains 'z' and 'q'. 'x' is included once despite appearing twice in list1 after normalization, 'y' is included, and 'z' is excluded.
Input: ([' Mango', 'mango ', 'Peach'], [])
Expected Output: ['mango', 'peach']
Explanation: With an empty list2, every unique normalized value from list1 is returned in order.
Solution
def solution(list1, list2):
def normalize(s):
return s.strip().lower()
excluded = {normalize(s) for s in list2}
seen = set()
result = []
for s in list1:
normalized = normalize(s)
if normalized not in excluded and normalized not in seen:
seen.add(normalized)
result.append(normalized)
return resultTime complexity: O(total characters in list1 and list2). Space complexity: O(u + v), where u is the number of unique normalized strings in list2 and v is the number of unique strings added to the result.
Hints
- Normalize all strings in list2 first and store them in a hash set for fast membership checks.
- Use another set to track which normalized strings from list1 have already been added to the result so you can avoid duplicates while preserving order.