How do I practice coding and algorithm questions?

Use PracHub's coding console to write, test, and debug your solutions in Python or JavaScript. View hints, test against sample inputs, and compare with official solutions.

What difficulty level is this coding question?

This is a Medium difficulty Coding & Algorithms question, commonly asked during Onsite rounds at Palo Alto Networks.

What role is this question designed for?

This question is commonly asked for Software Engineer candidates at Palo Alto Networks during technical interviews.

Find normalized list difference efficiently | Palo Alto Networks Coding Question

Find normalized list difference efficiently

Company: Palo Alto Networks

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: Medium

Interview Round: Onsite

Given two lists of strings, list1 and list2, return all elements that appear in list1 but not in list2 after applying a consistent normalization step (e.g., lowercase and trimming). Describe and implement an efficient algorithm, including the data structures you would use, handling of duplicates, and the time and space complexities.

Quick Answer: This question evaluates competency in computing normalized list differences, including string normalization, collection comparison, duplicate handling, and time and space complexity reasoning.

You are given two lists of strings, list1 and list2. Normalize every string by removing leading and trailing whitespace and converting it to lowercase. Return all normalized strings that appear in list1 but not in list2. Rules for duplicates: - If the same normalized string appears multiple times in list1, include it only once in the output. - Preserve the order of first appearance from list1. - Duplicates in list2 do not matter. For example, " Apple ", "apple", and "APPLE" all normalize to "apple".

Constraints

0 <= len(list1), len(list2) <= 100000
0 <= len(s) <= 1000 for every string s in list1 and list2
Normalization is exactly: s.strip().lower()

Examples

Input: ([' Apple ', 'banana', 'BANANA', 'Cherry'], ['apple', ' Durian '])

Expected Output: ['banana', 'cherry']

Explanation: After normalization, list1 becomes ['apple', 'banana', 'banana', 'cherry'] and list2 becomes {'apple', 'durian'}. The values in list1 but not list2 are 'banana' and 'cherry', with duplicates removed.

Input: ([' one', 'Two', ' two ', 'THREE', 'three ', 'Four'], ['TWO'])

Expected Output: ['one', 'three', 'four']

Explanation: The normalized value 'two' is excluded because it appears in list2. 'three' appears twice in list1 after normalization but is included only once.

Input: ([], ['a', 'b'])

Expected Output: []

Explanation: An empty list1 produces an empty result.

Input: (['x', ' X ', 'y', 'Z'], ['z', ' z ', 'Q'])

Expected Output: ['x', 'y']

Explanation: After normalization, list2 contains 'z' and 'q'. 'x' is included once despite appearing twice in list1 after normalization, 'y' is included, and 'z' is excluded.

Input: ([' Mango', 'mango ', 'Peach'], [])

Expected Output: ['mango', 'peach']

Explanation: With an empty list2, every unique normalized value from list1 is returned in order.

Solution

def solution(list1, list2):
    def normalize(s):
        return s.strip().lower()

    excluded = {normalize(s) for s in list2}
    seen = set()
    result = []

    for s in list1:
        normalized = normalize(s)
        if normalized not in excluded and normalized not in seen:
            seen.add(normalized)
            result.append(normalized)

    return result

Time complexity: O(total characters in list1 and list2). Space complexity: O(u + v), where u is the number of unique normalized strings in list2 and v is the number of unique strings added to the result.

Hints

Normalize all strings in list2 first and store them in a hash set for fast membership checks.
Use another set to track which normalized strings from list1 have already been added to the result so you can avoid duplicates while preserving order.

Quick Overview

This question evaluates competency in computing normalized list differences, including string normalization, collection comparison, duplicate handling, and time and space complexity reasoning.