Find Words Containing Other Words
Company: Meta
Role: Machine Learning Engineer
Category: Coding & Algorithms
Difficulty: hard
Interview Round: Onsite
Quick Answer: This question evaluates string-processing skills, knowledge of substring matching techniques and supporting data structures, and the ability to analyze and improve time and space complexity.
Constraints
- 0 <= len(words) <= 20000
- 1 <= len(words[i]) <= 200
- All strings consist only of lowercase English letters
- The sum of lengths of all strings is at most 200000
Examples
Input: (['category', 'cat', 'dog'],)
Expected Output: ['category']
Explanation: `category` contains `cat`, while `cat` and `dog` do not contain any other listed word.
Input: (['mass', 'as', 'hero', 'superhero', 'her'],)
Expected Output: ['mass', 'hero', 'superhero']
Explanation: `mass` contains `as`, `hero` contains `her`, and `superhero` contains both `hero` and `her`.
Input: (['cat', 'cat', 'dog'],)
Expected Output: ['cat']
Explanation: The second `cat` counts as another occurrence, so `cat` qualifies, but it should appear only once in the output.
Input: ([],)
Expected Output: []
Explanation: An empty input list produces an empty output list.
Input: (['alone'],)
Expected Output: []
Explanation: A single occurrence of a word does not count as containing itself.
Input: (['abc', 'bc', 'c', 'abc'],)
Expected Output: ['abc', 'bc']
Explanation: `abc` contains `bc` and `c`, `bc` contains `c`, and the duplicate `abc` does not create another output entry.
Hints
- The slow part of the baseline is re-checking all candidate substrings for every word. Can you preprocess the full dictionary once, then scan each word in a single pass?
- A trie with failure links can match many patterns inside one text efficiently.