Select next Hangman letter
Company: Moveworks
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Technical Screen
Quick Answer: This question evaluates algorithm design and string-processing skills, specifically pattern filtering, frequency analysis across candidate words, set operations for excluding guessed letters, deterministic tie-breaking, and complexity reasoning.
Constraints
- 0 <= len(pattern) <= 30
- 0 <= len(dictionary) <= 50000
- Each dictionary word has length at most 30
- pattern contains only lowercase English letters and '_'
- guessed contains only lowercase English letters
- dictionary words contain only lowercase English letters
Examples
Input: ('_pp_e', {'p', 'e'}, ['apple', 'apply', 'ample', 'spore', 'angle'])
Expected Output: 'a'
Explanation: Only 'apple' matches the pattern. The remaining unguessed letters are 'a' and 'l', each with frequency 1, so the alphabetically smaller 'a' is chosen.
Input: ('_a__', {'a', 'e', 'r'}, ['ball', 'mass', 'taco', 'lava', 'warm', 'ally', 'beta'])
Expected Output: 'l'
Explanation: The candidates are 'ball', 'mass', and 'taco'. Words with hidden 'a', 'e', or 'r' in unknown positions are excluded. Frequencies include l=2 and s=2, so 'l' wins by alphabetical tie-break.
Input: ('__', [], ['ab', 'ba'])
Expected Output: 'a'
Explanation: Both words are candidates. 'a' and 'b' each occur twice, so the alphabetically smaller letter 'a' is returned.
Input: ('a__a', {'a'}, ['alga', 'anna', 'area', 'abca', 'aaaa'])
Expected Output: 'n'
Explanation: 'aaaa' is invalid because already-guessed 'a' appears in unknown positions. Among the valid candidates, 'n' appears twice in 'anna', more than any other unguessed letter.
Input: ('a__', {'a', 'b'}, ['bbb', 'ccc', 'def'])
Expected Output: ''
Explanation: No dictionary word is consistent with the revealed leading 'a', so there is no useful guess.
Input: ('cat', {'c', 'a', 't'}, ['cat', 'car', 'cut'])
Expected Output: ''
Explanation: The word is already fully revealed. Even though 'cat' is a candidate, no unguessed letter appears in it.
Input: ('_____', {'e', 's'}, ['apple', 'berry', 'civic', 'cocoa', 'daddy', 'tests'])
Expected Output: 'c'
Explanation: Words containing already-guessed absent letters 'e' or 's' are excluded. The candidates are 'civic', 'cocoa', and 'daddy'. Letter 'c' appears 4 times, the highest aggregate frequency.
Hints
- When a pattern position is unknown, it cannot contain any letter that has already been guessed; otherwise that letter would either be revealed or known absent.
- You do not need to store all candidate words. Filter each word and update a 26-element frequency table as you go.