Phone Keypad Predictive Text Suggestions
Company: xAI
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: easy
Interview Round: Onsite
# Phone Keypad Predictive Text Suggestions
You are building the predictive-text feature for a classic 9-key phone keypad. Each digit from 2 to 9 maps to a group of lowercase letters:
| Digit | Letters |
|-------|---------|
| 2 | a b c |
| 3 | d e f |
| 4 | g h i |
| 5 | j k l |
| 6 | m n o |
| 7 | p q r s |
| 8 | t u v |
| 9 | w x y z |
Digits `0` and `1` map to no letters.
A word's **digit encoding** is obtained by replacing each of its letters with the digit whose group contains that letter. For example, the word `"cat"` encodes to `"228"` and `"tree"` encodes to `"8733"`.
You are given a dictionary of distinct lowercase words and a list of queries. Each query is a non-empty string of digits (`'0'`–`'9'`) representing the keys a user has pressed so far. For each query, return **all dictionary words whose digit encoding starts with the query string** (i.e., the query is a prefix of the word's digit encoding), sorted in ascending lexicographic order. If no word matches, return an empty list for that query.
Implement:
```
suggestions(words: List[str], queries: List[str]) -> List[List[str]]
```
where the i-th element of the result is the sorted list of suggestions for `queries[i]`.
**Examples**
```
words = ["cat", "bat", "act", "car", "care", "dog"]
queries = ["228", "22", "2273", "364", "9"]
"228" -> ["act", "bat", "cat"] (all three encode exactly to "228")
"22" -> ["act", "bat", "car", "care", "cat"] ("22" is a prefix of "228", "227", "2273")
"2273" -> ["care"] ("care" encodes to "2273")
"364" -> ["dog"]
"9" -> []
```
Note that a query containing the digit `0` or `1` can never match any word, since no letter encodes to those digits.
**Constraints**
- `1 <= len(words) <= 10^5`
- `1 <= len(words[i]) <= 20`; words consist of lowercase English letters and are distinct
- `1 <= len(queries) <= 10^4`
- `1 <= len(queries[i]) <= 20`; queries consist of digit characters `'0'`–`'9'`
- The total number of words returned across all queries fits in memory (at most `10^6`)
Aim for a solution that preprocesses the dictionary once so that each query is answered without rescanning every word.
Quick Answer: This question evaluates understanding of string-to-digit encoding, prefix-based retrieval, and efficient preprocessing for fast query-time lookup in large dictionaries.
You are building the predictive-text feature for a classic 9-key phone keypad. Each digit from 2 to 9 maps to a group of lowercase letters:
| Digit | Letters |
|-------|---------|
| 2 | a b c |
| 3 | d e f |
| 4 | g h i |
| 5 | j k l |
| 6 | m n o |
| 7 | p q r s |
| 8 | t u v |
| 9 | w x y z |
Digits `0` and `1` map to no letters.
A word's **digit encoding** is obtained by replacing each of its letters with the digit whose group contains that letter. For example, `"cat"` encodes to `"228"` and `"tree"` encodes to `"8733"`.
You are given a dictionary of distinct lowercase words and a list of queries. Each query is a non-empty string of digits (`'0'`-`'9'`) representing the keys pressed so far. For each query, return **all dictionary words whose digit encoding starts with the query string** (the query is a prefix of the word's digit encoding), sorted in ascending lexicographic order. If no word matches, return an empty list for that query.
Implement `solution(words, queries)` returning a list of lists, where the i-th element is the sorted list of suggestions for `queries[i]`.
Note that a query containing the digit `0` or `1` can never match any word, since no letter encodes to those digits.
**Example**
```
words = ["cat", "bat", "act", "car", "care", "dog"]
queries = ["228", "22", "2273", "364", "9"]
"228" -> ["act", "bat", "cat"]
"22" -> ["act", "bat", "car", "care", "cat"]
"2273" -> ["care"]
"364" -> ["dog"]
"9" -> []
```
**Constraints**
- `1 <= len(words) <= 10^5`
- `1 <= len(words[i]) <= 20`; words are lowercase and distinct
- `1 <= len(queries) <= 10^4`
- `1 <= len(queries[i]) <= 20`; queries consist of digit characters `'0'`-`'9'`
- Total words returned across all queries fits in memory (at most `10^6`)
Aim to preprocess the dictionary once so each query is answered without rescanning every word.
Constraints
- 1 <= len(words) <= 10^5
- 1 <= len(words[i]) <= 20; words are lowercase English letters and distinct
- 1 <= len(queries) <= 10^4
- 1 <= len(queries[i]) <= 20; queries consist of digit characters '0'-'9'
- Total number of words returned across all queries is at most 10^6
Examples
Input: (["cat", "bat", "act", "car", "care", "dog"], ["228", "22", "2273", "364", "9"])
Expected Output: [['act', 'bat', 'cat'], ['act', 'bat', 'car', 'care', 'cat'], ['care'], ['dog'], []]
Explanation: cat/bat/act all encode to 228; car->227, care->2273, dog->364. '22' is a prefix of 228, 227 and 2273 so it matches all five of those words; '9' matches nothing.
Input: (["hi"], ["44"])
Expected Output: [['hi']]
Explanation: 'hi' encodes to '44' (h and i both map to digit 4), which exactly matches the query.
Input: (["cat"], ["0"])
Expected Output: [[]]
Explanation: The query '0' can never be a prefix of any encoding because no letter maps to digit 0.
Input: (["a"], ["22"])
Expected Output: [[]]
Explanation: 'a' encodes to '2' (length 1); the query '22' is longer than the encoding, so it is not a prefix and there is no match.
Input: (["zoo", "zap"], ["9", "92"])
Expected Output: [['zap', 'zoo'], ['zap']]
Explanation: zoo->966, zap->927. Both start with '9', sorted lexicographically as zap < zoo. Only zap starts with '92'.
Input: (["tree"], ["8733", "873", "8"])
Expected Output: [['tree'], ['tree'], ['tree']]
Explanation: 'tree' encodes to '8733'; each of '8', '873', and '8733' is a prefix of it, so 'tree' is suggested for all three queries.
Hints
- Precompute each word's digit encoding once using a fixed letter->digit map, then compare each query against those encodings.
- A match for query q means q is a prefix of the word's encoding: use string startswith / prefix comparison.
- Any query containing '0' or '1' matches nothing, since those digits have no letters.
- Sort each query's matched words lexicographically before returning. For large inputs, a trie keyed on digits (with words stored at reachable nodes) lets you answer each query by walking the query and collecting the subtree.