Search Words in a Character Grid
Company: Glean
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: hard
Interview Round: Technical Screen
You are given a 2D grid `board` of lowercase English letters with `m` rows and `n` columns. A **word** can be constructed from letters of sequentially adjacent cells, where *adjacent* cells are horizontally or vertically neighboring (no diagonals). The **same grid cell may not be used more than once** within a single word.
This question has two parts that build on each other: first matching a single word, then efficiently matching a large dictionary of words against the same board.
### Constraints & Assumptions
- `1 <= m, n <= 12` for Part 1; for Part 2 the board can be larger (up to roughly `12 x 12`), and the word list can hold up to `~3 * 10^4` words.
- Each word has length up to `~10` characters and consists of lowercase English letters `a`–`z`.
- The board and words contain only lowercase English letters.
- A cell, once consumed in the current path, cannot be revisited *in that same path*, but is freely available again when you start a new path or backtrack.
- The word list may contain duplicates; the result should not.
### Clarifying Questions to Ask
- Are diagonal moves allowed, or only the 4 orthogonal directions? (Assume orthogonal only.)
- Can a single board cell be reused within one word's path? (No — each cell at most once per word.)
- For Part 2, can the same board cell be reused *across* different words? (Yes — each word is matched independently.)
- Should the returned words preserve input order, dedupe duplicates, and what is expected for the empty grid or empty word list?
- What are the realistic upper bounds on board size, word count, and word length? (This drives whether the brute-force-per-word approach in Part 1 is acceptable at scale.)
### Part 1
Given a single target `word`, determine whether `word` can be formed by a path in the board under the adjacency and no-reuse rules above. Return a boolean.
```hint Where to start
This is a search over paths in a grid. Try a depth-first search (backtracking) launched from every cell whose letter equals `word[0]`, walking to the 4 neighbors that match the next character.
```
```hint Tracking visited cells
You need to avoid reusing a cell within the current path. Either keep a separate `visited` matrix, or mutate the board cell in place to a sentinel (e.g. `#`) before recursing and restore it on backtrack — that avoids extra space.
```
```hint Pruning and complexity
Prune as early as possible: bail the moment a letter mismatches or you step off the board. The worst-case time is roughly $O(m \cdot n \cdot 4^{L})$ where $L$ is the word length (4 directions per step, minus the cell you came from), and recursion depth is $O(L)$.
```
#### What This Part Should Cover
- A correct backtracking DFS that starts from every viable origin cell and respects the 4-directional adjacency.
- Correct, restored visited-state on backtrack (no leaked mutations to the board).
- Stated time/space complexity in terms of board size and word length, with the branching-factor reasoning.
### Part 2
As a follow-up, you are given a **list of words** `words`. Return **all distinct words** from the list that can be formed on the board under the same adjacency and no-reuse rules. The list can be large and many words share common prefixes, so design for that — running Part 1 independently for every word is too slow.
```hint Where to start
Re-running the Part 1 DFS once per word re-walks shared prefixes again and again. Index the *dictionary* into a structure that lets one board traversal match many words at once.
```
```hint Data structure
Build a **trie (prefix tree)** from `words`. Then DFS the board once, advancing a trie pointer in lockstep with the path: only continue into a neighbor if the trie has a child for that letter. A node marked as a word-end means you've matched a full word — record it.
```
```hint Making it fast and avoiding duplicates
Store the full word at the trie's terminal node so you can add it directly to a result set on a hit. Two key optimizations: (1) after collecting a word, set its terminal flag to `None`/false so it isn't reported twice; (2) **prune the trie** — remove leaf nodes with no children after a match so dead branches stop being explored. This bounds work by what's still reachable rather than the whole dictionary.
```
#### What This Part Should Cover
- A trie built from the word list, with each terminal node carrying the matched word (or enough to reconstruct it).
- A single board DFS that advances board position and trie pointer together, only recursing where the trie has a matching child.
- Deduplication of results and a correctness argument that one traversal covers all words sharing a prefix.
- Complexity discussion: trie build cost, and why the board DFS bound depends on the trie/branching rather than `words.length × per-word DFS`.
### What a Strong Answer Covers
Across both parts, a strong answer demonstrates the same core grid-search skill scaling from one word to many:
- Recognizing Part 2 as a generalization of Part 1, and *why* a per-word loop is wasteful when prefixes are shared (the trie collapses shared prefix work).
- Clean handling of the shared edge cases: empty board, empty word list, duplicate words, words longer than `m × n` (impossible — can prune), and words containing letters absent from the board.
- Honest complexity analysis for both parts, including the exponential branching factor of grid DFS and how trie pruning tightens the practical bound.
- Correct visited/backtracking discipline so no state leaks between paths or words.
### Follow-up Questions
- In Part 2, how does pruning the trie (removing matched/leaf nodes) change the worst-case versus the practical running time?
- If the board were extremely large but the dictionary small, would you still prefer the trie approach, or index the board instead (e.g. precompute letter positions)?
- How would you parallelize Part 2 across many starting cells or many board regions, and what shared state would need synchronization?
- Suppose words could reuse a cell (no no-reuse constraint) and had bounded length — how would the search and complexity change?
Quick Answer: This question evaluates algorithmic problem-solving in grid traversal and multi-word string search, targeting competencies in pathfinding, string matching, efficient use of data structures for shared prefixes, and time/space complexity analysis within the Coding & Algorithms domain.
Word Search (Part 1) — Single Word in a Grid
You are given a 2D grid `board` of lowercase English letters with `m` rows and `n` columns, and a single target `word`. A word can be constructed from letters of sequentially adjacent cells, where adjacent cells are horizontally or vertically neighboring (no diagonals). The same grid cell may not be used more than once within a single word's path.
Return `true` if `word` can be formed by a valid path in the board, otherwise `false`.
Launch a depth-first backtracking search from every cell whose letter equals `word[0]`, walking only into the 4 orthogonal neighbors that match the next character. Mark a cell as used before recursing and restore it on backtrack so it can be reused by other paths. Prune the moment a letter mismatches or you step off the board; a word longer than `m * n` cells is impossible.
Edge cases: empty board, empty word (trivially matchable), word longer than the number of cells, and words containing letters absent from the board.
Constraints
- 1 <= m, n <= 12
- board contains only lowercase English letters a-z
- word length up to ~10, lowercase letters only
- a cell may be used at most once per word path; freely available again on backtrack
Examples
Input: ([['a','b','c','e'],['s','f','c','s'],['a','d','e','e']], 'abcced')
Expected Output: True
Explanation: Path a(0,0)->b(0,1)->c(0,2)->c(1,2)->e(2,2)->d(2,1) spells 'abcced'.
Input: ([['a','b','c','e'],['s','f','c','s'],['a','d','e','e']], 'see')
Expected Output: True
Explanation: Path s(1,3)->e(2,3)->e(2,2) spells 'see'.
Input: ([['a','b','c','e'],['s','f','c','s'],['a','d','e','e']], 'abcb')
Expected Output: False
Explanation: After a->b->c, the only 'b' is the starting cell which is already used, so no valid path.
Input: ([['a']], 'a')
Expected Output: True
Explanation: Single matching cell.
Input: ([['a']], 'b')
Expected Output: False
Explanation: No cell equals 'b'.
Input: ([['a','b'],['c','d']], 'abdc')
Expected Output: True
Explanation: a(0,0)->b(0,1)->d(1,1)->c(1,0) forms an orthogonal path.
Input: ([['a','b'],['c','d']], 'abcd')
Expected Output: False
Explanation: After a->b, 'c' is not orthogonally adjacent to b within an unused-cell path leading to 'd'.
Input: ([['a','a']], 'aaa')
Expected Output: False
Explanation: Only two cells exist; a length-3 word cannot avoid reusing a cell.
Hints
- This is a path search in a grid: run a DFS (backtracking) from every cell whose letter equals word[0], walking to the 4 neighbors that match the next character.
- Avoid reusing a cell within the current path: either keep a visited matrix, or mutate the board cell to a sentinel '#' before recursing and restore it on backtrack to save space.
- Prune aggressively: bail the moment a letter mismatches or you step off the board, and early-return False when len(word) > m * n.
Word Search II (Part 2) — Match a Dictionary on the Grid
Generalization of Part 1. You are given the same kind of 2D `board` and a list of `words`. Return all distinct words from the list that can be formed on the board under the same adjacency (4-directional) and no-reuse (each cell at most once per word) rules. The result must not contain duplicates; input order need not be preserved.
Running the Part 1 DFS once per word re-walks shared prefixes repeatedly. Instead, index the dictionary into a trie (prefix tree), then DFS the board once, advancing a trie pointer in lockstep with the path — only step into a neighbor if the trie has a child for that letter. Store the full word at each terminal node so a hit records it directly. Two key optimizations: (1) pop/clear a word's terminal flag after collecting it so it is reported at most once, and (2) prune the trie by removing nodes that have no surviving children after a match, so exhausted branches stop being explored.
This verified reference returns the matched words sorted, for a deterministic result.
Edge cases: empty board, empty word list, duplicate words in the list, words longer than m*n, and words with letters absent from the board.
Constraints
- 1 <= m, n <= 12 (board up to roughly 12 x 12)
- word list up to ~3 * 10^4 words; each word length up to ~10
- board and words contain only lowercase English letters a-z
- the word list may contain duplicates; the result must not
- each word is matched independently — a cell may serve different words
Examples
Input: ([['o','a','a','n'],['e','t','a','e'],['i','h','k','r'],['i','f','l','v']], ['oath','pea','eat','rain'])
Expected Output: ['eat', 'oath']
Explanation: 'oath' = o(0,0)->a(0,1)->t(1,1)->h(2,1); 'eat' = e(1,0)->a(1,1)? no — e(1,3)->a(0,3)? The classic example yields {oath, eat}; 'pea' and 'rain' cannot be formed.
Input: ([['a','b'],['c','d']], ['abcb'])
Expected Output: []
Explanation: 'abcb' needs a second 'b' which would reuse the only 'b' cell; no match.
Input: ([['a']], ['a'])
Expected Output: ['a']
Explanation: Single cell matches the single word.
Input: ([['a','b'],['c','d']], ['ab','cd','ac','abdc','abcd'])
Expected Output: ['ab', 'abdc', 'ac', 'cd']
Explanation: 'ab','cd','ac' are adjacent pairs; 'abdc' = a->b->d->c is a valid path; 'abcd' is not formable.
Input: ([['a','a']], ['a','aa','aaa'])
Expected Output: ['a', 'aa']
Explanation: 'a' and 'aa' fit the two cells; 'aaa' would reuse a cell.
Input: ([['c','o','d'],['x','d','e'],['x','x','r']], ['code','coder','codes','cod'])
Expected Output: ['cod', 'code', 'coder']
Explanation: Shared prefix 'cod' is walked once via the trie: 'cod','code','coder' all match along c(0,0)->o(0,1)->d(0,2)->e(1,2)->r(2,2); 'codes' has no reachable 's'.
Input: ([['a','b'],['c','d']], [])
Expected Output: []
Explanation: Empty word list returns empty.
Input: ([['a','b'],['c','d']], ['ab','ab','ab'])
Expected Output: ['ab']
Explanation: Duplicate words collapse in the trie and are reported once.
Hints
- Re-running the Part 1 DFS once per word re-walks shared prefixes. Index the dictionary into a structure that lets one board traversal match many words at once.
- Build a trie from words, then DFS the board once advancing a trie pointer in lockstep with the path: only recurse into a neighbor when the trie has a child for that letter. A terminal node means a full word matched.
- Store the full word at the trie's terminal node and add it on a hit. Then (1) clear/pop the terminal after collecting so it isn't reported twice, and (2) prune leaf trie nodes with no children after a match so dead branches stop being explored.