Group anagrams and count string in grid
Company: J.P. Morgan
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Take-home Project
Quick Answer: This question evaluates algorithmic problem-solving skills in string processing, hashing/grouping and grid traversal/backtracking, along with the ability to analyze time and space complexity and trade-offs.
Part 1: Group Anagrams
Constraints
- 0 <= len(words) <= 10^4
- 0 <= len(words[i]) <= 100
- Each word contains only lowercase English letters
- Duplicate strings are allowed
Examples
Input: (['eat', 'tea', 'tan', 'ate', 'nat', 'bat'],)
Expected Output: ([['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']], 3)
Explanation: 'eat', 'tea', and 'ate' are anagrams; 'tan' and 'nat' are anagrams; 'bat' stands alone. Group order follows first appearance in the input.
Input: ([],)
Expected Output: ([], 0)
Explanation: An empty input produces no groups.
Input: (['abc', 'bca', 'xyz', 'cab', 'zyx', 'foo'],)
Expected Output: ([['abc', 'bca', 'cab'], ['xyz', 'zyx'], ['foo']], 3)
Explanation: The signatures for 'abc', 'xyz', and 'foo' are first seen in that order, so their groups appear in that order.
Input: (['', '', 'a'],)
Expected Output: ([['', ''], ['a']], 2)
Explanation: Empty strings are anagrams of each other because they have the same letter frequencies.
Input: (['aa', 'aa', 'ab', 'ba', 'aa'],)
Expected Output: ([['aa', 'aa', 'aa'], ['ab', 'ba']], 2)
Explanation: Duplicate words stay in their original order inside the same anagram group.
Solution
def solution(words):
groups = []
key_to_index = {}
for word in words:
counts = [0] * 26
for ch in word:
counts[ord(ch) - ord('a')] += 1
key = tuple(counts)
if key not in key_to_index:
key_to_index[key] = len(groups)
groups.append([])
groups[key_to_index[key]].append(word)
return (groups, len(groups))Time complexity: O(n * k), where n is the number of words and k is the average word length. Space complexity: O(n * k) for storing the output and signatures.
Hints
- Two words belong in the same group if they have the same character-frequency signature.
- If the result order must be deterministic, store groups in the order each signature first appears.
Part 2: Count Target in Grid
Constraints
- The grid is rectangular
- 0 <= number of rows, number of columns <= 6
- 1 <= len(target) <= 12
- Each grid cell and each character in target is an uppercase English letter
Examples
Input: (['AB', 'CA'], 'ABA')
Expected Output: 2
Explanation: There are exactly two valid paths: (0,0)->(0,1)->(1,1) and the reverse path (1,1)->(0,1)->(0,0).
Input: (['AA', 'AA'], 'AA')
Expected Output: 8
Explanation: Each of the 4 cells can start a path, and each start has 2 orthogonal neighbors, giving 4 * 2 = 8 paths.
Input: (['A'], 'A')
Expected Output: 1
Explanation: The single cell matches the target, so there is exactly one path.
Input: (['A'], 'AA')
Expected Output: -1
Explanation: The target needs two cells, but the grid has only one and cells cannot be reused.
Input: ([], 'A')
Expected Output: -1
Explanation: An empty grid cannot form any non-empty target.
Input: (['ABA', 'BAB'], 'ABA')
Expected Output: 10
Explanation: There are 10 distinct coordinate paths that spell ABA when moving only orthogonally without reusing cells.
Solution
def solution(grid, target):
if not grid or not target:
return -1
rows = len(grid)
cols = len(grid[0]) if rows > 0 else 0
if cols == 0:
return -1
if len(target) > rows * cols:
return -1
from collections import Counter
grid_counts = Counter(ch for row in grid for ch in row)
target_counts = Counter(target)
for ch, needed in target_counts.items():
if grid_counts.get(ch, 0) < needed:
return -1
visited = [[False] * cols for _ in range(rows)]
directions = [(1, 0), (-1, 0), (0, 1), (0, -1)]
def dfs(r, c, idx):
if idx == len(target) - 1:
return 1
visited[r][c] = True
total = 0
next_char = target[idx + 1]
for dr, dc in directions:
nr, nc = r + dr, c + dc
if 0 <= nr < rows and 0 <= nc < cols and not visited[nr][nc] and grid[nr][nc] == next_char:
total += dfs(nr, nc, idx + 1)
visited[r][c] = False
return total
total_paths = 0
first_char = target[0]
for r in range(rows):
for c in range(cols):
if grid[r][c] == first_char:
total_paths += dfs(r, c, 0)
return total_paths if total_paths > 0 else -1
Time complexity: O(R * C * 4^L) in the worst case, where R is the number of rows, C is the number of columns, and L is the length of the target. Space complexity: O(R * C).
Hints
- Use DFS/backtracking starting only from cells that match the first character of the target.
- Keep a visited structure so a path never reuses a cell, and prune early if the grid does not contain enough copies of some character in the target.