Group Anagrams
Company: Uber
Role: Software Engineer
Category: Coding & Algorithms
Interview Round: Onsite
# Group Anagrams
Given an array of strings `words`, group together all strings that are **anagrams** of one another and return the groups.
Two strings are anagrams if one can be obtained by rearranging the letters of the other — that is, they contain exactly the same multiset of characters (same letters with the same frequencies). Each input string consists of lowercase English letters and may be empty.
Return a list of groups, where each group is the list of strings that are mutual anagrams. The empty string `""` forms its own anagram group. The order of the groups in the output, and the order of strings within each group, does **not** matter.
### Input / Output
- Input: an array of strings `words`.
- Output: a list of lists of strings, partitioning `words` into anagram groups. Every input string appears in exactly one group (duplicates that appear multiple times in the input should appear the same number of times across the output).
### Example 1
```
Input: ["eat", "tea", "tan", "ate", "nat", "bat"]
Output: [["eat", "tea", "ate"], ["tan", "nat"], ["bat"]]
```
(Any ordering of the groups and of the members within a group is accepted.)
### Example 2
```
Input: [""]
Output: [[""]]
```
### Example 3
```
Input: ["a"]
Output: [["a"]]
```
### Constraints
- `1 <= words.length <= 10^4`
- `0 <= words[i].length <= 100`
- `words[i]` consists of lowercase English letters only.
### Follow-up
- State and justify the time and space complexity of your approach.
- Explain how you would adapt the solution if the input did not fit in memory (e.g., billions of strings streamed from disk), and what the dominant cost becomes when the strings are very long.
Quick Answer: This question evaluates a candidate's ability to design an efficient grouping algorithm using hashing and string canonicalization, a staple category in coding interviews. It tests practical application of data structures for classifying items by a normalized key, along with the ability to reason about time and space complexity at scale.
Given an array of strings `words`, group together all strings that are anagrams of one another and return the groups.
Two strings are anagrams if one can be obtained by rearranging the letters of the other (they contain exactly the same multiset of characters). Each input string consists of lowercase English letters and may be empty; the empty string `""` forms its own anagram group.
Return a list of groups, where each group is the list of strings that are mutual anagrams. Every input string appears in exactly one group, and duplicate input strings appear the same number of times across the output.
Note for this console: although any ordering of groups (and of members within a group) is a valid answer to the interview question, this executable challenge expects a **deterministic** ordering so results can be checked automatically — emit the groups in order of first appearance of their key, and keep the strings within each group in their original input order.
Example:
Input: ["eat", "tea", "tan", "ate", "nat", "bat"]
Output: [["eat", "tea", "ate"], ["tan", "nat"], ["bat"]]
Constraints
- 1 <= words.length <= 10^4
- 0 <= words[i].length <= 100
- words[i] consists of lowercase English letters only.
- The empty string "" is a valid input and forms its own anagram group.
Examples
Input: (["eat", "tea", "tan", "ate", "nat", "bat"],)
Expected Output: [["eat", "tea", "ate"], ["tan", "nat"], ["bat"]]
Explanation: 'eat','tea','ate' share the key 'aet'; 'tan','nat' share 'ant'; 'bat' is alone. Groups appear in first-appearance order.
Input: ([""],)
Expected Output: [[""]]
Explanation: A single empty string forms its own group.
Input: (["a"],)
Expected Output: [["a"]]
Explanation: A single-character string forms a group of one.
Input: (["abc", "bca", "cab"],)
Expected Output: [["abc", "bca", "cab"]]
Explanation: All three are anagrams (key 'abc'), so they land in one group.
Input: (["a", "a"],)
Expected Output: [["a", "a"]]
Explanation: Duplicate strings are anagrams of each other and both appear in the same group.
Input: (["", "a", ""],)
Expected Output: [["", ""], ["a"]]
Explanation: The two empty strings group together (key ''); 'a' forms its own group. The empty-string group appears first because '' was seen first.
Input: (["ab", "ba", "abc", "cba", "x"],)
Expected Output: [["ab", "ba"], ["abc", "cba"], ["x"]]
Explanation: 'ab'/'ba' (key 'ab'), 'abc'/'cba' (key 'abc'), and 'x' form three groups in first-appearance order.
Hints
- Two strings are anagrams iff their sorted character sequences are equal — so the sorted string can serve as a group key.
- Use a hash map from the canonical key (sorted string, or a 26-length character count) to the list of original strings that map to it.
- To get a deterministic output, rely on insertion order: append each word to its key's bucket and return the buckets in the order their keys were first seen.
- For very long strings, a character-frequency key (O(K)) avoids the O(K log K) sort cost per string.