Develop Auto-Complete System for Dish Suggestions
Company: Pinterest
Role: Data Scientist
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Onsite
Quick Answer: This question evaluates a candidate's competency in building efficient auto-complete systems, covering prefix-based search, string processing, ranking by popularity, and handling real-time updates and performance trade-offs.
Constraints
- 0 <= len(init) <= 50000
- 0 <= len(operations) <= 50000
- Dish names and prefixes use lowercase letters 'a'-'z' only
- 1 <= len(dish) <= 60
- 0 <= score <= 1e9
- For any query, 0 <= k <= 1000
- If fewer than k matches exist, return all matches
- Results are ordered by score descending, then name ascending
- Updates take effect immediately for subsequent queries
- If a dish appears multiple times in init, the last score wins
Solution
def autocomplete_system(init, operations):
class Node:
__slots__ = ("children", "end", "word")
def __init__(self):
self.children = {}
self.end = False
self.word = None
root = Node()
scores = {}
# Initialize scores (last occurrence wins)
for pair in init:
word, score = pair[0], int(pair[1])
scores[word] = score
def insert(word):
curr = root
for ch in word:
if ch not in curr.children:
curr.children[ch] = Node()
curr = curr.children[ch]
curr.end = True
curr.word = word
# Build trie from unique words
for w in scores.keys():
insert(w)
def find_node(prefix):
curr = root
for ch in prefix:
if ch not in curr.children:
return None
curr = curr.children[ch]
return curr
def collect_words(node):
res = []
stack = [node]
while stack:
nd = stack.pop()
if nd.end and nd.word in scores:
res.append(nd.word)
if nd.children:
for child in nd.children.values():
stack.append(child)
return res
outputs = []
for op in operations:
if not op:
continue
kind = op[0]
if kind == "add":
word = op[1]
score = int(op[2])
existed = word in scores
scores[word] = score
if not existed:
insert(word)
elif kind == "query":
prefix = op[1]
k = int(op[2])
if k <= 0:
outputs.append([])
continue
node = find_node(prefix)
if node is None:
outputs.append([])
continue
words = collect_words(node)
cands = [(scores[w], w) for w in words]
cands.sort(key=lambda x: (-x[0], x[1]))
outputs.append([w for _, w in cands[:k]])
else:
# Unknown operation type: ignore
pass
return outputs
Explanation
Time complexity: Initialization: O(S) where S is total length of unique dish names. add: O(L) for new dish (L = length), O(1) for score update of existing dish. query: O(P + M log M), where P is prefix length and M is the number of matching dishes under the prefix.. Space complexity: O(S + N), where S is total characters stored in the trie and N is the number of unique dishes for the score map..
Hints
- Build a trie of dish names; maintain a separate hash map from dish to score for O(1) score updates.
- For query(prefix, k), navigate to the trie node of the prefix, then traverse its subtree to find terminal words.
- Rank candidates by (-score, name) to achieve score-desc, name-asc ordering.
- To optimize, maintain a size-k min-heap during traversal to keep only the best k seen so far.
- Ensure updates only change the score map; insert into the trie only when a new dish name appears.