Decode and Explain Ambiguity in Compression Strings
Company: Pinterest
Role: Data Scientist
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Onsite
Quick Answer: This question evaluates parsing of compact count-value encodings, reasoning about ambiguous encodings, and combinatorial enumeration of all valid decodings.
Constraints
- 1 <= len(s) <= 30
- s contains only digits '0'..'9'
- Each pair is: count (positive integer, no leading zeros) followed by value (single digit 0..9)
- Return all expanded arrays sorted lexicographically (standard list ordering)
- If no valid partition exists, return []
- In test data, the total number of integers across all returned arrays will not exceed 1e5
Examples
Input: 12
Expected Output:
Input: 20
Expected Output:
Solution
from typing import List, Tuple
from functools import lru_cache
def decode_all(s: str) -> List[List[int]]:
n = len(s)
@lru_cache(maxsize=None)
def dfs(i: int):
# Returns a list of tuples, where each tuple is an expanded decoding of s[i:]
if i == n:
return [()] # one valid way: empty tail
if i > n - 2:
return [] # not enough digits left to form a pair (count + value)
if s[i] == '0':
return [] # count cannot start with '0'
res = []
# j is the index of the value digit; count is s[i:j]
for j in range(i + 1, n):
count_str = s[i:j]
if count_str[0] == '0':
break # no leading zeros; longer counts will also start with '0'
count = int(count_str)
value = int(s[j])
for tail in dfs(j + 1):
res.append((value,) * count + tail)
return res
expansions = dfs(0)
unique = sorted(set(expansions)) # dedupe and sort lexicographically
return [list(t) for t in unique]
Explanation
Time complexity: O(U + R), where U is the number of DFS states explored (at most O(n^2)) and R is the total number of integers produced across all returned arrays; generating the expansions dominates.. Space complexity: O(M + R), where M is memoization overhead (number of dfs states) and R is the total size of all returned arrays..
Hints
- Use DFS/backtracking over the index. At each index, try all possible count lengths (1..k) as long as at least one digit remains for the value.
- Disallow counts that start with '0'.
- Memoize by index to avoid recomputing decodings for the same suffix.
- Combine the current (count,value) expansion with all decodings of the remaining suffix.
- Sort the final results lexicographically for deterministic output.