Compute shortest recursive string encoding
Company: Snapchat
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Take-home Project
Quick Answer: This question evaluates proficiency in string manipulation, pattern detection, recursion, and dynamic programming for efficient sequence compression in the Coding & Algorithms domain.
Constraints
- 1 <= len(s) <= 150 (typical interview / LC 471 bound)
- s consists of lowercase English letters only
- k in k{pattern} is always greater than 1; an encoding is used only when strictly shorter
- Nested encodings are permitted: the pattern inside braces may itself be encoded
Examples
Input: ("aaa",)
Expected Output: "aaa"
Explanation: '3{a}' is 4 chars, longer than the 3-char original, so the string is returned unchanged.
Input: ("aaaaa",)
Expected Output: "5{a}"
Explanation: Five 'a's compress to '5{a}' (4 chars < 5).
Input: ("aaaaaaaaaa",)
Expected Output: "10{a}"
Explanation: Ten 'a's compress to '10{a}' (5 chars). The full-string repetition is preferred over a length-tied split like 'a9{a}'.
Input: ("aabcaabcd",)
Expected Output: "2{aabc}d"
Explanation: 'aabc' repeats twice then a trailing 'd': '2{aabc}d' (8 chars < 9).
Input: ("abbbabbbcabbbabbbc",)
Expected Output: "2{2{abbb}c}"
Explanation: 'abbbabbbc' repeats twice; that unit itself is 'abbb' twice plus 'c' → '2{abbb}c', giving the nested '2{2{abbb}c}'.
Input: ("",)
Expected Output: ""
Explanation: Empty string edge case returns empty.
Input: ("a",)
Expected Output: "a"
Explanation: Single character cannot be shortened.
Input: ("abc",)
Expected Output: "abc"
Explanation: No repeated structure, so the original is returned unchanged.
Input: ("aaaa",)
Expected Output: "aaaa"
Explanation: Boundary case: '4{a}' is also 4 chars, not strictly shorter, so 'aaaa' is kept.
Input: ("bbbbbbbbbbbb",)
Expected Output: "12{b}"
Explanation: Twelve 'b's compress to '12{b}' (5 chars < 12); a two-digit count still encodes correctly.
Hints
- Think interval DP: let dp[i][j] be the shortest encoding of the substring s[i..j]. Combine answers from splitting at every interior point.
- To test whether a substring is a pure repetition of some unit, use the classic trick: a string t repeats iff t is a substring of (t + t) starting at an index in (0, len(t)). Or just check each divisor length of len(t).
- When a substring is a repetition k copies of `unit`, its candidate encoding is str(k) + '{' + encode(unit) + '}' — recurse on the unit so nested patterns are captured.
- Only replace with k{pattern} when it is STRICTLY shorter than the literal text; otherwise keep the literal (that is why 'aaaa' stays 'aaaa' but 'aaaaa' becomes '5{a}').
- Evaluate the full-string repetition candidate before the split candidates so that on length ties the cleaner full-string form (e.g. '10{a}') is preferred over a split like 'a9{a}'.