Implement minimal RLE encode/decode with max run K
Company: Liftoff
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Technical Screen
You are given a run-length encoding (RLE) format that compresses a string by grouping **consecutive repetitions of a short substring**.
## Encoding format
An encoded string is a concatenation of **tokens**. Each token has the form:
- `<count><pattern>`
- `count` is a positive integer written in base-10 (may be multiple digits, no `+` sign).
- `pattern` is a non-empty string of lowercase letters `a-z`.
- The decoded expansion of the token is `pattern` repeated `count` times.
Tokens are concatenated directly. Since digits appear only in `count` and `pattern` contains only letters, boundaries are determined by digit vs. letter transitions.
### Example
`"2aab3ab"` decodes to `"aab" + "aab" + "ab" + "ab" + "ab"` = `"aabaabababab"`.
## Task A — Decode
Implement:
- `decode(encoded: str) -> str`
Given a valid encoded string, return the decoded original string.
## Task B — Encode (optimized)
Implement:
- `encode(s: str, K: int) -> str`
You must produce an encoded string that decodes to exactly `s`, under the constraint:
- Every `pattern` used in any token must have length **≤ K**.
Among all valid encodings, return one with **minimum total encoded length** (fewest characters in the encoded string). If multiple shortest encodings exist, you may return any one.
### Example
For `s = "aabaabababab"`, `K = 3`, one shortest encoding is:
- `"2aab3ab"`
because it represents `2 * "aab"` followed by `3 * "ab"`, and each pattern length is ≤ 3.
## Notes / Constraints
- `s` contains only lowercase letters `a-z`.
- `1 ≤ K ≤ len(s)`.
- `count` can be larger than 9 (multi-digit).
- Your encoding must use only the token format above; i.e., you cannot emit raw uncounted literals.
Quick Answer: This question evaluates proficiency in string parsing and compression techniques, specifically run-length encoding/decoding, pattern detection, and algorithmic optimization for producing minimal-length encodings under a maximum pattern length constraint.
Decode RLE Tokens
Decode a valid string made of tokens <count><pattern>, where count is a positive integer and pattern is one or more lowercase letters up to the next digit or end of string.
Constraints
- encoded is valid
- patterns contain lowercase letters only
Examples
Input: ('2aab3ab',)
Expected Output: 'aabaabababab'
Input: ('10a',)
Expected Output: 'aaaaaaaaaa'
Input: ('1abc2d',)
Expected Output: 'abcdd'
Hints
- Digit runs start counts; letter runs form patterns.
Minimum-Length RLE Encoding with Max Pattern Length
Return a shortest encoded string that decodes to s, where every token pattern has length at most K. If several shortest encodings exist, return the lexicographically smallest encoded string.
Constraints
- s contains lowercase letters
- 1 <= K <= len(s)
Examples
Input: ('aabaabababab', 3)
Expected Output: '2aab3ab'
Input: ('aaaaa', 2)
Expected Output: '5a'
Input: ('abc', 1)
Expected Output: '1a1b1c'
Input: ('abababx', 2)
Expected Output: '3ab1x'
Hints
- Use dynamic programming over the start index.
- Try each pattern length and repeat count that matches the remaining suffix.