Implement run-length encoding and decoding
Company: Databricks
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Onsite
You are given a string consisting of lowercase English letters. You need to implement run-length encoding (RLE) and its corresponding decoding.
1. **Encoding**
- Input: a string `s`, e.g., `"aaabccccd"`.
- Output: an encoded string where each maximal group of consecutive identical characters is replaced by the character followed by the count of repetitions, e.g., `"a3b1c4d1"`.
2. **Decoding**
- Input: an RLE-encoded string as described above, which is always valid (i.e., alternates between characters and positive integer counts), e.g., `"a3b1c4d1"`.
- Output: the original string, e.g., `"aaabccccd"`.
Constraints:
- `0 <= len(s) <= 10^5` for both encoding and decoding inputs.
- Counts in the encoded string fit in a 32-bit signed integer.
Tasks:
- Describe how to implement `encode(String s)` and `decode(String encoded)`.
- Explain the time and space complexity of each function.
- Consider edge cases such as empty strings and runs of length 1.
Quick Answer: This question evaluates proficiency in string manipulation and encoding algorithms, specifically run-length encoding and decoding, along with attention to edge cases such as empty inputs and single-character runs and correct handling of repetition counts.
Part 1: Implement Run-Length Encoding
Given a string `s` containing only lowercase English letters, compress it using run-length encoding. Replace each maximal group of consecutive identical characters with the character followed by the number of times it appears consecutively.
For example, `aaabccccd` becomes `a3b1c4d1`.
If the input string is empty, return an empty string.
Constraints
- 0 <= len(s) <= 10^5
- `s` contains only lowercase English letters
- Run counts fit in a 32-bit signed integer
Examples
Input: 'aaabccccd'
Expected Output: 'a3b1c4d1'
Explanation: The runs are `aaa`, `b`, `cccc`, and `d`.
Input: ''
Expected Output: ''
Explanation: An empty string encodes to an empty string.
Input: 'a'
Expected Output: 'a1'
Explanation: A single character forms one run of length 1.
Input: 'abc'
Expected Output: 'a1b1c1'
Explanation: Each character is different, so each run has length 1.