String Compression and Decompression (Run-Length Encoding)
Problem
Design an algorithm that compresses a string by replacing consecutive repeating characters with the character followed by the repeat count. For example:
-
Input: "aaabbc"
-
Output: "a3b2c1"
Describe both the compression and decompression procedures.
Requirements
-
Define the compression algorithm (encode) and the decompression algorithm (decode).
-
Address edge cases:
-
Single-character runs (e.g., "c" → "c1").
-
Very long inputs and runs (e.g., millions of characters).
-
Cases where compression does not help or makes the string longer.
-
Provide Big-O time and space complexity analysis.
Assumptions (make minimal, explicit choices)
-
Use a simple run-length encoding (RLE) scheme: for each run, emit a single character followed by its run length as a base-10 positive integer (no separators). Always include the count (even when it is 1), consistent with the example.
-
Characters are Unicode code points; counts are ASCII digits 0–9.
-
Input strings can include any characters, including digits; the format remains unambiguous because decoding reads exactly one character followed by one-or-more digits representing the count.