Implement a binary encoding and decoding scheme for strings based on character frequencies (Huffman-like coding).
You must implement two functions:
encode(s)
→ returns
(bitstring, tree)
decode(bitstring, tree)
→ returns the original string
Given an input string s:
0
1
bitstring
is the concatenation of the codes for each character in
s
.
To make the encoding deterministic when multiple nodes have equal weight, use this tie-break rule:
(weight, key)
.
key = character
(lexicographic order).
key = the smallest character contained in its subtree
.
(weight, key)
becomes the
left
child (gets edge
0
).
Given (bitstring, tree), decode by traversing the tree from the root:
0
means go to the left child; bit
1
means go to the right child.
s
consists of arbitrary ASCII characters.
'0'
and
'1'
.
s
is empty, encoding returns
("", null)
and decoding returns
""
.
s
contains only one distinct character (e.g.,
"aaaa"
), assign that character the code
"0"
(so encoding is
"0000"
).
0 ≤ |s| ≤ 2 * 10^5
σ ≤ 256
O(|s| + σ log σ)
time and
O(σ)
extra space (excluding output).