This question evaluates understanding of data compression algorithms, bit-level manipulation, two's‑complement integer representation, streaming algorithm design, and encoder/decoder state management for 32-bit signed integers.
You are implementing a simple compression scheme for sequences of 32‑bit signed integers. The codec should support two encoding strategies:
The codec is streaming: values arrive one by one, and the encoder should buffer them into blocks and choose an encoding strategy per block.
Design a compressed representation made of blocks. For concreteness, define:
type = 'R'
value
(int32)
count
(int, number of repetitions)
count >= RLE_MIN_RUN
(e.g.,
RLE_MIN_RUN = 3
). Shorter runs should not be encoded as RLE.
count
possibly different integers, all of which fit into
bitWidth
bits in two's‑complement representation.
type = 'B'
bitWidth
(1–32)
count
(number of values)
count
values, each stored using
bitWidth
bits.
MAX_BP_BLOCK
(e.g., 128 values) for simplicity.
You can decide the in‑memory representation of a bit‑packed payload (e.g., array of 32‑bit integers where bits are tightly packed), as long as the decoder can reconstruct the original sequence exactly.
Implement an Encoder class with the following behavior:
void add(int value)
flush()
is called), it should
emit blocks
.
List<Block> flush()
that finalizes the stream, closes any open block, and returns the list of compressed blocks.
Encoding strategy constraints:
L
:
L >= RLE_MIN_RUN
, you should encode it as a single RLE block.
MAX_BP_BLOCK
.
bitWidth
for a block as the minimum number of bits needed to represent
all
its values.
The encoder must correctly handle:
Integer.MIN_VALUE
to
Integer.MAX_VALUE
).
Implement a corresponding Decoder that reconstructs the original integer sequence from a list of blocks.
class Decoder implements Iterator<Integer> {
Decoder(List<Block> blocks) { ... }
boolean hasNext();
int next();
}
hasNext()
/
next()
should expose the
original sequence of integers
in order, exactly as they were passed to
Encoder.add(...)
.
Block
and the bit‑packed payload.
Encoder.add(value)
and
Encoder.flush()
to produce a valid block sequence.
Decoder
as an iterator over the decompressed values.
Integer.MAX_VALUE
,
Integer.MIN_VALUE
, and negative numbers.
Assume you are working in an object‑oriented language (e.g., Java, C++, or similar) and focus on clean class design and correctness of the codec pair.