Design and implement an in-memory key-value store with a 'medium' layer that serializes the store to bytes for persistence. Provide four functions: set(key, value), get(key), shutdown(), and restore(). On shutdown(), flush all data to the medium; on restore(), rebuild state from the medium. Specify:
(a) the medium interface for writing/reading bytes,
(b) the serialization format and handling of arbitrary values,
(c) error handling and data integrity (atomic flush, crash consistency),
(d) complexity and performance considerations, and
(e) minimal tests and example usage.
Quick Answer: This question evaluates a candidate's ability to design persistent storage and in-memory data structures, covering serialization of arbitrary values, medium/interface design for byte-level persistence, crash-consistent flushing, error handling, and performance and complexity analysis.
Implement a **crash-consistent, in-memory key-value store** that can persist its full state to an append-only byte medium and recover from the most recent valid snapshot — even after an interrupted write.
You are given a list of **operations** to apply in order. Process them one at a time and return a list containing the result of each operation (in the same order).
## Function
```python
def solution(operations):
...
```
- **`operations`** — a list of tuples, each describing one command (see below).
- **Returns** — a list with one entry per operation, holding that operation's result.
## Operations
Each element of `operations` is a tuple whose first item is the command name:
| Operation | Behavior | Result appended |
|-----------|----------|-----------------|
| `('set', key, value)` | Store or overwrite `value` under `key` in memory. | `None` |
| `('get', key)` | Return the current in-memory value for `key`. | the stored value, or `None` if the key is missing |
| `('shutdown',)` | Flush the **entire** current store to the medium as one complete snapshot frame. | `True` |
| `('restore',)` | Rebuild in-memory state from the medium's most recent valid snapshot. | `True` if a valid snapshot was loaded, otherwise `False` |
| `('crash',)` | Judge-only crash simulation (see below). | `None` |
- **Keys** are strings.
- **Values** are JSON-serializable Python literals: `None`, `bool`, `int`, `float`, `str`, `list`, and `dict` with string keys.
## The medium
The medium is an internal **append-only byte buffer** owned by your solution. It behaves like:
- `append(bytes)` — add bytes to the end.
- `read_all() -> bytes` — return everything written so far.
Earlier bytes are never modified or removed; new frames are only appended.
## Snapshot frame format
Each `shutdown` writes exactly **one complete frame** that captures the whole store at that moment. A frame is the concatenation of:
1. **MAGIC** — 4 bytes, always `b'KVS1'`.
2. **LEN** — 12 ASCII digits, zero-padded, giving the payload's byte length.
3. **CHECKSUM** — 64 lowercase hex characters, the SHA-256 of the payload.
4. **PAYLOAD** — the UTF-8 bytes of:
```python
json.dumps(store, sort_keys=True, separators=(',', ':'), ensure_ascii=False)
```
The header (MAGIC + LEN + CHECKSUM) is a fixed **80 bytes**, followed by the variable-length payload.
## Crash consistency
- `shutdown` is **atomic at the frame level**: a fully written frame is valid, and any partially written trailing frame must never corrupt previously persisted data.
- `restore` scans the medium **left to right from the beginning**, validating each frame in turn:
- Verify the **MAGIC** matches.
- Verify **LEN** is all digits and the declared payload does not run past the end of the buffer.
- Verify the recomputed SHA-256 of the payload equals the stored **CHECKSUM**.
- Verify the payload decodes as UTF-8 and parses to a JSON object (`dict`).
- Scanning stops at the **first** frame that fails any check (e.g. a truncated tail or a bad checksum); that invalid tail is ignored.
- `restore` then loads the **last valid snapshot** seen before the stop. If none was valid, it **clears memory** and returns `False`.
Because the medium is append-only and only a `crash` produces a malformed (always trailing) frame, every earlier frame stays intact, and the scan lands on the most recent good snapshot.
## Crash simulation (`crash`)
The `('crash',)` operation simulates an interrupted `shutdown`: it builds the frame for the **current** in-memory store but appends only the **first half** of that frame's bytes (`frame[:len(frame) // 2]`), leaving a truncated trailing frame. It returns `None`. This command is not part of the public API but your processor must support it for the tests.
## Examples
- `[('set', 'a', 1), ('get', 'a'), ('shutdown',), ('set', 'a', 2), ('get', 'a'), ('restore',), ('get', 'a')]`
→ `[None, 1, True, None, 2, True, 1]`
The `set 'a' 2` is overwritten in memory, but `restore` reloads the snapshot where `a == 1`.
- `[('set', 'x', 10), ('shutdown',), ('set', 'x', 99), ('crash',), ('set', 'y', 5), ('restore',), ('get', 'x'), ('get', 'y')]`
→ `[None, True, None, None, None, True, 10, None]`
The `crash` writes a truncated frame; `restore` skips it and recovers the earlier good snapshot (`x == 10`, no `y`).
- `[('restore',), ('get', 'missing')]` → `[False, None]`
No snapshot exists, so `restore` returns `False` and clears memory.
## Constraints
- `0 <= len(operations) <= 10^4`
- All keys are strings with length between 1 and 100.
- All values are JSON-serializable Python literals.
- Total serialized bytes written across all `shutdown`/`crash` operations is at most `10^6`.
Constraints
- 0 <= len(operations) <= 10^4
- All keys are strings with length between 1 and 100
- All values are JSON-serializable Python literals
- Total serialized bytes written across all shutdown/crash operations is at most 10^6
Examples
Input: [('set', 'a', 1), ('get', 'a'), ('shutdown',), ('set', 'a', 2), ('get', 'a'), ('restore',), ('get', 'a')]
Expected Output: [None, 1, True, None, 2, True, 1]
Explanation: The value 1 is persisted on shutdown. The later in-memory change to 2 is lost after restore because it was never flushed.
Input: [('set', 'user', {'name': 'Ada', 'tags': ['math', True], 'meta': {'age': 36}}), ('shutdown',), ('set', 'user', {'name': 'Grace'}), ('shutdown',), ('restore',), ('get', 'user')]
Expected Output: [None, True, None, True, True, {'name': 'Grace'}]
Explanation: Two valid snapshots are written. restore() loads the most recent valid one.
Input: [('set', 'x', 10), ('shutdown',), ('set', 'x', 99), ('crash',), ('set', 'y', 5), ('restore',), ('get', 'x'), ('get', 'y')]
Expected Output: [None, True, None, None, None, True, 10, None]
Explanation: The crash writes only half of the newer frame, so restore() ignores that invalid tail and falls back to the earlier valid snapshot where x = 10.
Input: [('restore',), ('get', 'missing')]
Expected Output: [False, None]
Explanation: With no valid snapshot in the medium, restore() returns False and the store remains empty.
Input: []
Expected Output: []
Explanation: No operations means no results.
Hints
- Use a self-describing frame: fixed magic bytes, a fixed-width length field, and a checksum before the payload.
- During restore, keep track of the last valid snapshot you parsed so that a truncated final write does not destroy previously persisted state.