Design a snapshotable key-value store
Company: Apple
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Technical Screen
Quick Answer: This interview question evaluates algorithm design, data structures, correctness, complexity, edge cases, and implementation details in a realistic interview setting. A strong answer for Design a snapshotable key-value store states assumptions, handles edge cases, explains trade-offs, and shows how to validate the result clearly.
Constraints
- 0 <= number of operations <= 10^5
- Keys and values are non-empty strings (except absent reads, which return null/None).
- snapshotId values are exactly the strings returned by prior Snapshot ops: "snap-0", "snap-1", ...
- A GS with an unknown snapshotId returns null/None.
- Reads of a key that was never set (or not yet set at the snapshot's time) return null/None.
Examples
Input: ([["S", "a", "a-foo"], ["S", "b", "b-foo"], ["Snapshot"], ["S", "a", "a-foo-prime"], ["G", "a"], ["GS", "snap-0", "a"], ["G", "b"], ["GS", "snap-0", "b"]],)
Expected Output: ["snap-0", "a-foo-prime", "a-foo", "b-foo", "b-foo"]
Explanation: The prompt's example. Snapshot returns 'snap-0'. After overwriting 'a', G("a")='a-foo-prime' (current) but GS('snap-0','a')='a-foo' (frozen at the snapshot). 'b' was never overwritten, so both G and GS return 'b-foo'.
Input: ([["G", "x"]],)
Expected Output: [null]
Explanation: Reading a key that was never set returns null/None.
Input: ([],)
Expected Output: []
Explanation: No operations -> no outputs.
Input: ([["S", "k", "v1"], ["S", "k", "v2"], ["Snapshot"], ["S", "k", "v3"], ["S", "k", "v4"], ["Snapshot"], ["GS", "snap-0", "k"], ["GS", "snap-1", "k"], ["G", "k"]],)
Expected Output: ["snap-0", "snap-1", "v2", "v4", "v4"]
Explanation: Consecutive writes collapse: snap-0 captures only the last value before it ('v2'), snap-1 captures 'v4'. The current value is 'v4'.
Input: ([["Snapshot"], ["Snapshot"], ["GS", "snap-0", "missing"], ["GS", "snap-1", "missing"], ["GS", "bad-id", "missing"]],)
Expected Output: ["snap-0", "snap-1", null, null, null]
Explanation: Two consecutive snapshots with no writes capture the same (empty) state. A never-set key returns null, and an unknown snapshotId ('bad-id') also returns null.
Input: ([["S", "a", "1"], ["Snapshot"], ["S", "a", "2"], ["Snapshot"], ["GS", "snap-0", "a"], ["GS", "snap-1", "a"], ["S", "a", "3"], ["G", "a"], ["GS", "snap-1", "a"]],)
Expected Output: ["snap-0", "snap-1", "1", "2", "3", "2"]
Explanation: Snapshots isolate history: snap-0='1', snap-1='2'. A later write of '3' changes the current value (G='3') but leaves snap-1 untouched (still '2').
Input: ([["S", "only", "val"], ["Snapshot"], ["GS", "snap-0", "only"], ["GS", "snap-0", "nope"], ["G", "only"]],)
Expected Output: ["snap-0", "val", null, "val"]
Explanation: GS for an existing key returns its snapshot value ('val'); GS for a key absent at the snapshot returns null; the current G still returns 'val'.
Hints
- Don't deep-copy the whole map on every Snapshot - that is O(keys) space per snapshot. Keep history per key instead (copy-on-write).
- Maintain a global 'era' counter that increments on each Snapshot. A write tags its value with the current era; a read 'as of snapshot s' wants the latest value whose era is <= the era when s was taken.
- Store each key's changes as a sorted list of (era, value). Binary search (bisect_right - 1) gives the value visible at any era in O(log u).
- Map each returned snapshotId to the era value it captured, so GS can translate the opaque id back into an era before searching.