Implement persistent key-value store
Company: OpenAI
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Technical Screen
Quick Answer: This question evaluates competency in implementing an in-memory key-value store with persistence, covering concepts such as serialization, durability, data structure management, and basic I/O handling.
Constraints
- 0 <= len(operations) <= 100000
- Keys and values are strings and may be empty
- The total UTF-8 byte length of all keys and values involved in the processed snapshots is at most 10^6
- Average O(1) lookup/update is expected for set/get
Examples
Input: [('set', 'a', '1'), ('set', 'b', '2'), ('get', 'a'), ('shutdown',), ('get', 'a'), ('restore',), ('get', 'b'), ('get', 'a')]
Expected Output: ['1', None, '2', '1']
Explanation: After shutdown the in-memory store is cleared, so 'a' is missing until restore reloads the saved snapshot.
Input: [('set', 'x', 'old'), ('set', 'x', 'new'), ('shutdown',), ('restore',), ('get', 'x'), ('set', 'y', '3'), ('shutdown',), ('get', 'y'), ('restore',), ('get', 'y'), ('get', 'x')]
Expected Output: ['new', None, '3', 'new']
Explanation: The second set overwrites 'x'. The second shutdown saves a new snapshot containing both 'x' and 'y'.
Input: [('restore',), ('get', 'missing'), ('set', '', ''), ('set', 'a|b:c', 'v1|v2:ok'), ('shutdown',), ('restore',), ('get', ''), ('get', 'a|b:c'), ('get', 'missing')]
Expected Output: [None, '', 'v1|v2:ok', None]
Explanation: This checks restore before any shutdown, empty strings, and keys/values containing characters that would break delimiter-based serialization.
Input: [('shutdown',), ('restore',), ('get', 'a')]
Expected Output: [None]
Explanation: Shutting down an empty store should still produce a valid empty snapshot that restores to an empty map.
Solution
def solution(operations):
def write_int(n):
return bytes([(n >> 24) & 255, (n >> 16) & 255, (n >> 8) & 255, n & 255])
def read_int(data, index):
value = (data[index] << 24) | (data[index + 1] << 16) | (data[index + 2] << 8) | data[index + 3]
return value, index + 4
def serialize(store):
blob = bytearray()
blob.extend(write_int(len(store)))
for key, value in store.items():
key_bytes = key.encode('utf-8')
value_bytes = value.encode('utf-8')
blob.extend(write_int(len(key_bytes)))
blob.extend(key_bytes)
blob.extend(write_int(len(value_bytes)))
blob.extend(value_bytes)
return bytes(blob)
def deserialize(blob):
if blob is None:
return {}
count, index = read_int(blob, 0)
restored = {}
for _ in range(count):
key_len, index = read_int(blob, index)
key = blob[index:index + key_len].decode('utf-8')
index += key_len
value_len, index = read_int(blob, index)
value = blob[index:index + value_len].decode('utf-8')
index += value_len
restored[key] = value
return restored
store = {}
medium = None
result = []
for operation in operations:
command = operation[0]
if command == 'set':
_, key, value = operation
store[key] = value
elif command == 'get':
_, key = operation
result.append(store.get(key))
elif command == 'shutdown':
medium = serialize(store)
store = {}
elif command == 'restore':
store = deserialize(medium)
return resultTime complexity: O(q + B), where q is the number of operations and B is the total number of bytes serialized/deserialized across all shutdown and restore calls. Space complexity: O(S), where S is the size of the current in-memory store plus the latest saved snapshot.
Hints
- A plain delimiter like ':' is unsafe because keys or values can contain it. Prefix each encoded string with its byte length instead.
- Treat shutdown as writing a full snapshot and clearing memory. restore should replace the current in-memory map with the last saved snapshot, not merge with it.