Build a time-based key-value store
Company: Microsoft
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Technical Screen
Quick Answer: This question evaluates understanding of time-versioned data structures and efficient temporal lookup within associative mappings, focusing on storing and retrieving multiple values per key across timestamps.
Constraints
- 0 <= len(ops) == len(keys) == len(values) == len(timestamps) <= 200000
- ops[i] is either "set" or "get"
- 1 <= timestamps[i] <= 10^9
- key and value are strings
- For any fixed key, timestamps for its set operations are strictly increasing
Examples
Input: (['set', 'get', 'get', 'set', 'get', 'get'], ['foo', 'foo', 'foo', 'foo', 'foo', 'foo'], ['bar', '', '', 'bar2', '', ''], [1, 1, 3, 4, 4, 5])
Expected Output: ['bar', 'bar', 'bar2', 'bar2']
Explanation: The key 'foo' has values 'bar' at time 1 and 'bar2' at time 4. Each get returns the most recent value at or before the requested timestamp.
Input: (['get', 'set', 'get', 'get'], ['a', 'a', 'a', 'b'], ['', 'x', '', ''], [1, 5, 4, 2])
Expected Output: ['', '', '']
Explanation: The first get asks for a key that does not exist yet. After setting 'a' at time 5, querying at time 4 still returns an empty string because no timestamp <= 4 exists. Key 'b' was never set.
Input: (['set', 'set', 'set', 'get', 'get', 'get', 'get'], ['love', 'love', 'hate', 'love', 'love', 'hate', 'hate'], ['high', 'low', 'hot', '', '', '', ''], [10, 20, 5, 15, 25, 5, 4])
Expected Output: ['high', 'low', 'hot', '']
Explanation: This checks multiple keys. 'love' returns 'high' at time 15 and 'low' at time 25. 'hate' returns 'hot' at time 5 and empty at time 4.
Input: (['set', 'set', 'get', 'get', 'get'], ['k', 'k', 'k', 'k', 'k'], ['v1', 'v2', '', '', ''], [2, 8, 1, 2, 7])
Expected Output: ['', 'v1', 'v1']
Explanation: This checks boundary behavior around exact and in-between timestamps. Before time 2 there is no value, at time 2 the answer is 'v1', and at time 7 the most recent earlier value is still 'v1'.
Input: ([], [], [], [])
Expected Output: []
Explanation: With no operations, there are no get results to return.
Solution
def solution(ops, keys, values, timestamps):
store = {}
result = []
for op, key, value, ts in zip(ops, keys, values, timestamps):
if op == 'set':
if key not in store:
store[key] = [[], []] # timestamps list, values list
store[key][0].append(ts)
store[key][1].append(value)
else: # get
if key not in store:
result.append('')
continue
ts_list, val_list = store[key]
lo, hi = 0, len(ts_list)
while lo < hi:
mid = (lo + hi) // 2
if ts_list[mid] <= ts:
lo = mid + 1
else:
hi = mid
idx = lo - 1
if idx >= 0:
result.append(val_list[idx])
else:
result.append('')
return resultTime complexity: O(n + g log m), where n is the number of operations, g is the number of get operations, and m is the number of stored timestamps for a queried key. Space complexity: O(s), where s is the number of set operations stored.
Hints
- Store all timestamps and values for each key together so you can search only within that key's history.
- Because timestamps for each key are already sorted, use binary search to find the rightmost timestamp less than or equal to the query time.