Design time-versioned KV without timestamp argument
Company: Uber
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Onsite
Quick Answer: This question evaluates understanding of time-versioned key-value stores, monotonic timestamp assignment, clock skew handling, concurrent writes, appropriate data structures, and time/space complexity analysis.
Constraints
- 0 <= len(operations) <= 200000
- key and value are non-empty strings; value is never the empty string
- clockTime and wallTime fit in a signed 64-bit integer
- The system clock readings on set operations may be equal to or less than earlier readings
- Only set operations create new versions; get operations do not change the store timestamp
Examples
Input: ([['set', 'a', 'v1', '10'], ['getLatest', 'a'], ['getAtOrBefore', 'a', '10'], ['set', 'a', 'v2', '20'], ['getAtOrBefore', 'a', '15'], ['getAtOrBefore', 'a', '20'], ['getLatest', 'a']],)
Expected Output: ['v1', 'v1', 'v1', 'v2', 'v2']
Explanation: The writes receive timestamps (10, 0) and (20, 0). Queries at wall time 15 still see v1, while wall time 20 sees v2.
Input: ([['set', 'k', 'a', '100'], ['set', 'k', 'b', '100'], ['set', 'k', 'c', '99'], ['getAtOrBefore', 'k', '99'], ['getAtOrBefore', 'k', '100'], ['getAtOrBefore', 'k', '101'], ['getLatest', 'k']],)
Expected Output: ['', 'c', 'c', 'c']
Explanation: The clock repeats and then moves backwards, so assigned timestamps are (100, 0), (100, 1), and (100, 2). Nothing exists at wall time 99, and c is the latest version at wall time 100 or later.
Input: ([['set', 'a', 'x', '5'], ['set', 'b', 'y', '4'], ['getAtOrBefore', 'b', '4'], ['getAtOrBefore', 'b', '5'], ['set', 'a', 'z', '6'], ['getAtOrBefore', 'a', '5'], ['getAtOrBefore', 'a', '6']],)
Expected Output: ['', 'y', 'x', 'z']
Explanation: The second write observes clock time 4, but the previous assigned wall component is 5, so b=y receives timestamp (5, 1). Therefore b is not visible at wall time 4 but is visible at wall time 5.
Input: ([],)
Expected Output: []
Explanation: There are no operations, so there are no get results.
Input: ([['getLatest', 'missing'], ['getAtOrBefore', 'missing', '0'], ['set', 'n', 'old', '-5'], ['getAtOrBefore', 'n', '-6'], ['getAtOrBefore', 'n', '-5'], ['set', 'n', 'newer', '-10'], ['getAtOrBefore', 'n', '-5'], ['getLatest', 'n']],)
Expected Output: ['', '', '', 'old', 'newer', 'newer']
Explanation: Missing keys return ''. The first negative-time write gets (-5, 0). The later write observes -10, so it gets (-5, 1) and becomes the current value at wall time -5.
Solution
def solution(operations):
from bisect import bisect_right
histories = {}
outputs = []
last_wall = None
last_counter = 0
inf_counter = len(operations) + 1
for op in operations:
kind = op[0]
if kind == 'set':
key = op[1]
value = op[2]
clock_time = int(op[3])
if last_wall is None or clock_time > last_wall:
wall = clock_time
counter = 0
else:
wall = last_wall
counter = last_counter + 1
last_wall = wall
last_counter = counter
if key not in histories:
histories[key] = ([], [])
stamps, values = histories[key]
stamps.append((wall, counter))
values.append(value)
elif kind == 'getLatest':
key = op[1]
if key in histories and histories[key][1]:
outputs.append(histories[key][1][-1])
else:
outputs.append('')
elif kind == 'getAtOrBefore':
key = op[1]
wall_time = int(op[2])
if key not in histories:
outputs.append('')
continue
stamps, values = histories[key]
idx = bisect_right(stamps, (wall_time, inf_counter)) - 1
if idx >= 0:
outputs.append(values[idx])
else:
outputs.append('')
else:
raise ValueError('unknown operation type')
return outputsTime complexity: Per operation: set is O(1), getLatest is O(1), and getAtOrBefore is O(log m), where m is the number of versions for that key. Over n operations, the worst case is O(n log n).. Space complexity: O(w), where w is the number of set operations stored as historical versions..
Hints
- Keep every key's versions in timestamp order. Since global assigned timestamps are increasing as operations are processed, each key's history can be appended to directly.
- To handle repeated or backwards clock readings, combine the wall-clock component with a logical counter, then binary search for the upper bound (wallTime, infinity).