Implement pagination and a time-versioned key-value store
Company: Lyft
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Onsite
Quick Answer: This question evaluates proficiency in data structures, algorithms, and API design for stateful systems, focusing on cursor-based pagination stability and time-versioned key-value storage with efficient time-travel reads, and is categorized under coding & algorithms with overlaps into data structures and systems design.
Part 1: Transaction Pagination with Stable Cursor
Constraints
- 0 <= len(transactions) <= 200000
- 1 <= page_size <= 1000
- Each txn_id is unique
- created_at is an integer epoch timestamp
- cursor is either None or a valid cursor previously produced by the function
Examples
Input: ([('t1', 'u1', 50, 100), ('t3', 'u1', 20, 120), ('t2', 'u1', 70, 120), ('t4', 'u2', 10, 130)], 'u1', 2, None)
Expected Output: {'items': [('t2', 'u1', 70, 120), ('t3', 'u1', 20, 120)], 'next_cursor': '120|t3'}
Explanation: Transactions for u1 are ordered by created_at descending. t2 comes before t3 because they tie on timestamp and txn_id is the tie-breaker.
Input: ([('t1', 'u1', 50, 100), ('t3', 'u1', 20, 120), ('t2', 'u1', 70, 120), ('t4', 'u2', 10, 130), ('t0', 'u1', 99, 200)], 'u1', 2, '120|t3')
Expected Output: {'items': [('t1', 'u1', 50, 100)], 'next_cursor': None}
Explanation: A newer transaction t0 was inserted before the next call, but the cursor resumes after t3, so there is no duplicate and the remaining older item is returned.
Input: ([], 'u1', 3, None)
Expected Output: {'items': [], 'next_cursor': None}
Explanation: Edge case: empty transaction list.
Input: ([('x1', 'u9', 5, 7)], 'u9', 10, None)
Expected Output: {'items': [('x1', 'u9', 5, 7)], 'next_cursor': None}
Explanation: Edge case: a single matching record fits in one page.
Solution
def solution(transactions, user_id, page_size, cursor):
from bisect import bisect_right
if page_size <= 0:
return {'items': [], 'next_cursor': None}
user_txns = [txn for txn in transactions if txn[1] == user_id]
user_txns.sort(key=lambda txn: (-txn[3], txn[0]))
start = 0
if cursor is not None:
created_at_str, last_txn_id = cursor.split('|', 1)
cursor_key = (-int(created_at_str), last_txn_id)
keys = [(-txn[3], txn[0]) for txn in user_txns]
start = bisect_right(keys, cursor_key)
page = user_txns[start:start + page_size]
if not page or start + page_size >= len(user_txns):
next_cursor = None
else:
last = page[-1]
next_cursor = f'{last[3]}|{last[0]}'
return {'items': page, 'next_cursor': next_cursor}
Time complexity: O(m log m) per call, where m is the number of transactions belonging to the requested user. The bisect step is O(log m), but sorting dominates.. Space complexity: O(m) for the filtered user transactions and ordering keys..
Hints
- Offset pagination is fragile when new rows appear. Think about resuming from the last seen record instead.
- Your ordering key is composite: created_at descending, then txn_id ascending. The cursor must carry enough information to resume after exactly one record.
Part 2: Time-Versioned Key-Value Store
Constraints
- 0 <= len(operations) <= 200000
- Keys and values are strings
- Timestamps are integers and are not guaranteed to be sorted
- If the same key is written at the same timestamp more than once, the most recent write wins
Examples
Input: ([('set', 'a', 'x', 5), ('set', 'a', 'y', 10), ('get', 'a', 4), ('get', 'a', 5), ('get', 'a', 7), ('get', 'a', 10), ('get', 'a', 100)],)
Expected Output: [None, 'x', 'x', 'y', 'y']
Explanation: Basic predecessor queries on one key.
Input: ([('set', 'k', 'late', 10), ('set', 'k', 'early', 3), ('set', 'k', 'replace', 10), ('get', 'k', 2), ('get', 'k', 3), ('get', 'k', 9), ('get', 'k', 10)],)
Expected Output: [None, 'early', 'early', 'replace']
Explanation: Out-of-order inserts are allowed, and the second write at timestamp 10 replaces the earlier value.
Input: ([('set', 'a', '1', 1), ('set', 'b', '2', 2), ('set', 'a', '3', 3), ('get', 'b', 2), ('get', 'a', 2), ('get', 'c', 5)],)
Expected Output: ['2', '1', None]
Explanation: Multiple keys are independent, and missing keys should return None.
Input: ([],)
Expected Output: []
Explanation: Edge case: no operations.
Solution
def solution(operations):
MASK = (1 << 64) - 1
def fnv1a(text):
h = 1469598103934665603
for ch in text:
h ^= ord(ch)
h = (h * 1099511628211) & MASK
return h
def mix64(x):
x = (x + 0x9E3779B97F4A7C15) & MASK
x = ((x ^ (x >> 30)) * 0xBF58476D1CE4E5B9) & MASK
x = ((x ^ (x >> 27)) * 0x94D049BB133111EB) & MASK
x ^= x >> 31
return x & MASK
class Node:
__slots__ = ('ts', 'value', 'prio', 'left', 'right')
def __init__(self, ts, value, prio):
self.ts = ts
self.value = value
self.prio = prio
self.left = None
self.right = None
def rotate_right(y):
x = y.left
y.left = x.right
x.right = y
return x
def rotate_left(x):
y = x.right
x.right = y.left
y.left = x
return y
def insert(node, ts, value, prio):
if node is None:
return Node(ts, value, prio)
if ts == node.ts:
node.value = value
return node
if ts < node.ts:
node.left = insert(node.left, ts, value, prio)
if node.left.prio < node.prio:
node = rotate_right(node)
else:
node.right = insert(node.right, ts, value, prio)
if node.right.prio < node.prio:
node = rotate_left(node)
return node
def floor_get(node, ts):
ans = None
cur = node
while cur is not None:
if ts < cur.ts:
cur = cur.left
else:
ans = cur.value
cur = cur.right
return ans
roots = {}
seeds = {}
out = []
for op in operations:
if op[0] == 'set':
_, key, value, ts = op
if key not in seeds:
seeds[key] = fnv1a(key)
prio = mix64(ts ^ seeds[key])
roots[key] = insert(roots.get(key), ts, value, prio)
else:
_, key, ts = op
out.append(floor_get(roots.get(key), ts))
return out
Time complexity: Expected O(log v) per set and O(log v) per get for a key with v stored versions, using a treap per key.. Space complexity: O(total_versions) across all keys..
Hints
- For each key, keep versions ordered by timestamp and answer get with a predecessor/floor search.
- If duplicate timestamps are allowed, do not create two versions with the same timestamp for one key; overwrite the old value.