Design a temporal key-value store with historical reads
Company: Lyft
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Onsite
Quick Answer: This question evaluates a candidate's grasp of data structures and algorithms needed to implement a temporal key-value store supporting timestamped set/get operations and efficient historical reads, including performance targets like O(log n) per operation.
Constraints
- 0 <= len(operations) <= 10^6
- There may be up to 10^5 distinct keys
- Timestamps are integers and may arrive in any order
- If the same (key, timestamp) is set more than once, the newest value overwrites the previous one
Examples
Input: [('set', 'foo', 'bar', 1), ('get', 'foo', 1), ('get', 'foo', 3), ('set', 'foo', 'bar2', 4), ('get', 'foo', 4), ('get', 'foo', 5)]
Expected Output: ['bar', 'bar', 'bar2', 'bar2']
Explanation: At time 1 and 3, the latest value is 'bar'. After setting timestamp 4, queries at 4 and 5 return 'bar2'.
Input: []
Expected Output: []
Explanation: No operations means there are no get results to return.
Input: [('set', 'a', 'x', 5), ('set', 'a', 'y', 2), ('get', 'a', 1), ('get', 'a', 2), ('get', 'a', 4), ('get', 'a', 5), ('get', 'a', 100)]
Expected Output: ['', 'y', 'y', 'x', 'x']
Explanation: The timestamps were inserted out of order. Historical lookups must still return the nearest timestamp not greater than the query time.
Input: [('set', 'k1', 'v1', 10), ('set', 'k2', 'a', 3), ('set', 'k1', 'v2', 10), ('get', 'k1', 10), ('get', 'k2', 2), ('get', 'k2', 3), ('get', 'k1', 9)]
Expected Output: ['v2', '', 'a', '']
Explanation: Setting ('k1', 10) twice overwrites the old value. Queries before the first timestamp for a key return an empty string.
Input: [('set', 'user1', 'on', 7), ('set', 'user1', 'off', 12), ('set', 'user2', 'red', 5), ('set', 'user1', 'idle', 9), ('get', 'user1', 10), ('get', 'user1', 8), ('get', 'user2', 6), ('get', 'user3', 1)]
Expected Output: ['idle', 'on', 'red', '']
Explanation: For user1, the latest value at time 10 is from timestamp 9, and at time 8 it is from timestamp 7. Missing keys return an empty string.
Hints
- Each get operation is a predecessor query: find the largest stored timestamp <= the requested timestamp for that key.
- A hash map gets you to the correct key quickly, but you still need an ordered structure per key. If updates are out of order, think about a balanced BST or treap rather than re-sorting on every insert.