Implement a sliding-window hit counter
Company: Databricks
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Technical Screen
Quick Answer: This question evaluates understanding of time-based data structures and algorithms for sliding-window counters, including fixed-size array bucketing, handling timestamp collisions within buckets, refactoring for reuse, and time/space complexity analysis.
Constraints
- 1 <= window_size <= 10^5
- 0 <= len(operations) <= 2 * 10^5
- Timestamps are non-decreasing across all operations
- For every ('get', t, past_seconds), 1 <= past_seconds <= window_size
- There may be multiple hits at the same timestamp
Examples
Input: (5, [('hit', 1), ('hit', 2), ('hit', 2), ('get', 2, 2), ('get', 5, 5), ('hit', 7), ('get', 7, 5), ('get', 7, 1)])
Expected Output: [3, 3, 1, 1]
Explanation: Hits occur at times 1, 2, and 2. At t=2 over the last 2 seconds, all 3 hits count. At t=5 over the last 5 seconds, those same 3 hits still count. Recording a hit at 7 reuses the bucket for timestamp 2, but the old value is stale and is overwritten. At t=7, only the hit at 7 is inside the last 5 seconds and the last 1 second.
Input: (3, [('get', 1, 1), ('hit', 1), ('hit', 1), ('get', 1, 1), ('get', 3, 3), ('hit', 4), ('get', 4, 3)])
Expected Output: [0, 2, 2, 1]
Explanation: The first query happens before any hits, so it returns 0. Two hits are recorded at timestamp 1. They are both counted at t=1 for the last second and at t=3 for the last 3 seconds. After a hit at 4, only that new hit remains inside the range [2, 4].
Input: (4, [('hit', 1), ('hit', 5), ('get', 5, 4), ('hit', 5), ('get', 5, 1), ('get', 6, 2)])
Expected Output: [1, 2, 2]
Explanation: Timestamps 1 and 5 map to the same bucket because 1 % 4 == 5 % 4. The hit at 5 overwrites the stale data from 1. After another hit at 5, the last 1 second contains 2 hits, and at t=6 the last 2 seconds still contain those same 2 hits.
Input: (1, [('get', 10, 1), ('hit', 10), ('get', 10, 1), ('hit', 11), ('get', 11, 1), ('get', 12, 1)])
Expected Output: [0, 1, 1, 0]
Explanation: With a window size of 1, the counter only remembers the current second. The query before any hit returns 0. A hit at 10 is counted at t=10, then overwritten by the hit at 11. By t=12, no hit exists in the last 1 second.
Solution
def solution(window_size, operations):
times = [-1] * window_size
counts = [0] * window_size
results = []
def record_hit(timestamp):
idx = timestamp % window_size
if times[idx] == timestamp:
counts[idx] += 1
else:
times[idx] = timestamp
counts[idx] = 1
def get_hits(current_timestamp, past_seconds):
total = 0
for i in range(window_size):
ts = times[i]
if ts != -1 and 0 <= current_timestamp - ts < past_seconds:
total += counts[i]
return total
for op in operations:
if op[0] == 'hit':
_, timestamp = op
record_hit(timestamp)
elif op[0] == 'get':
_, timestamp, past_seconds = op
results.append(get_hits(timestamp, past_seconds))
else:
raise ValueError('Unknown operation type: {}'.format(op[0]))
return resultsTime complexity: recordHit: O(1), getHits: O(window_size) by scanning the fixed array; if window_size is treated as a fixed constant such as 300, each operation is O(1). Space complexity: O(window_size).
Hints
- Map each timestamp t to a bucket using t % window_size.
- A bucket needs both a stored timestamp and a count. If the stored timestamp is different from the current one, the old count is stale and should be replaced.