Implement a rate-limited hit counter
Company: Databricks
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Onsite
Quick Answer: This question evaluates a candidate's understanding of designing efficient time-windowed data structures and algorithmic analysis, including space-time trade-offs and amortized complexity.
Constraints
- 0 <= len(operations) <= 10^5
- len(operations) == len(timestamps)
- operations[i] is either "hit" or "getHits"
- 1 <= timestamps[i] <= 10^9 when the input is non-empty
- timestamps is in non-decreasing order
Examples
Input: (["hit", "hit", "hit", "getHits", "hit", "getHits", "getHits"], [1, 2, 3, 4, 300, 300, 301])
Expected Output: [3, 4, 3]
Explanation: At time 4, the three earlier hits are counted. At time 300, hits at 1, 2, 3, and 300 are all within [1, 300]. At time 301, the hit at 1 expires, so only hits at 2, 3, and 300 remain.
Input: (["hit", "hit", "hit", "getHits", "hit", "getHits"], [1, 1, 1, 1, 300, 300])
Expected Output: [3, 4]
Explanation: Three hits happen at the same second, so getHits(1) returns 3. At time 300, the window is [1, 300], so those three hits plus the hit at 300 are counted.
Input: (["hit", "hit", "getHits", "getHits", "hit", "getHits"], [1, 300, 300, 301, 601, 601])
Expected Output: [2, 1, 1]
Explanation: At time 300, both hits at 1 and 300 are included. At time 301, the hit at 1 is outside [2, 301]. At time 601, the hit at 300 has also expired, leaving only the hit at 601.
Input: (["getHits", "hit", "getHits"], [10, 10, 10])
Expected Output: [0, 1]
Explanation: Before any hit is recorded, getHits returns 0. After recording one hit at time 10, getHits(10) returns 1.
Input: ([], [])
Expected Output: []
Explanation: With no operations, there are no getHits results to return.
Solution
def solution(operations, timestamps):
from collections import deque
if len(operations) != len(timestamps):
raise ValueError('operations and timestamps must have the same length')
window = deque() # (timestamp, count_at_that_second)
total = 0
answers = []
for op, ts in zip(operations, timestamps):
# Keep only hits in [ts - 299, ts]
while window and window[0][0] <= ts - 300:
old_ts, old_count = window.popleft()
total -= old_count
if op == 'hit':
if window and window[-1][0] == ts:
last_ts, last_count = window[-1]
window[-1] = (last_ts, last_count + 1)
else:
window.append((ts, 1))
total += 1
elif op == 'getHits':
answers.append(total)
else:
raise ValueError('invalid operation: ' + str(op))
return answers
Time complexity: O(n) total, where n is the number of operations. Space complexity: O(min(n, 300)).
Hints
- Because timestamps never decrease, expired hits will always be removed from the oldest side first.
- Instead of storing every hit separately, group hits by timestamp and keep a running total of hits currently in the 300-second window.