Compute last-5-minute QPS in memory
Company: Databricks
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Technical Screen
Quick Answer: This question evaluates design of in-memory data structures and sliding-window time-series aggregation for real-time metrics, testing understanding of space/time trade-offs, streaming algorithms, and handling monotonic timestamps.
Part 1: Online In-Memory QPS Tracker
Constraints
- 0 <= len(operations) <= 200000
- 0 <= timestamp <= 10^9
- Timestamps are non-decreasing across all operations
- Each 'record' operation represents exactly one request
- QPS(t) = count of requests in (t-300, t] / 300
Examples
Input: [('record', 1), ('record', 2), ('record', 300), ('getQPS', 300), ('getQPS', 301)]
Expected Output: [0.01, 0.00667]
Explanation: At t=300, requests in (0,300] are 1,2,300 => 3/300 = 0.01. At t=301, request 1 falls out, so 2/300 = 0.00667.
Input: [('record', 100), ('record', 100), ('record', 400), ('getQPS', 400), ('getQPS', 401)]
Expected Output: [0.00333, 0.00333]
Explanation: At t=400, timestamps equal to 100 are excluded because the window is open on the left: (100,400]. Only timestamp 400 counts.
Input: [('getQPS', 50), ('getQPS', 350)]
Expected Output: [0.0, 0.0]
Explanation: Edge case: no requests were ever recorded.
Input: [('record', 10), ('record', 10), ('getQPS', 10), ('record', 310), ('getQPS', 310)]
Expected Output: [0.00667, 0.00333]
Explanation: At t=10 there are 2 requests in the window. At t=310, requests at 10 are exactly on the excluded boundary, so only timestamp 310 counts.
Solution
def solution(operations):
from collections import deque
recent = deque()
answer = []
for op, t in operations:
while recent and recent[0] <= t - 300:
recent.popleft()
if op == 'record':
recent.append(t)
elif op == 'getQPS':
answer.append(round(len(recent) / 300.0, 5))
else:
raise ValueError('unknown operation')
return answerTime complexity: O(n) total for n operations, or amortized O(1) per operation. Space complexity: O(k), where k is the number of requests still inside the last 5 minutes.
Hints
- Keep only timestamps that can still belong to the current 5-minute window.
- A deque works well because old requests leave from the front and new requests arrive at the back.
Part 2: Last-5-Minute QPS with a Sliding Window
Constraints
- 0 <= len(requests), len(queries) <= 200000
- 0 <= requests[i], queries[j] <= 10^9
- requests is sorted in non-decreasing order
- queries is sorted in non-decreasing order
- QPS(t) = count of requests in (t-300, t] / 300
Examples
Input: ([1, 2, 300], [300, 301])
Expected Output: [0.01, 0.00667]
Explanation: At t=300 all three requests count. At t=301, timestamp 1 falls outside the window.
Input: ([100, 100, 400], [399, 400, 700])
Expected Output: [0.00667, 0.00333, 0.0]
Explanation: At t=399 both 100s count. At t=400 they are excluded because the left boundary is open. At t=700, timestamp 400 is also excluded.
Input: ([], [1, 300])
Expected Output: [0.0, 0.0]
Explanation: Edge case: there are no requests.
Input: ([5, 5, 5], [])
Expected Output: []
Explanation: Edge case: there are no queries.
Solution
def solution(requests, queries):
n = len(requests)
left = 0
right = 0
answer = []
for t in queries:
while right < n and requests[right] <= t:
right += 1
while left < right and requests[left] <= t - 300:
left += 1
answer.append(round((right - left) / 300.0, 5))
return answerTime complexity: O(r + q), where r is the number of requests and q is the number of queries. Space complexity: O(1) extra space, excluding the output list.
Hints
- Use one pointer to expand the right side of the window and another to shrink the left side.
- Because queries are sorted, both pointers only move forward.
Part 3: Memory-Optimized QPS Tracker with 300 Buckets
Constraints
- 0 <= len(operations) <= 200000
- 0 <= timestamp <= 10^9
- Timestamps are non-decreasing across all operations
- Use only O(300) extra memory
- QPS(t) = count of requests in (t-300, t] / 300
Examples
Input: [('record', 1), ('record', 2), ('record', 300), ('getQPS', 300), ('getQPS', 301)]
Expected Output: [0.01, 0.00667]
Explanation: This matches the basic definition of QPS over the last 300 seconds.
Input: [('record', 100), ('record', 100), ('record', 400), ('getQPS', 400), ('getQPS', 401)]
Expected Output: [0.00333, 0.00333]
Explanation: The two requests at time 100 are excluded at t=400 because the interval is (100,400].
Input: [('getQPS', 50), ('record', 350), ('getQPS', 350)]
Expected Output: [0.0, 0.00333]
Explanation: Edge case: first query happens before any request is recorded.
Input: [('record', 0), ('record', 299), ('record', 300), ('record', 599), ('getQPS', 599), ('getQPS', 600)]
Expected Output: [0.00667, 0.00333]
Explanation: At t=599, timestamps 300 and 599 count. At t=600, timestamp 300 is exactly on the excluded left boundary.
Solution
def solution(operations):
times = [-1] * 300
counts = [0] * 300
answer = []
for op, t in operations:
if op == 'record':
idx = t % 300
if times[idx] != t:
times[idx] = t
counts[idx] = 1
else:
counts[idx] += 1
elif op == 'getQPS':
total = 0
for i in range(300):
if times[i] != -1 and 0 <= t - times[i] < 300:
total += counts[i]
answer.append(round(total / 300.0, 5))
else:
raise ValueError('unknown operation')
return answerTime complexity: O(1) for each record and O(300) for each getQPS, which is O(1) because 300 is a fixed constant. Space complexity: O(300).
Hints
- Since timestamps are whole seconds, at most 300 distinct second-values can matter for any query.
- Use index = timestamp % 300, but also store the absolute timestamp in each bucket so you can tell whether a bucket is stale.