Compute 5-minute rolling average load
Company: Databricks
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Technical Screen
Quick Answer: This question evaluates understanding of sliding-window aggregation and efficient data-structure design for time-series event streams, focusing on computing a 5-minute rolling average load while managing performance and memory under high request volume.
Constraints
- `0 <= len(ops) == len(times) <= 2 * 10^5`
- `0 <= times[i] <= 10^9`
- `times[i] <= times[i + 1]` for all valid `i`
- `ops[i]` is either `"record"` or `"avgLoad"`
Examples
Input: (["record","record","record","record","avgLoad","avgLoad"], [10,20,300,301,301,600])
Expected Output: [0.013333333333333334, 0.0033333333333333335]
Explanation: At time 301, all requests at 10, 20, 300, and 301 are inside (1, 301], so the average is 4/300. At time 600, only the request at 301 is inside (300, 600], so the average is 1/300.
Input: (["avgLoad"], [100])
Expected Output: [0.0]
Explanation: No requests have been recorded, so the rolling average is 0.0.
Input: (["record","record","record","avgLoad","avgLoad"], [100,400,400,400,700])
Expected Output: [0.006666666666666667, 0.0]
Explanation: At time 400, the window is (100, 400], so the request at 100 is excluded and the two requests at 400 are counted: 2/300. At time 700, the window is (400, 700], so the requests at 400 are also excluded, giving 0.0.
Input: (["avgLoad","record","record","avgLoad","avgLoad","record","avgLoad"], [5,5,5,5,304,304,305])
Expected Output: [0.0, 0.006666666666666667, 0.006666666666666667, 0.0033333333333333335]
Explanation: Operations at the same timestamp are processed in order. The first query at 5 sees no requests. After two records at 5, the next query at 5 returns 2/300. The query at 304 still includes both requests at 5 because 5 is inside (4, 304]. After recording at 304, the query at 305 excludes time 5 and counts only time 304, giving 1/300.
Input: ([], [])
Expected Output: []
Explanation: There are no operations, so there are no `avgLoad` outputs.
Hints
- Since the window size is fixed at exactly 300 seconds, think about storing counts per second in a fixed-size circular structure instead of keeping all history.
- When you revisit the same bucket index, compare the stored timestamp with the current timestamp to decide whether to reset that bucket or add to it.