Design streaming stats with sliding window
Company: Akuna Capital
Role: Data Scientist
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Technical Screen
Design a data structure that ingests an integer stream and supports online queries for maximum, mean, and mode.
1) Describe your update and query operations and their time/space complexity.
2) If values are guaranteed to be in the range [1, 1001], propose an exact solution and compute the memory required to maintain the mode.
3) If the value domain is unbounded or memory is constrained, describe how you would approximate the mode, including the accuracy trade-offs.
4) Extend your design to support returning the max, mean, and mode over only the most recent k elements (a sliding window). Explain how you maintain these statistics as the window slides, analyze the complexity, and discuss how you handle ties and empty-window cases.
Quick Answer: This question evaluates algorithm design, data structures, correctness, complexity analysis, edge-case handling, and implementation detail in a realistic interview setting. A strong answer states its assumptions, handles edge cases, explains trade-offs, and shows how to validate the result.
Solution
# Solution Alignment
The prompt asks for an implementation-level answer. The safest way to present it is to define the state, maintain clear invariants, then walk through complexity and tests.
## Problem Restatement
Maintain the maximum, mean, and mode of an integer stream with online updates and queries. Four things are asked for: (1) the update and query operations with their time and space complexity; (2) an exact solution, with its memory footprint for the mode, when values lie in [1, 1001]; (3) an approximate mode, with its accuracy trade-offs, when the domain is unbounded or memory is constrained; and (4) the same three statistics over only the most recent k elements, including how ties and an empty window are handled.
## Recommended Approach
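Start with a brute-force baseline (store the values and recompute each statistic on demand) to confirm correctness, then replace the repeated work with incremental state. For the full stream, a running maximum, a running sum and count, and a hash map of value counts with a tracked best count give O(1) expected updates and O(1) queries. For the most recent k elements, a FIFO queue records which value leaves the window, a monotonic (decreasing) deque maintains the maximum, the running sum is adjusted on eviction, and frequency buckets (count to set of values) keep the mode current even when counts decrease. Write the implementation around these small invariants and test each invariant directly.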
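As a minimal sketch of the full-stream case (no eviction), the class below keeps a running maximum, a running sum and count, and a hash map of counts. The class name `StreamStats`, its method names, the `None` return for an empty stream, and the tie rule (prefer the smaller value) are illustrative assumptions, not something the prompt prescribes.

```python
from typing import Dict, Optional


class StreamStats:
    """Exact max, mean, and mode over every value seen so far."""

    def __init__(self) -> None:
        self.count = 0
        self.total = 0
        self.max_value: Optional[int] = None
        self.freq: Dict[int, int] = {}        # value -> number of occurrences
        self.mode_value: Optional[int] = None
        self.mode_count = 0

    def add(self, x: int) -> None:
        """Ingest one value in O(1) expected time."""
        self.count += 1
        self.total += x
        if self.max_value is None or x > self.max_value:
            self.max_value = x
        c = self.freq.get(x, 0) + 1
        self.freq[x] = c
        # Assumed tie rule: among equally frequent values, keep the smaller.
        if c > self.mode_count or (c == self.mode_count and x < self.mode_value):
            self.mode_value, self.mode_count = x, c

    def maximum(self) -> Optional[int]:
        return self.max_value

    def mean(self) -> Optional[float]:
        return self.total / self.count if self.count else None

    def mode(self) -> Optional[int]:
        return self.mode_value


stats = StreamStats()
for value in [3, 1, 3, 2, 1]:
    stats.add(value)
print(stats.maximum(), stats.mean(), stats.mode())  # 3 2.0 1
```

Because counts only grow when nothing is evicted, comparing the updated count against the tracked best count is enough to keep the mode current. When the values are known to lie in [1, 1001], the dictionary could be swapped for a fixed array of 1001 counters without changing the logic.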
## Correctness
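After every update, each piece of state should match its definition over the current window (or over the whole stream when nothing is evicted): the running sum equals the sum of the stored elements, every stored count equals the number of occurrences of that value, the front of the decreasing deque used for the maximum holds the largest element still in the window, and the tracked highest frequency equals the largest count present. Each query reads its answer directly from that state, so correctness reduces to showing that insertion and eviction both preserve these invariants.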
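One way to make such invariants concrete for the sliding-window part of the prompt is sketched below. The class name `SlidingWindowStats`, the helper `_change_count`, the `None` return for an empty window, and the smallest-value tie rule are all assumptions chosen for illustration.

```python
from collections import deque
from typing import Deque, Dict, Optional, Set, Tuple


class SlidingWindowStats:
    """Max, mean, and mode over the most recent k values."""

    def __init__(self, k: int) -> None:
        if k <= 0:
            raise ValueError("k must be positive")
        self.k = k
        self.seen = 0                               # values ingested so far
        self.window: Deque[int] = deque()           # last <= k values, oldest first
        self.total = 0                              # invariant: sum(self.window)
        self.maxdq: Deque[Tuple[int, int]] = deque()  # (index, value), values strictly decreasing
        self.freq: Dict[int, int] = {}              # value -> count inside the window
        self.buckets: Dict[int, Set[int]] = {}      # count -> values having that count
        self.max_freq = 0                           # invariant: largest key in buckets, or 0

    def _change_count(self, x: int, delta: int) -> None:
        c = self.freq.get(x, 0)
        if c:
            self.buckets[c].discard(x)
            if not self.buckets[c]:
                del self.buckets[c]
        c += delta
        if c:
            self.freq[x] = c
            self.buckets.setdefault(c, set()).add(x)
        else:
            del self.freq[x]
        if delta > 0:
            self.max_freq = max(self.max_freq, c)
        elif self.max_freq not in self.buckets:
            self.max_freq -= 1   # the only top-count value just dropped by one

    def add(self, x: int) -> None:
        i = self.seen
        self.seen += 1
        if len(self.window) == self.k:              # evict the oldest value first
            old = self.window.popleft()
            self.total -= old
            self._change_count(old, -1)
        self.window.append(x)
        self.total += x
        self._change_count(x, +1)
        while self.maxdq and self.maxdq[-1][1] <= x:  # drop dominated values
            self.maxdq.pop()
        self.maxdq.append((i, x))
        while self.maxdq[0][0] <= i - self.k:         # drop expired indices
            self.maxdq.popleft()

    def maximum(self) -> Optional[int]:
        return self.maxdq[0][1] if self.maxdq else None

    def mean(self) -> Optional[float]:
        return self.total / len(self.window) if self.window else None

    def mode(self) -> Optional[int]:
        if not self.window:
            return None
        # Assumed tie rule: smallest of the most frequent values.
        return min(self.buckets[self.max_freq])
```

Updates are amortized O(1) because each value enters and leaves each deque at most once. In this sketch the `min` in `mode` costs time proportional to the number of tied values; returning any member of the top bucket would make the query O(1) at the cost of a less predictable tie rule.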
## Complexity
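State the baseline complexity and the optimized complexity. The brute-force baseline costs O(n) per query over the full stream, or O(k) per query over a window of k elements. The incremental design costs O(1) expected time per update and per query over the full stream, with O(d) space for d distinct values. For values in [1, 1001], the counts fit in a fixed array of 1001 counters: 4,004 bytes with 32-bit counters or 8,008 bytes with 64-bit counters, independent of the stream length as long as no count overflows the counter width. The sliding-window version costs amortized O(1) per update and O(k) space. A counter-based summary such as Misra-Gries with c counters uses O(c) space, at the price of count estimates that can be too low by up to n/(c+1) after n elements.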
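For the unbounded-domain part of the prompt, one commonly used option (named here as a suggestion, not as the expected answer) is the Misra-Gries summary. The sketch below uses the hypothetical function names `misra_gries` and `approx_mode` and a hypothetical `capacity` parameter for the number of counters.

```python
from typing import Dict, Iterable, Optional


def misra_gries(stream: Iterable[int], capacity: int) -> Dict[int, int]:
    """Summarize a stream with at most `capacity` counters.

    Each returned count is a lower bound on the true count and is too low
    by at most n / (capacity + 1), where n is the stream length.
    """
    counters: Dict[int, int] = {}
    for x in stream:
        if x in counters:
            counters[x] += 1
        elif len(counters) < capacity:
            counters[x] = 1
        else:
            # No free counter: decrement all, dropping any that reach zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


def approx_mode(counters: Dict[int, int]) -> Optional[int]:
    """Return the value with the largest estimated count, or None if empty."""
    return max(counters, key=counters.get) if counters else None


summary = misra_gries([7, 7, 1, 7, 2, 7, 3, 7, 4, 7], capacity=2)
print(summary, approx_mode(summary))  # {7: 4} 7
```

In the example the true count of 7 is 6 and the estimate is 4, within the bound of 10/3. The trade-off is that a value is guaranteed to survive only if it occurs more than n/(capacity + 1) times, so when the leading frequencies are close the reported value may not be the true mode. A second pass over the data, where one is possible, can verify the candidate.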
## Edge Cases and Tests
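Cover an empty stream or empty window (return a sentinel such as None rather than dividing by zero), a single element, a window that is not yet full, duplicates of the maximum, the maximum leaving the window, ties for the mode (fix a rule, for example the smallest value, and document it), invalid inputs such as k <= 0, boundary values such as 1 and 1001 in the bounded case, and randomized tests that compare the optimized structure against the brute-force baseline.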
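A brute-force reference makes these cases easy to pin down. The function name `window_stats_bruteforce`, the `(max, mean, mode)` tuple layout, the `None` sentinel, and the smallest-value tie rule below are assumptions for illustration; an optimized structure can be checked against it value by value.

```python
from collections import Counter
from typing import List, Optional, Tuple

Stats = Tuple[Optional[int], Optional[float], Optional[int]]


def window_stats_bruteforce(values: List[int], k: int) -> Stats:
    """Recompute (max, mean, mode) of the last k values from scratch in O(k)."""
    window = values[-k:] if k > 0 else []
    if not window:
        return (None, None, None)           # empty-window sentinel
    counts = Counter(window)
    top = max(counts.values())
    # Assumed tie rule: smallest of the most frequent values.
    mode = min(v for v, c in counts.items() if c == top)
    return (max(window), sum(window) / len(window), mode)


# Edge cases from the list above.
assert window_stats_bruteforce([], 3) == (None, None, None)   # empty window
assert window_stats_bruteforce([5], 3) == (5, 5.0, 5)         # single element, window not full
assert window_stats_bruteforce([1, 2, 2, 1], 4) == (2, 1.5, 1)  # mode tie resolves to the smaller value
assert window_stats_bruteforce([9, 1, 1], 2) == (1, 1.0, 1)   # old maximum has left the window
```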