Design a Scalable Streaming Median Service Under Memory Constraints
Context
You are building a service that consumes a very large or unbounded stream of numeric values and must continuously report:
-
The median (exact if feasible, otherwise approximate), and
-
A "loose-median" interval: a value range [L, R] guaranteed to contain the true median.
The classic two-heap (max-heap for lower half, min-heap for upper half) approach requires storing all seen values and therefore exceeds memory when the stream is very large.
Tasks
-
Identify the bottlenecks of the in-memory two-heap approach for very large/unbounded streams.
-
Propose scalable designs that continue producing exact or approximate medians and the loose-median interval under memory constraints. Consider:
-
External-memory (disk-backed) algorithms
-
Bucketization/histograms
-
Quantile sketches (e.g., GK, KLL, t-digest)
-
Distributed aggregation
-
Windowed (e.g., sliding/tumbling) processing
-
Explain the trade-offs across accuracy, latency, memory, and I/O for each option.
-
Outline failure handling and back-pressure strategies suitable for a production service.
State any minimal assumptions you make (e.g., numeric domain, tolerance for approximation, latency SLOs).