Streaming Aggregation And Top-K Selection
Asked of: Software Engineer
Last updated

What's being tested
This tests streaming aggregation plus top-k selection over changing per-customer totals. You need to aggregate events into a customer_id -> revenue map, then return the k smallest totals efficiently under different read/write patterns.
Patterns & templates
-
Hash-map aggregation — maintain
totals[customer_id] += revenueinO(1)average update time; handle negative, zero, duplicate, and missing revenue carefully. -
Size-k max-heap for least-k — scan aggregates, keep heap of k largest among current smallest;
O(n log k)time,O(k)space. -
Quickselect — partition totals to find kth-smallest in average
O(n)time; useful for one-shot queries but mutates arrays and has worst-caseO(n^2). -
Balanced tree / sorted map — maintain
(revenue, customer_id)ordering with delete+insert per update;O(log n)writes,O(k)reads. -
Two-index design — store both
customer_id -> revenueand ordered(revenue, customer_id)entries; update requires removing stale pair before inserting new pair. -
Read/write tradeoff — heap-on-query favors heavy writes/light reads; maintained ordered set favors frequent least-k reads with moderate update volume.
-
Tie-breaking — define deterministic ordering, usually
(revenue ASC, customer_id ASC), to avoid flaky tests and unstable output.
Common pitfalls
Pitfall: Sorting all customers for every query is
O(n log n)and often fails when k is small or queries are frequent.
Pitfall: Updating revenue in a tree without deleting the old
(revenue, customer_id)entry leaves duplicate stale values.
Pitfall: Using a min-heap of all customers gives easy reads but awkward arbitrary updates unless you support lazy deletion or indexed heap bookkeeping.
Practice these
The practice cards below cover the canonical variants — solve all of them and time yourself.
Featured in interview prep guides
Practice questions
Related concepts
- Top-K Queries And Streaming AggregationCoding & Algorithms
- Real-Time Top-K And Streaming AnalyticsSystem Design
- Top-K Selection, Heaps, And RankingCoding & Algorithms
- Heaps, Top-K, And Streaming SelectionCoding & Algorithms
- Top-K Selection And Order StatisticsCoding & Algorithms
- Heaps, Priority Queues, and Top-K SelectionCoding & Algorithms