Top-K Queries And Streaming Aggregation
Asked of: Software Engineer
Last updated

What's being tested
This tests streaming aggregation and Top-K query design: maintaining counts, sums, balances, or revenues while processing events incrementally. Interviewers are probing whether you can choose between full sort, heap, ordered set, hash map aggregation, and time-window indexing while preserving correctness under ties, updates, and edge cases.
Patterns & templates
-
Hash map aggregation — use
dict[key] += valuefor counts, revenue, balances, or per-restaurant totals;O(n)build,O(m)space. -
Top-K via min-heap — maintain heap of size
k;O(n log k)time, better than fullO(n log n)sorting whenk << n. -
Top-K via sorting — aggregate first, then
sorted(items, key=(-metric, tie_breaker))[:k]; simple, deterministic, acceptable for moderatem. -
Ordered ranking with ties — define comparator explicitly: higher metric first, then lexicographic ID, timestamp, or insertion order; avoid nondeterministic output.
-
Time-window aggregation — filter by
start <= ts < end, or maintain deque/prefix sums for repeated range queries; watch inclusive/exclusive boundaries. -
Streaming updates — for changing scores, use lazy heap entries
(score, id, version)and discard stale records on pop; avoids expensive heap deletion. -
SQL Top-K template —
GROUP BY entity, computeSUM(...), thenORDER BY metric DESC, entity ASC LIMIT k; useROW_NUMBER()for per-group Top-K.
Common pitfalls
Pitfall: Computing Top-K directly on raw events instead of aggregating by entity first; this ranks orders, not restaurants/users/accounts.
Pitfall: Ignoring tie-breaking;
Coinbase-style coding prompts often expect deterministic output even when metrics are equal.
Pitfall: Using full recomputation for every query when repeated streaming queries require incremental maps, heaps, prefix sums, or window indexes.
Practice these
The practice cards below cover the canonical variants — solve all of them and time yourself.
Featured in interview prep guides
Practice questions
- Implement top-K over a streamCoinbase · Software Engineer · Take-home Project · Medium
- Implement banking ops: transfer, top-k, cashback, mergeCoinbase · Software Engineer · Take-home Project · Medium
- Compute top-K branches by openingsCoinbase · Software Engineer · Take-home Project · Medium
- Process food delivery data queriesCoinbase · Software Engineer · Onsite · Medium
- Analyze food-delivery dataCoinbase · Software Engineer · Onsite · Medium
- Compute prices, distances, and Top-K for ordersCoinbase · Software Engineer · Onsite · Medium
- Compute delivery metrics and top-K queriesCoinbase · Software Engineer · Onsite · Medium
Related concepts
- Streaming Aggregation And Top-K SelectionCoding & Algorithms
- Real-Time Top-K And Streaming AnalyticsSystem Design
- Top-K Ranking And SelectionSystem Design
- Top-K Selection And Order StatisticsCoding & Algorithms
- Top-K Frequency TrackingCoding & Algorithms
- Heaps, Priority Queues, and Top-K SelectionCoding & Algorithms