Implement top-K over a stream
Company: Coinbase
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Take-home Project
Given a high-volume stream of events (e.g., account IDs from new account openings), design and implement a data structure that supports:
(
1) inserting an item,
(
2) returning the current top-K most frequent items, and
(
3) optionally returning the top-K over the last T minutes (sliding window). Discuss algorithm choices (heaps, bucket counting, count–min sketch with heap), time/space complexities, handling ties, changing K at query time, and window expiration. Outline a scalable distributed approach, including partitioning, partial aggregation, and merge semantics.
Quick Answer: This question evaluates a candidate's proficiency in stream-processing algorithms and scalable data structures for frequency estimation and top-K queries, emphasizing analysis of time/space trade-offs, sliding-window semantics, and distributed aggregation.