Compute top-K branches by openings
Company: Coinbase
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Take-home Project
Given a stream of account-opening events (branch_id, user_id, timestamp), design algorithms and data structures to compute the top-K branches by number of openings for
(a) all time and
(b) a sliding window of the last T minutes. Support event ingestion in near real time, queries at any time, and updates in O(log K) or better. Address late or out-of-order events, memory usage, and trade-offs between exact and approximate methods (e.g., heaps + hash maps vs. sketches). Extend your design to a distributed environment and analyze time and space complexity.
Quick Answer: This question tests streaming algorithm design: keep per-branch counts in a hash map, rank with a size-K heap, and expire old events for the sliding window. A strong answer states its assumptions (event time vs. arrival time, tie-breaking), handles late and out-of-order events, explains the exact vs. approximate trade-off, and shows how to validate the result.
Solution
# Solution Alignment
The prompt asks for an implementation-level answer. The safest way to present it is to define the state, maintain clear invariants, then walk through complexity and tests.
## Problem Restatement
Input is a stream of account-opening events (branch_id, user_id, timestamp). The system must return the top-K branches by number of openings, both for all time and for a sliding window of the last T minutes, while ingesting events in near real time and answering queries at any moment. Updates should cost O(log K) or better. The design must also cover late or out-of-order events, memory usage, exact vs. approximate methods (heaps + hash maps vs. sketches), a distributed deployment, and time and space complexity.
## Recommended Approach
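For the all-time ranking, keep a hash map from branch_id to count and select the top K at query time with a size-K min-heap, or with quickselect followed by sorting the K selected entries. For the sliding window, keep per-branch counts in a hash map and expire old events with either a heap that uses lazy deletion or time buckets; a bucketed frequency structure (branches grouped by count) keeps each update near O(1). One way to handle a late or out-of-order event is to apply it to the count for its own event time if that time is still inside the window, and otherwise to count it only toward the all-time total. Define deterministic tie-breaking, for example higher count first, then smaller branch_id.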
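The sliding-window part of this approach can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: it uses per-minute time buckets instead of a lazy-deletion heap, measures the window against the latest event time seen (a watermark) rather than the wall clock, assumes non-negative timestamps in seconds, and drops late events from the window once their minute has expired. The class name `SlidingTopK` and its methods are hypothetical.

```python
import heapq
from collections import defaultdict


class SlidingTopK:
    """Exact top-K over the last `window_minutes`, at one-minute granularity."""

    def __init__(self, window_minutes):
        self.window = window_minutes
        self.watermark = -1                      # latest event minute seen so far
        self.minutes = []                        # min-heap of minutes that have a bucket
        self.buckets = {}                        # minute -> {branch_id: count}
        self.window_counts = defaultdict(int)    # branch_id -> count inside the window
        self.all_time_counts = defaultdict(int)  # branch_id -> count over all time

    def ingest(self, branch_id, ts_seconds):
        """Record one opening; return False if it was too late for the window."""
        self.all_time_counts[branch_id] += 1     # late events still count for all time
        minute = ts_seconds // 60
        if minute > self.watermark:
            self.watermark = minute
            self._evict()
        if minute <= self.watermark - self.window:
            return False                         # event time already left the window
        if minute not in self.buckets:
            self.buckets[minute] = defaultdict(int)
            heapq.heappush(self.minutes, minute)
        self.buckets[minute][branch_id] += 1
        self.window_counts[branch_id] += 1
        return True

    def _evict(self):
        """Subtract every bucket whose minute is outside (watermark - window, watermark]."""
        cutoff = self.watermark - self.window
        while self.minutes and self.minutes[0] <= cutoff:
            expired = self.buckets.pop(heapq.heappop(self.minutes))
            for branch_id, count in expired.items():
                self.window_counts[branch_id] -= count
                if self.window_counts[branch_id] == 0:
                    del self.window_counts[branch_id]

    def top_k(self, k, all_time=False):
        """Rank by count descending, then branch_id ascending, for stable ties."""
        counts = self.all_time_counts if all_time else self.window_counts
        return heapq.nsmallest(k, counts.items(), key=lambda kv: (-kv[1], kv[0]))
```

In this sketch an ingest is O(1) expected, plus O(log T) when a new minute bucket is created, and a query scans the distinct branches currently counted. Ranking at query time keeps updates cheap; if queries are far more frequent than events, a maintained ordered structure may be the better trade.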
## Correctness
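The key invariant is that, after every ingest or eviction, the stored count for each branch equals the number of accepted events for that branch whose timestamps fall in the relevant range (all time, or the current window). With lazy deletion, a heap entry is trusted only if its count matches the hash map; stale entries are discarded when they reach the top. Because a query ranks branches by these counts under a total order (count, then the tie-breaker), it considers every candidate branch exactly once and the K entries it returns are exactly the top K.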
## Complexity
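Let n be the number of distinct branches and k the result size. One-time heap: O(n log k) time and O(k) extra space. Quickselect: expected O(n) plus O(k log k) to order the output. Streaming: a hash-map increment is O(1) expected per event, and window eviction is amortized O(1) per event because each event is expired at most once. A lazy-deletion heap adds O(log h) per push, where h counts stale entries too, so the heap needs periodic rebuilding to stay bounded. Exact counting uses O(n) memory for all time; the window needs memory proportional to the distinct (bucket, branch) pairs it holds.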
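The O(n log k) bound for the one-time heap comes from keeping only k entries in a min-heap whose root is the worst entry kept so far. A minimal sketch, assuming counts are already aggregated into a dict and that ties break toward the smaller branch_id; `_Entry` and `top_k_by_heap` are hypothetical names chosen for illustration.

```python
import heapq


class _Entry:
    """Heap item ordered so that 'smaller' means 'ranked worse'."""
    __slots__ = ("count", "branch")

    def __init__(self, count, branch):
        self.count = count
        self.branch = branch

    def __lt__(self, other):
        # Fewer openings ranks worse; on equal counts the larger id ranks worse.
        if self.count != other.count:
            return self.count < other.count
        return self.branch > other.branch


def top_k_by_heap(counts, k):
    """Return up to k (branch, count) pairs, best first, in O(n log k) time."""
    if k <= 0:
        return []
    heap = []  # min-heap: heap[0] is the worst entry currently kept
    for branch, count in counts.items():
        entry = _Entry(count, branch)
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif heap[0] < entry:
            heapq.heapreplace(heap, entry)  # evict the worst, keep the newcomer
    return [(e.branch, e.count) for e in sorted(heap, reverse=True)]
```

Each of the n branches costs at most one O(log k) heap operation, and the final sort of at most k entries is O(k log k).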
## Edge Cases and Tests
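k = 0, k > n (more results requested than distinct branches), an empty stream or empty window, ties in count, duplicate or replayed events, late and out-of-order timestamps, events exactly on the window boundary, stale heap entries, and deterministic output ordering.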
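Cases such as k = 0, k > n, and ties are easiest to check against a brute-force reference that recounts from the raw events. The sketch below is one such oracle, under the assumptions that timestamps are non-negative seconds, that the window covers the minutes in (now_minute - window_minutes, now_minute], and that ties break toward the smaller branch_id; `brute_force_top_k` is a hypothetical helper, not part of any library.

```python
from collections import Counter


def brute_force_top_k(events, k, window_minutes=None, now_minute=None):
    """Recount from scratch. events: iterable of (branch_id, user_id, ts_seconds).

    With window_minutes=None every event counts (all-time ranking);
    otherwise only events in the window ending at now_minute count.
    """
    counts = Counter()
    for branch_id, _user_id, ts_seconds in events:
        minute = ts_seconds // 60
        if window_minutes is not None:
            if not (now_minute - window_minutes < minute <= now_minute):
                continue  # outside the window
        counts[branch_id] += 1
    # Count descending, then branch_id ascending, so output order is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:max(k, 0)]


events = [("A", "u1", 0), ("B", "u2", 10), ("A", "u3", 70), ("C", "u4", 130)]
print(brute_force_top_k(events, 2))                                   # all time
print(brute_force_top_k(events, 2, window_minutes=2, now_minute=2))   # last 2 minutes
```

Because the oracle is order-independent, feeding a shuffled copy of the same events to a streaming implementation and comparing results is a simple check for out-of-order handling.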