You are working with a large collection of items represented by integer IDs (e.g., product IDs, user IDs, etc.). Updates and queries arrive over time.
You need to support the following operations:
-
add(x)
: record one occurrence of item
x
(increment its frequency count).
-
topK(k)
: return any ordering of the
k
most frequent items seen so far, along with their frequencies.
Design data structures and algorithms to support these operations, and discuss different approaches under two workload patterns:
-
Read-heavy workload
-
The system receives relatively
few
add(x)
updates but
many
topK(k)
queries.
-
Example: A daily batch job loads all events once, and then the UI serves thousands of top-K queries per minute.
-
Write-heavy workload
-
The system receives
many
add(x)
updates but relatively
few
topK(k)
queries.
-
Example: A logging system where events stream in continuously, and an analyst occasionally queries the current top-K items.
Requirements
-
For each workload type (read-heavy and write-heavy):
-
Propose suitable data structures to implement
add(x)
and
topK(k)
.
-
Analyze the time complexity of each operation (
add
and
topK
).
-
Analyze the space complexity.
-
Explain the trade-offs and why your design is better suited to that workload pattern.
-
You may assume that:
-
The domain of
x
(possible IDs) is large.
-
k
is much smaller than the total number of distinct items.
-
The system should scale to millions of updates.
Example
Given the sequence of operations:
-
add(1)
-
add(2)
-
add(1)
-
add(3)
-
add(2)
-
add(1)
A call to topK(2) should return items [1, 2] (since 1 appeared 3 times, 2 appeared 2 times, 3 appeared 1 time).
Task
Describe, in detail:
-
At least one design optimized for
read-heavy
scenarios.
-
At least one design optimized for
write-heavy
scenarios.
-
How you would adapt or approximate the solution if the data is a never-ending stream and you must use limited memory (high level is sufficient).
Implementation (code) is optional; focus primarily on explaining data structures, algorithms, and their trade-offs.