Design structure for top-K frequent elements

Q: Design structure for top-K frequent elements

This question evaluates a candidate's competency in algorithm and data-structure design, frequency counting and streaming/approximation techniques, and the ability to reason about time/space complexity and workload-driven trade-offs.

Q: How do I approach Coding & Algorithms interview questions?

Coding & Algorithms questions require understanding of core concepts and practice. PracHub provides solutions with explanations to help you master coding & algorithms interviews.

Question

You are working with a large collection of items represented by integer IDs (e.g., product IDs, user IDs, etc.). Updates and queries arrive over time.

You need to support the following operations:

add(x) : record one occurrence of item x (increment its frequency count).
topK(k) : return any ordering of the k most frequent items seen so far, along with their frequencies.

Design data structures and algorithms to support these operations, and discuss different approaches under two workload patterns:

Read-heavy workload
- The system receives relatively few add(x) updates but many topK(k) queries.
- Example: A daily batch job loads all events once, and then the UI serves thousands of top-K queries per minute.
Write-heavy workload
- The system receives many add(x) updates but relatively few topK(k) queries.
- Example: A logging system where events stream in continuously, and an analyst occasionally queries the current top-K items.

Requirements

For each workload type (read-heavy and write-heavy):
1. Propose suitable data structures to implement add(x) and topK(k) .
2. Analyze the time complexity of each operation ( add and topK ).
3. Analyze the space complexity.
4. Explain the trade-offs and why your design is better suited to that workload pattern.
You may assume that:
- The domain of x (possible IDs) is large.
- k is much smaller than the total number of distinct items.
- The system should scale to millions of updates.

Example

Given the sequence of operations:

add(1)
add(2)
add(1)
add(3)
add(2)
add(1)

A call to topK(2) should return items [1, 2] (since 1 appeared 3 times, 2 appeared 2 times, 3 appeared 1 time).

Task

Describe, in detail:

At least one design optimized for read-heavy scenarios.
At least one design optimized for write-heavy scenarios.
How you would adapt or approximate the solution if the data is a never-ending stream and you must use limited memory (high level is sufficient).

Implementation (code) is optional; focus primarily on explaining data structures, algorithms, and their trade-offs.

Design structure for top-K frequent elements

Overview

Requirements

Example

Task

Comments (0)