Problem
You are asked to design and implement a data structure that behaves like an in-memory cache with a Least Frequently Used (LFU) eviction policy. Then, you will discuss how to extend the idea to a distributed, streaming setting.
Part A: Single-node LFU cache
Design a class LFUCache with the following API:

- Constructor: `LFUCache(int capacity)`
  - `capacity` is a non-negative integer indicating the maximum number of key–value pairs the cache can hold.
- Method: `int get(int key)`
  - If the key exists in the cache, return its value and update its usage frequency.
  - If the key does not exist, return `-1`.
- Method: `void put(int key, int value)`
  - Insert or update the value of the key.
  - If inserting a new key would exceed `capacity`, you must evict exactly one key according to the LFU policy.
Eviction rules:

- Evict the key with the lowest access frequency.
- If multiple keys share the same (lowest) frequency, evict the one that is least recently used among them.
Requirements:

- Aim for O(1) amortized time for both `get` and `put` operations.
- Clearly describe the data structures you choose and how they support the following (one possible arrangement is sketched after the assumptions below):
  - Updating a key’s frequency on `get` and `put`.
  - Finding and evicting the correct key in O(1).
You may assume:

- `capacity >= 0`.
- Keys and values are integers that fit in standard 32-bit types.
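One common way to meet the O(1) requirement is to pair a key→value map with per-frequency buckets that preserve recency order, plus a running minimum frequency. The Python sketch below shows one such arrangement; it is not the only valid answer, internal names such as `freq_to_keys` and `min_freq` are illustrative, and it assumes that a `put` on an existing key also counts as a use.

```python
from collections import defaultdict, OrderedDict


class LFUCache:
    """Sketch of an O(1) LFU cache. Ties at the lowest frequency are
    broken by least-recent use (each bucket preserves insertion order)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.key_to_val = {}                           # key -> value
        self.key_to_freq = {}                          # key -> access frequency
        self.freq_to_keys = defaultdict(OrderedDict)   # freq -> keys in LRU order
        self.min_freq = 0                              # lowest frequency currently present

    def _touch(self, key: int) -> None:
        """Move a key from its current frequency bucket to the next one."""
        freq = self.key_to_freq[key]
        del self.freq_to_keys[freq][key]
        if not self.freq_to_keys[freq]:
            del self.freq_to_keys[freq]
            if self.min_freq == freq:
                self.min_freq = freq + 1
        self.key_to_freq[key] = freq + 1
        self.freq_to_keys[freq + 1][key] = None        # appended -> most recently used

    def get(self, key: int) -> int:
        if key not in self.key_to_val:
            return -1
        self._touch(key)
        return self.key_to_val[key]

    def put(self, key: int, value: int) -> None:
        if self.capacity <= 0:
            return
        if key in self.key_to_val:
            self.key_to_val[key] = value
            self._touch(key)
            return
        if len(self.key_to_val) >= self.capacity:
            # Evict the least recently used key among those with the lowest frequency.
            evict_key, _ = self.freq_to_keys[self.min_freq].popitem(last=False)
            if not self.freq_to_keys[self.min_freq]:
                del self.freq_to_keys[self.min_freq]
            del self.key_to_val[evict_key]
            del self.key_to_freq[evict_key]
        self.key_to_val[key] = value
        self.key_to_freq[key] = 1
        self.freq_to_keys[1][key] = None
        self.min_freq = 1
```

Each bucket is an `OrderedDict`, so the least recently used key within the lowest frequency can be popped in O(1), and tracking `min_freq` avoids scanning for the smallest non-empty bucket on eviction.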
Part B: Distributed / streaming extension
Now assume that instead of a single process, you are dealing with a high-volume data stream of keys (e.g., user IDs, item IDs, or search queries) spread across multiple machines. The input rate is too high to store all keys and their counts exactly on a single node.
You want to extend the LFU idea so that you can:

- Continuously ingest a (potentially unbounded) stream of keys across many machines.
- Approximate the set of most frequently used keys (LFU-like behavior) over a recent time window or over all time.
- Support queries such as "What are the current top K most frequent keys?" with reasonable accuracy.
Questions for the distributed extension:

- What data structures or algorithms would you use on each machine to track local frequencies under memory constraints (e.g., sketches, approximate counting, bounded-size summaries)? One per-machine option, including a merge step, is sketched after this list.
- How would you periodically merge local summaries to obtain a global view of the most frequent keys?
- How would you handle:
  - Late or out-of-order events in the stream?
  - Sliding windows or time-based decay (so that recent events matter more)? A windowed variant is sketched at the end of this section.
- What trade-offs do you make between accuracy, memory usage, and latency, and why?
You do not need to draw full system architecture diagrams, but your answer should make it clear how the LFU concept is adapted to a distributed, streaming environment.
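For the sliding-window and time-decay question, one commonly used pattern is to keep a separate bounded summary per time bucket and merge only the most recent buckets at query time, so old traffic ages out without per-key decay bookkeeping. The wrapper below is a minimal sketch that reuses the hypothetical `HeavyHitters` class above; it always writes to the newest bucket, so handling late events would additionally mean routing each event by its event time to an older, still-retained bucket (not shown here).

```python
from collections import deque


class WindowedHeavyHitters:
    """Sliding-window heavy hitters: one HeavyHitters summary per time bucket;
    a query merges only the most recent `window` buckets, so old traffic ages out."""

    def __init__(self, k: int, window: int):
        self.k = k
        self.buckets = deque(maxlen=window)   # oldest buckets fall off automatically
        self.buckets.append(HeavyHitters(k))

    def add(self, key) -> None:
        self.buckets[-1].add(key)             # credit the current bucket

    def advance(self) -> None:
        """Call at each bucket boundary (e.g., once per minute)."""
        self.buckets.append(HeavyHitters(self.k))

    def top(self, n: int):
        merged = HeavyHitters(self.k)
        for bucket in self.buckets:
            merged = merged.merge(bucket)
        return merged.top(n)
```

The bucket length and the number of retained buckets are the knobs that trade memory and query-time merge cost against how sharply old events are forgotten.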