Design LFU cache with distributed extension
Company: Amazon
Role: Machine Learning Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Onsite
Quick Answer: This question evaluates understanding of cache design and frequency-based eviction policies, efficient data structures that support O(1) operations, and techniques for approximate frequency estimation and summarization in distributed streaming environments.
Constraints
- 0 <= capacity
- Keys and values fit in 32-bit signed integers.
- Each operation is either ["put", key, value] or ["get", key].
- On eviction, remove the least frequently used key; break ties by least recently used.
- If capacity == 0, every put is a no-op and every get returns -1.
Examples
Input: (2, [["put", 1, 1], ["put", 2, 2], ["get", 1], ["put", 3, 3], ["get", 2], ["get", 3], ["put", 4, 4], ["get", 1], ["get", 3], ["get", 4]])
Expected Output: [None, None, 1, None, -1, 3, None, -1, 3, 4]
Explanation: Classic LeetCode trace. After get(1), key 1 has freq 2 and key 2 has freq 1, so put(3) evicts key 2. Later both keys 1 and 3 have freq 2; key 1 was accessed less recently, so put(4) evicts key 1.
Input: (0, [["put", 0, 0], ["get", 0]])
Expected Output: [None, -1]
Explanation: Zero capacity: the put stores nothing and the get returns -1.
Input: (1, [["put", 1, 1], ["get", 1], ["put", 2, 2], ["get", 1], ["get", 2]])
Expected Output: [None, 1, None, -1, 2]
Explanation: Capacity 1: inserting key 2 evicts key 1 (the only candidate) regardless of key 1's higher frequency, since the cache can hold only one entry.
Input: (2, [["put", 1, 1], ["put", 2, 2], ["get", 1], ["get", 1], ["get", 2], ["put", 3, 3], ["get", 1], ["get", 2], ["get", 3]])
Expected Output: [None, None, 1, 1, 2, None, 1, -1, 3]
Explanation: Key 1 reaches freq 3, key 2 reaches freq 2. put(3) evicts the lowest-frequency key, which is key 2.
Input: (3, [["put", 1, 10], ["put", 1, 20], ["get", 1], ["put", 2, 2], ["put", 3, 3], ["put", 4, 4], ["get", 2], ["get", 3], ["get", 4], ["get", 1]])
Expected Output: [None, None, 20, None, None, None, -1, 3, 4, 20]
Explanation: put on an existing key updates its value (10 -> 20) and bumps its frequency. Key 1 ends at freq 3; keys 2,3 at freq 1. The full-cache put(4) evicts the least-recently-used among the freq-1 keys, which is key 2.
Solution
def lfuCache(capacity, operations):
from collections import defaultdict, OrderedDict
class LFUCache:
def __init__(self, capacity):
self.capacity = capacity
self.key_to_val = {}
self.key_to_freq = {}
self.freq_to_keys = defaultdict(OrderedDict)
self.min_freq = 0
def _touch(self, key):
freq = self.key_to_freq[key]
del self.freq_to_keys[freq][key]
if not self.freq_to_keys[freq]:
del self.freq_to_keys[freq]
if self.min_freq == freq:
self.min_freq += 1
self.key_to_freq[key] = freq + 1
self.freq_to_keys[freq + 1][key] = None
def get(self, key):
if key not in self.key_to_val:
return -1
self._touch(key)
return self.key_to_val[key]
def put(self, key, value):
if self.capacity <= 0:
return
if key in self.key_to_val:
self.key_to_val[key] = value
self._touch(key)
return
if len(self.key_to_val) >= self.capacity:
evict_key, _ = self.freq_to_keys[self.min_freq].popitem(last=False)
del self.key_to_val[evict_key]
del self.key_to_freq[evict_key]
self.key_to_val[key] = value
self.key_to_freq[key] = 1
self.freq_to_keys[1][key] = None
self.min_freq = 1
cache = LFUCache(capacity)
result = []
for op in operations:
name = op[0]
if name == "get":
result.append(cache.get(op[1]))
elif name == "put":
cache.put(op[1], op[2])
result.append(None)
return resultTime complexity: O(1) average per get/put operation; O(M) total for M operations.. Space complexity: O(capacity) for the stored keys, values, frequency map, and per-frequency ordered key lists..
Hints
- Maintain three maps: key -> value, key -> frequency, and frequency -> ordered collection of keys at that frequency (insertion-ordered so the front is the least recently used).
- Track a running min_freq. When you bump a key's frequency, remove it from its old frequency bucket; if that bucket was the min_freq bucket and becomes empty, increment min_freq.
- On insertion into a full cache, evict the front (oldest) key of the min_freq bucket, then set min_freq back to 1 for the newly inserted key.
- An OrderedDict (or a doubly-linked list per frequency) gives O(1) insertion, removal, and oldest-key eviction, which is what keeps both operations O(1).