PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep
|Home/Coding & Algorithms/Amazon

Design LFU cache with distributed extension

Last updated: Mar 29, 2026

Quick Overview

This question evaluates understanding of cache design and frequency-based eviction policies, efficient data structures that support O(1) operations, and techniques for approximate frequency estimation and summarization in distributed streaming environments.

  • medium
  • Amazon
  • Coding & Algorithms
  • Machine Learning Engineer

Design LFU cache with distributed extension

Company: Amazon

Role: Machine Learning Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Onsite

### Problem You are asked to design and implement a data structure that behaves like an in-memory cache with a **Least Frequently Used (LFU)** eviction policy. Then, you will discuss how to extend the idea to a distributed, streaming setting. #### Part A: Single-node LFU cache Design a class `LFUCache` with the following API: - Constructor: `LFUCache(int capacity)` - `capacity` is a positive integer indicating the maximum number of key–value pairs the cache can hold. - Method: `int get(int key)` - If the key exists in the cache, return its value and update its usage frequency. - If the key does not exist, return `-1`. - Method: `void put(int key, int value)` - Insert or update the value of the key. - If inserting a new key would exceed `capacity`, you must evict **one** key according to the LFU policy. Eviction rules: 1. Evict the key with the **lowest access frequency**. 2. If multiple keys share the same (lowest) frequency, evict the one that is **least recently used** among them. **Requirements:** - Aim for **O(1)** amortized time for both `get` and `put` operations. - Clearly describe the data structures you choose and how they support: - Updating a key’s frequency on `get` and `put`. - Finding and evicting the correct key in O(1). You may assume: - `capacity >= 0`. - Keys and values are integers that fit in standard 32-bit types. #### Part B: Distributed / streaming extension Now assume that instead of a single process, you are dealing with a **high-volume data stream** of keys (e.g., user IDs, item IDs, or search queries) spread across multiple machines. The input rate is too high to store all keys and their counts exactly on a single node. You want to extend the LFU idea so that you can: - Continuously ingest a (potentially unbounded) stream of keys across **many machines**. - Approximate the set of **most frequently used keys** (LFU-like behavior) over a recent time window or over all time. - Support queries such as: "What are the current top *K* most frequent keys?" with reasonable accuracy. **Questions for the distributed extension:** 1. What data structures or algorithms would you use on each machine to track local frequencies under memory constraints (e.g., sketches, approximate counting, bounded-size summaries)? 2. How would you periodically **merge** local summaries to obtain a global view of the most frequent keys? 3. How would you handle: - Late or out-of-order events in the stream? - Sliding windows or time-based decay (so that recent events matter more)? 4. What trade-offs do you make between **accuracy**, **memory usage**, and **latency**, and why? You do **not** need to draw full system architecture diagrams, but your answer should make it clear how the LFU concept is adapted to a distributed, streaming environment.

Quick Answer: This question evaluates understanding of cache design and frequency-based eviction policies, efficient data structures that support O(1) operations, and techniques for approximate frequency estimation and summarization in distributed streaming environments.

Related Interview Questions

  • Implement Datacenter Router Commands - Amazon (hard)
  • Replace Delimited Tokens in a String - Amazon (medium)
  • Minimize Circular Redistribution Cost - Amazon (medium)
  • Find the Most Common Visit Pattern - Amazon (hard)
  • Maximize Value Under a Budget - Amazon (medium)
Amazon logo
Amazon
Nov 18, 2025, 12:00 AM
Machine Learning Engineer
Onsite
Coding & Algorithms
2
0

Problem

You are asked to design and implement a data structure that behaves like an in-memory cache with a Least Frequently Used (LFU) eviction policy. Then, you will discuss how to extend the idea to a distributed, streaming setting.

Part A: Single-node LFU cache

Design a class LFUCache with the following API:

  • Constructor: LFUCache(int capacity)
    • capacity is a positive integer indicating the maximum number of key–value pairs the cache can hold.
  • Method: int get(int key)
    • If the key exists in the cache, return its value and update its usage frequency.
    • If the key does not exist, return -1 .
  • Method: void put(int key, int value)
    • Insert or update the value of the key.
    • If inserting a new key would exceed capacity , you must evict one key according to the LFU policy.

Eviction rules:

  1. Evict the key with the lowest access frequency .
  2. If multiple keys share the same (lowest) frequency, evict the one that is least recently used among them.

Requirements:

  • Aim for O(1) amortized time for both get and put operations.
  • Clearly describe the data structures you choose and how they support:
    • Updating a key’s frequency on get and put .
    • Finding and evicting the correct key in O(1).

You may assume:

  • capacity >= 0 .
  • Keys and values are integers that fit in standard 32-bit types.

Part B: Distributed / streaming extension

Now assume that instead of a single process, you are dealing with a high-volume data stream of keys (e.g., user IDs, item IDs, or search queries) spread across multiple machines. The input rate is too high to store all keys and their counts exactly on a single node.

You want to extend the LFU idea so that you can:

  • Continuously ingest a (potentially unbounded) stream of keys across many machines .
  • Approximate the set of most frequently used keys (LFU-like behavior) over a recent time window or over all time.
  • Support queries such as: "What are the current top K most frequent keys?" with reasonable accuracy.

Questions for the distributed extension:

  1. What data structures or algorithms would you use on each machine to track local frequencies under memory constraints (e.g., sketches, approximate counting, bounded-size summaries)?
  2. How would you periodically merge local summaries to obtain a global view of the most frequent keys?
  3. How would you handle:
    • Late or out-of-order events in the stream?
    • Sliding windows or time-based decay (so that recent events matter more)?
  4. What trade-offs do you make between accuracy , memory usage , and latency , and why?

You do not need to draw full system architecture diagrams, but your answer should make it clear how the LFU concept is adapted to a distributed, streaming environment.

Submit Your Answer

Sign in to leave a comment

Loading comments...

Browse More Questions

More Coding & Algorithms•More Amazon•More Machine Learning Engineer•Amazon Machine Learning Engineer•Amazon Coding & Algorithms•Machine Learning Engineer Coding & Algorithms
PracHub

Master your tech interviews with 8,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.