How do I practice coding and algorithm questions?

Use PracHub's coding console to write, test, and debug your solutions in Python or JavaScript. View hints, test against sample inputs, and compare with official solutions.

What difficulty level is this coding question?

This is a hard difficulty Coding & Algorithms question, commonly asked during Technical Screen rounds at Bloomberg.

What role is this question designed for?

This question is commonly asked for Software Engineer candidates at Bloomberg during technical interviews.

Design a data structure for dynamic top‑K frequency

Company: Bloomberg

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: hard

Interview Round: Technical Screen

Design a data structure that maintains a multiset of integers and supports the following operations efficiently: ### Operations - `insert(x)`: Add one occurrence of value `x`. - `remove(x)`: Remove one occurrence of value `x` if it exists (if `x` is not present, do nothing). - `topK(k) -> List[int]`: Return the **k most frequent** values currently in the multiset. ### Requirements / Clarifications - Frequency means the current count of each value in the multiset. - If fewer than `k` distinct values exist, return all distinct values. - If multiple values have the same frequency, you may return them in any order (or state and implement a deterministic tie-break such as smaller value first). - Discuss time and space complexity for each operation. - Explain why your chosen data structures are appropriate compared with alternatives (e.g., sorting on every query, balanced BST/TreeMap, etc.). - Consider edge cases (removing down to zero, repeated inserts/removes, large `k`, empty structure). - Bonus discussion: how would your approach change if the number of total updates is extremely large (e.g., up to billions)?

Quick Answer: This question evaluates data structure and algorithm design skills, focusing on frequency counting, dynamic maintenance of top-K elements, and efficient update and query operations in a multiset within the Coding & Algorithms domain.

You are given two parallel arrays, `operations` and `values`, that describe actions on an initially empty multiset of integers. Process the actions in order and return the result of every `topK` query. Operations: - `insert(x)`: add one occurrence of integer `x`. - `remove(x)`: remove one occurrence of `x` if it exists; otherwise do nothing. - `topK(k)`: return the `k` most frequent distinct values currently in the multiset. For this problem, `topK(k)` must return values ordered by: 1. higher frequency first 2. if frequencies are equal, smaller value first If fewer than `k` distinct values exist, return all distinct values. A value whose count becomes zero should no longer appear in future results. Your function should return a list containing the answer for each `topK` operation, in the order those queries appear. An efficient solution is required; recomputing and sorting all frequencies on every query will be too slow for the given constraints.

Constraints

1 <= len(operations) == len(values) <= 200000
operations[i] is one of "insert", "remove", or "topK"
-10^9 <= values[i] <= 10^9 for insert/remove operations
1 <= k <= len(operations) for topK operations
Tie-breaking for equal frequencies must be deterministic: smaller value first

Examples

Input: (["insert", "insert", "insert", "insert", "insert", "topK", "remove", "topK"], [5, 7, 5, 7, 5, 2, 5, 2])

Expected Output: [[5, 7], [5, 7]]

Explanation: After five inserts, frequencies are {5: 3, 7: 2}, so top 2 are [5, 7]. After removing one 5, frequencies become {5: 2, 7: 2}; tie is broken by smaller value first, so [5, 7].

Input: (["insert", "insert", "insert", "remove", "topK", "remove", "topK"], [-1, -2, -1, -3, 3, -1, 3])

Expected Output: [[-1, -2], [-2, -1]]

Explanation: Removing -3 does nothing because it is absent. First query sees {-1: 2, -2: 1}. After removing one -1, both values have frequency 1, so smaller value -2 comes first.

Input: (["topK", "insert", "remove", "topK"], [5, 10, 10, 1])

Expected Output: [[], []]

Explanation: The structure is empty for the first query. After inserting and then removing 10, it is empty again for the last query.

Input: (["insert", "insert", "insert", "insert", "topK", "remove", "remove", "topK", "insert", "topK"], [1, 2, 2, 3, 3, 2, 2, 3, 1, 3])

Expected Output: [[2, 1, 3], [1, 3], [1, 3]]

Explanation: Initially frequencies are {1: 1, 2: 2, 3: 1}, so [2, 1, 3]. Removing 2 twice eliminates it, leaving {1: 1, 3: 1}, so [1, 3]. Inserting 1 again gives {1: 2, 3: 1}, so [1, 3].

Input: (["insert", "insert", "insert", "insert", "insert", "topK", "topK", "remove", "topK", "insert", "topK"], [4, 4, 6, 6, 5, 2, 3, 4, 2, 5, 3])

Expected Output: [[4, 6], [4, 6, 5], [6, 4], [5, 6, 4]]

Explanation: Repeated topK queries must preserve the structure. Ties are always resolved by smaller value first, which determines [4, 6] and later [5, 6, 4].

Hints

Use a hash map to store the current frequency of each value. Then think about a data structure that can quickly expose the highest-frequency candidates.
A max-heap with lazy deletion works well: push updated snapshots into the heap, and when answering `topK`, discard heap entries that no longer match the current frequency.

Quick Overview

This question evaluates data structure and algorithm design skills, focusing on frequency counting, dynamic maintenance of top-K elements, and efficient update and query operations in a multiset within the Coding & Algorithms domain.