Design O(1) Randomized Multiset
Company: LinkedIn
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: easy
Interview Round: Technical Screen
Quick Answer: This question evaluates data-structure design skills, understanding of hashing and randomization for uniform sampling, handling of duplicate elements, and analysis of average-case time complexity, with an added focus on managing time-to-live (TTL) expirations.
Part 1: Process a Randomized Multiset with Duplicates
Constraints
- 1 <= len(operations) <= 200000
- -10^9 <= val <= 10^9
- 0 <= r <= 10^18
- Your design should achieve O(1) average or amortized work per operation
Examples
Input: [('insert', 1), ('insert', 1), ('insert', 2), ('getRandom', 0), ('remove', 1), ('getRandom', 1), ('remove', 1), ('getRandom', 5)]
Expected Output: [True, False, True, 1, True, 2, True, 2]
Explanation: After the first three inserts, the active array contains two 1s and one 2. Removals delete the most recently inserted active 1 each time.
Input: [('remove', 10), ('getRandom', 7), ('insert', 10), ('remove', 10), ('getRandom', 0)]
Expected Output: [False, None, True, True, None]
Explanation: Edge case: removing from and sampling from an empty multiset should be handled safely.
Input: [('insert', 5), ('getRandom', 3), ('insert', 5), ('remove', 7), ('remove', 5), ('getRandom', 1)]
Expected Output: [True, 5, False, False, True, 5]
Explanation: Removing a value that does not exist returns False. One 5 remains after removing one occurrence.
Input: [('insert', 0), ('insert', -1), ('insert', 0), ('remove', 0), ('getRandom', 2), ('remove', 0), ('getRandom', 4)]
Expected Output: [True, True, False, True, 0, True, -1]
Explanation: Includes duplicates, zero, and a negative value.
Hints
- Keep all active occurrences in a dynamic array so getRandom can access an element by index in O(1).
- Assign each inserted occurrence a unique ID. A per-value stack of IDs plus lazy cleanup lets you remove a duplicate occurrence without scanning.
Part 2: Randomized Multiset with TTL Expiration
Constraints
- 1 <= len(operations) <= 200000
- Timestamps are non-decreasing
- -10^9 <= val <= 10^9
- 0 <= ttl <= 10^9
- 0 <= r <= 10^18
- Expired occurrences must be excluded correctly before every operation
Examples
Input: [('insert', 1, 5, 5), ('insert', 2, 5, 2), ('getRandom', 3, 1), ('getRandom', 4, 0), ('remove', 5, 5), ('getRandom', 5, 0)]
Expected Output: [True, False, 5, 5, True, None]
Explanation: The second 5 expires at time 4, so only one 5 remains active when time reaches 4. After removing the remaining 5 at time 5, the multiset becomes empty.
Input: [('insert', 0, 1, 10), ('insert', 1, 2, 10), ('insert', 2, 1, 10), ('remove', 3, 1), ('getRandom', 3, 1), ('getRandom', 10, 0), ('getRandom', 11, 0)]
Expected Output: [True, True, False, True, 2, 2, None]
Explanation: The manual remove at time 3 deletes the most recently inserted active 1. Later operations show automatic expiration at times 10 and 11.
Input: [('insert', 5, 7, 0), ('getRandom', 5, 0), ('insert', 5, 7, 1), ('getRandom', 5, 3), ('getRandom', 6, 0), ('remove', 6, 7)]
Expected Output: [True, None, True, 7, None, False]
Explanation: Edge case: TTL = 0 means the inserted occurrence is already expired for the next operation at the same timestamp.
Input: [('insert', 1, -1, 2), ('insert', 2, 4, 5), ('getRandom', 2, 5), ('remove', 3, -1), ('getRandom', 6, 0), ('remove', 7, 4)]
Expected Output: [True, True, 4, False, 4, False]
Explanation: The value -1 expires exactly at time 3 before the remove is processed. The value 4 expires exactly at time 7 before its remove.
Hints
- Use a min-heap ordered by expiration time so you can discard expired items without scanning the whole structure.
- Give each inserted occurrence a unique ID. You can lazily leave stale IDs inside per-value stacks and skip them only when they reach the top.