Design Efficient Data Structure for Median Retrieval
Company: Amazon
Role: Data Scientist
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Technical Screen
Quick Answer: This question evaluates understanding of streaming data structures, median maintenance, and algorithmic complexity in the Coding & Algorithms domain by requiring efficient insertions (O(log n)) and constant-time median queries (O(1)).
Constraints
- 1 <= number of operations <= 5 * 10^4
- -10^5 <= num <= 10^5 for each addNum
- findMedian on an empty structure returns null
- addNum must be O(log n); findMedian must be O(1)
Examples
Input: ([["addNum", 1], ["addNum", 2], ["findMedian"], ["addNum", 3], ["findMedian"]],)
Expected Output: [1.5, 2.0]
Explanation: After adding 1 and 2 the two middle values are 1 and 2, so the median is (1+2)/2 = 1.5. After adding 3 the values are [1,2,3] and the single middle value is 2.0.
Input: ([["addNum", 5], ["findMedian"], ["addNum", 5], ["findMedian"], ["addNum", 5], ["findMedian"]],)
Expected Output: [5.0, 5.0, 5.0]
Explanation: All inserted values are equal, so the median is 5.0 regardless of count, covering the duplicate-values case.
Input: ([["findMedian"]],)
Expected Output: [null]
Explanation: Edge case: querying before any number is added returns null because the structure is empty.
Input: ([["addNum", -10], ["addNum", -20], ["addNum", -30], ["findMedian"]],)
Expected Output: [-20.0]
Explanation: Negative values [-10,-20,-30] sorted are [-30,-20,-10]; the single middle value is -20.0, verifying correct handling of negatives.
Input: ([["addNum", 4], ["findMedian"], ["addNum", 1], ["addNum", 7], ["addNum", 2], ["findMedian"], ["addNum", 9], ["findMedian"]],)
Expected Output: [4.0, 3.0, 4.0]
Explanation: Median of [4] is 4.0; of [1,2,4,7] (sorted) is (2+4)/2 = 3.0; of [1,2,4,7,9] is the middle value 4.0 — an interleaved insert/query sequence.
Hints
- Maintain two heaps: a max-heap for the smaller half of the numbers and a min-heap for the larger half.
- After each insert, rebalance so the heaps' sizes differ by at most one — push the offending root across when one heap grows too large.
- If the heaps are equal in size, the median is the average of the two roots; otherwise it is the root of the larger heap. Both are O(1) reads.