Design rolling-window conversation score tracker
Company: Decagon
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: easy
Interview Round: Technical Screen
Quick Answer: This question evaluates a candidate's ability to design efficient data structures and algorithms for time-windowed streaming data, including maintaining rolling aggregates, supporting in-window updates, and computing order-statistics such as top-percentile scores.
Constraints
- 1 <= W <= 10^9
- 0 <= len(operations) <= 2 * 10^5
- conversation_id is a unique string for every record operation
- record timestamps are non-decreasing, and all operation time arguments are non-decreasing across the full operations list
- 1 <= score, new_score <= 5
- 0 <= p <= 100
- 1 <= n <= 2 * 10^5
Examples
Input: (10, [['record', 'a', 1, 5], ['record', 'b', 3, 3], ['get_average', 5], ['get_top_percentile_nth_score', 50, 1, 5], ['update', 'b', 4, 5], ['get_average', 5], ['get_top_percentile_nth_score', 100, 2, 5]])
Expected Output: [4.0, 5, True, 4.5, 4]
Explanation: Initially scores 5 and 3 are active, so the average is 4.0 and the top 50% contains only score 5. Updating b from 3 to 4 succeeds, making the average 4.5 and the second-highest score 4.
Input: (5, [['record', 'a', 10, 2], ['record', 'b', 12, 5], ['get_average', 15], ['update', 'a', 4, 15], ['get_top_percentile_nth_score', 100, 1, 15], ['get_average', 18], ['get_top_percentile_nth_score', 100, 1, 18]])
Expected Output: [5.0, False, 5, 0.0, None]
Explanation: At now=15, the active window is (10, 15], so conversation a at timestamp 10 is expired and b remains active. By now=18, b is also expired.
Input: (100, [['record', 'a', 1, 1], ['record', 'b', 2, 2], ['record', 'c', 3, 3], ['record', 'd', 4, 4], ['record', 'e', 5, 5], ['get_top_percentile_nth_score', 40, 2, 5], ['get_top_percentile_nth_score', 40, 3, 5], ['get_top_percentile_nth_score', 1, 1, 5], ['get_average', 5]])
Expected Output: [4, None, 5, 3.0]
Explanation: There are 5 active conversations. Top 40% means ceil(0.4 * 5) = 2 conversations, whose scores are 5 and 4, so the 2nd score is 4 and the 3rd is unavailable. Top 1% still includes ceil(0.05) = 1 conversation.
Input: (10, [['get_average', 0], ['get_top_percentile_nth_score', 50, 1, 0], ['update', 'missing', 5, 0]])
Expected Output: [0.0, None, False]
Explanation: With no recorded conversations, the average is 0.0, top-percentile query returns None, and updating a missing conversation fails.
Input: (3, [['record', 'a', 1, 1], ['record', 'b', 2, 1], ['update', 'a', 5, 3], ['get_top_percentile_nth_score', 100, 1, 3], ['get_average', 3], ['get_average', 4], ['get_top_percentile_nth_score', 100, 1, 4], ['update', 'b', 4, 5], ['get_average', 5]])
Expected Output: [True, 5, 3.0, 1.0, 1, False, 0.0]
Explanation: Updating a to score 5 succeeds while it is active. At now=4, the window is (1, 4], so a at timestamp 1 has expired but b remains. At now=5, b at timestamp 2 is also expired because the lower boundary is strict.
Hints
- Since time never moves backward, keep recorded conversations in timestamp order and expire old conversations from the front before each operation.
- Scores are only in the range 1 to 5, and the top-percentile query returns only a score. Maintaining counts per score can avoid sorting active conversations.