Implement a CSAT Sliding Window Tracker
Company: Decagon
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: easy
Interview Round: Technical Screen
Quick Answer: This question evaluates the ability to design and implement efficient data structures and algorithms for a sliding time-window tracker, covering time-based expiration, stateful updates, rolling averages with rounding, and nearest-rank percentile computation.
Constraints
- `0 <= window_size <= 10^9`
- `0 <= len(operations) <= 2 * 10^5`
- `1 <= score <= 5`
- Each `conversation_id` appears in at most one `record` operation
- `1 <= p <= 100` for percentile queries
- All operations with a time argument (`record`, `get_average`, `get_percentile`) are given in non-decreasing time order; `record` timestamps are strictly increasing
Examples
Input: (5, [("record", "c1", 5, 1), ("record", "c2", 3, 2), ("get_average", 2), ("get_percentile", 2, 50), ("update", "c1", 1), ("get_average", 2), ("get_percentile", 2, 50), ("get_average", 8), ("update", "c2", 5), ("get_percentile", 8, 90)])
Expected Output: [4.0, 3, 2.0, 1, 0.0, None]
Explanation: At time 2, active scores are [5, 3], so average is 4.0 and the 50th percentile is 3. After updating c1 to 1, active scores are [1, 3]. By time 8 both conversations have expired, so the average is 0.0 and percentile is None. The update to c2 at time 8 does nothing because it is expired.
Input: (0, [("record", "a", 4, 10), ("get_average", 10), ("get_percentile", 10, 100), ("get_average", 11), ("update", "a", 1), ("get_percentile", 11, 1)])
Expected Output: [4.0, 4, 0.0, None]
Explanation: With a window size of 0, a conversation is active only at its exact timestamp. So score 4 is active at time 10, but expired by time 11. The update at time 11 has no effect.
Input: (3, [("record", "x", 2, 1), ("record", "y", 2, 2), ("record", "z", 5, 4), ("get_percentile", 4, 75), ("update", "y", 4), ("get_average", 4), ("get_percentile", 4, 75), ("get_average", 6)])
Expected Output: [5, 3.67, 5, 5.0]
Explanation: At time 4, active scores are [2, 2, 5], so the 75th percentile is 5. After updating y to 4, the scores become [2, 4, 5], giving average 3.67. By time 6, only z remains active, so the average is 5.0.
Input: (4, [("get_average", 1), ("get_percentile", 1, 30)])
Expected Output: [0.0, None]
Explanation: No conversations were recorded, so the average is 0.0 and the percentile query returns None.
Hints
- Because time only moves forward, you can keep conversations in a queue and expire old ones from the front.
- Scores only range from 1 to 5, so percentile queries can be answered from score frequencies instead of sorting every time.