Implement a Distributed Rate Limiter
Company: OpenAI
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Technical Screen
Quick Answer: This question evaluates a candidate's ability to design and implement a distributed rate-limiting mechanism, testing knowledge of distributed systems, concurrency control, fault tolerance, state persistence, clock skew handling, degraded-mode behavior, and reconciliation of locally buffered state.
Constraints
- 0 <= limit <= 10^5
- 1 <= windowMs <= 10^9
- 1 <= len(serverOffsets) <= 50
- 0 <= len(operations) <= 2 * 10^5
- Every `serverId` used in `operations` appears in `serverOffsets`
- After converting each `allow` and `reset` event to canonical time (`timestampMs - offsetMs[serverId]`), those canonical times are non-decreasing in the given operation order
Examples
Input: (2, 100, [('s1', 0), ('s2', 10)], [('allow', 's1', 'alice', 100), ('allow', 's2', 'alice', 150), ('allow', 's1', 'alice', 190), ('allow', 's1', 'alice', 201)])
Expected Output: [True, True, False, True]
Explanation: Canonical times are 100, 140, 190, and 201. At 190, both earlier requests are still inside the last 100 ms, so the request is denied. At 201, the request at 100 has expired.
Input: (1, 50, [('s1', 0), ('s2', 0)], [('set_remote', False), ('allow', 's1', 'bob', 10), ('allow', 's2', 'bob', 20), ('set_remote', True), ('allow', 's1', 'bob', 30)])
Expected Output: [True, True, False]
Explanation: During the outage, each server only sees its own local mirror, so both requests are accepted even though the global limit is 1. After reconciliation, the remote store contains both accepted requests, so the request at 30 is denied.
Input: (2, 100, [('s1', 0), ('s2', 0)], [('allow', 's1', 'c', 0), ('set_remote', False), ('reset', 's2', 'c', 30), ('allow', 's2', 'c', 40), ('set_remote', True), ('allow', 's1', 'c', 50), ('allow', 's1', 'c', 60)])
Expected Output: [True, True, True, False]
Explanation: The reset at time 30 clears the earlier request at time 0 during reconciliation. The accepted request at 40 remains, so the request at 50 is allowed and the one at 60 is denied.
Input: (3, 100, [('s1', 5)], [])
Expected Output: []
Explanation: There are no operations, so there are no `allow` decisions to return.
Input: (2, 100, [('s1', 0)], [('set_remote', False), ('allow', 's1', 'x', 10), ('restart', 's1'), ('allow', 's1', 'x', 20), ('set_remote', True), ('allow', 's1', 'x', 30)])
Expected Output: [True, True, False]
Explanation: The restart does not erase the persisted local mirror or journal. Both degraded-mode requests survive the restart, and after reconciliation the later request is denied.
Input: (1, 100, [('s1', 0)], [('allow', 's1', 'edge', 0), ('allow', 's1', 'edge', 100), ('allow', 's1', 'edge', 199)])
Expected Output: [True, True, False]
Explanation: The window is `(t - 100, t]`, so the request at time 0 no longer counts when checking time 100. The request at 199 still sees the request at 100 inside the window, so it is denied.
Hints
- Use a deque per client so you can quickly discard request timestamps that are no longer inside the sliding window.
- Treat degraded mode as journal replay: decisions made locally during an outage are recorded, then merged into the shared store in canonical-time order when the store recovers.