Thread-Safe Token-Bucket Rate Limiter
Company: Uber
Role: Software Engineer
Category: Coding & Algorithms
Interview Round: Onsite
Quick Answer: This question evaluates a candidate's ability to design thread-safe concurrent data structures, focusing on the token bucket algorithm for rate limiting. It tests understanding of atomicity, lock contention, and time-based state management, commonly asked to assess practical concurrency and systems programming skills at the coding-interview level.
Constraints
- capacity >= 1
- refill_rate_per_second > 0
- Each operation is [timestamp_seconds, permits] with permits >= 1
- Timestamps are non-decreasing (a monotonic clock)
- Available tokens are conceptually a real number in [0, capacity]; a request succeeds only when available_tokens >= permits
- The bucket starts full (capacity tokens) at the first operation's timestamp
Examples
Input: (5, 1, [[0, 1], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1], [2, 1], [2, 1], [2, 1]])
Expected Output: [True, True, True, True, True, False, True, True, False]
Explanation: The worked example: capacity 5, refill 1/s. Five single-token requests at t=0 drain the full bucket, the sixth is denied. ~2s later, min(5, 0 + 2*1) = 2 tokens have refilled, so two succeed and the third is denied.
Input: (10, 2, [[0, 5], [0, 5], [0, 1], [1, 2], [1, 1]])
Expected Output: [True, True, False, True, False]
Explanation: Multi-permit bursts: two 5-permit requests empty the bucket of 10, the 1-permit request is denied. At t=1, 1*2 = 2 tokens refill, so the 2-permit request succeeds (bucket back to 0) and the following 1-permit request is denied.
Input: (3, 1, [[0, 3], [100, 1], [100, 3]])
Expected Output: [True, True, False]
Explanation: Refill is capped at capacity: after draining 3 tokens at t=0, a 100-second idle gap would add 100 tokens but the bucket clamps to 3. The 1-permit request succeeds (leaving 2), and the 3-permit request is denied because only 2 remain.
Input: (1, 1, [[0, 1], [0, 1], [1, 1]])
Expected Output: [True, False, True]
Explanation: Capacity of 1: the first request takes the only token, the second (same instant) is denied, and after 1 second exactly 1 token refills so the third succeeds.
Input: (5, 1, [])
Expected Output: []
Explanation: Edge case: no operations means no results.
Input: (5, 2, [[0.0, 5], [0.5, 1], [0.5, 1]])
Expected Output: [True, True, False]
Explanation: Fractional-second refill: after draining 5 tokens at t=0, half a second at 2 tokens/s refills exactly 1 token, so one request succeeds and the next is denied.
Input: (2, 5, [[0, 2], [10, 2], [10, 1]])
Expected Output: [True, True, False]
Explanation: Refill rate exceeds capacity: draining 2 tokens then idling 10s would add 50 tokens but the bucket clamps to its capacity of 2. The 2-permit request succeeds (bucket to 0) and the trailing 1-permit request is denied.
Hints
- Keep two pieces of state: the current token count (a float) and the timestamp of the last refill. On each operation compute elapsed = t - last_time.
- Refill lazily: tokens += elapsed * refill_rate_per_second, then clamp to capacity so the bucket never overflows. Update last_time = t.
- Consume atomically: only subtract `permits` (and return True) when tokens >= permits; otherwise leave tokens untouched and return False. In a real multithreaded implementation this check-then-consume must be guarded by a lock or a CAS loop so two threads can't spend the same token.