Implement a rate limiter at scale
Company: Microsoft
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Onsite
## Problem
Design and implement a **rate limiter** that enforces request limits per key (e.g., per user ID or API key).
### Requirements
- Implement `allow(key, timestamp)` → returns `true` if the request is allowed, else `false`.
- Support a policy like: **at most N requests per sliding window of W seconds** (you may choose fixed-window vs sliding-window, but state it clearly).
- Handle many distinct keys.
### Follow-up (scale)
How would you adapt the design/implementation if the service must handle **100K QPS**?
Discuss:
- Data structures and time/space complexity
- Concurrency/thread safety
- Distributed deployment (multiple instances) and correctness trade-offs
- Hot-key mitigation and storage choices
### Constraints (assumptions you may make explicit)
- Timestamps are in milliseconds or seconds (choose one).
- Keys are strings.
- Memory is limited; old state should expire.
Quick Answer: This question evaluates understanding of algorithmic and system-level design for per-key rate limiting, including data structures, time/space complexity, concurrency, hot-key mitigation, and distributed deployment in the Coding & Algorithms domain.
Simulate allow(key,timestamp) for a per-key sliding window with at most limit requests in the previous window seconds.
Constraints
- The active window is (timestamp-window, timestamp].
- Requests are processed in nondecreasing timestamp order per key.
Examples
Input: (2, 10, [["u",1],["u",2],["u",3],["u",12]])
Expected Output: [True, True, False, True]
Explanation: Only two requests per sliding window are allowed.
Input: (1, 5, [["a",1],["b",1],["a",6]])
Expected Output: [True, True, True]
Explanation: Keys have independent state and old timestamps expire.
Hints
- Store recent timestamps in a deque per key.
- Drop timestamps outside the current sliding window before deciding.