This question evaluates skills in distributed systems design, concurrency control, rate-limiting algorithms (token bucket and sliding window), distributed coordination and consistency, API design, and operational monitoring for scalability and fault tolerance.
You are designing a rate limiter for an API gateway that serves high-QPS traffic. The limiter should cap requests at a configured rate (QPS) with an optional burst allowance, operate correctly under concurrency, and scale across multiple application instances.
Provide code-like pseudocode where helpful.
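As a starting point, here is a minimal single-process token bucket sketch in Python. It is one possible shape for the core check, not a complete answer. The class name `TokenBucket`, the method `allow`, and the parameters `rate`, `burst`, `cost`, and `clock` are all illustrative assumptions. A lock protects the refill-and-consume step so that concurrent callers cannot overspend the bucket.

```python
import threading
import time


class TokenBucket:
    """Thread-safe token bucket: refills `rate` tokens/second, holds at most `burst`."""

    def __init__(self, rate, burst, clock=time.monotonic):
        if rate <= 0 or burst < 1:
            raise ValueError("rate must be > 0 and burst must be >= 1")
        self.rate = float(rate)
        self.burst = float(burst)
        self._clock = clock            # injectable clock (assumption, eases testing)
        self._tokens = float(burst)    # start full so an initial burst is allowed
        self._last = clock()
        self._lock = threading.Lock()

    def allow(self, cost=1.0):
        """Consume `cost` tokens and return True if available, else return False."""
        with self._lock:
            now = self._clock()
            elapsed = max(0.0, now - self._last)
            self._last = now
            # Lazy refill: credit tokens for the elapsed time, capped at burst.
            self._tokens = min(self.burst, self._tokens + elapsed * self.rate)
            if self._tokens >= cost:
                self._tokens -= cost
                return True
            return False


if __name__ == "__main__":
    limiter = TokenBucket(rate=5, burst=10)
    accepted = sum(limiter.allow() for _ in range(20))
    print(f"accepted {accepted} of 20 back-to-back requests")
```

This sketch only limits traffic within one process. To scale across multiple application instances, the same refill-and-consume step would have to run atomically against shared state (for example, a script executed inside a shared store), or each instance would enforce a share of the global rate. Which approach fits depends on how much accuracy is needed versus how much added latency is acceptable.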