Design a rate-limiting system for an expensive API.
Requirements
- Each user (or API key) has a monthly quota (e.g., 10,000 calls/month).
- Users can manually configure (update) their monthly quota via a UI/API.
- If a user exceeds the quota, further requests must be rejected (e.g., HTTP 429) until the next monthly reset.
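
To make the three requirements concrete, here is a minimal single-process sketch. The names `QuotaLimiter`, `period_key`, `set_quota`, and `check` are hypothetical, the in-memory dictionaries stand in for whatever shared store a real design would use, and resetting on UTC calendar months is an assumption rather than part of the problem statement.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


def period_key(now: datetime) -> str:
    """Return the quota period as 'YYYY-MM' (assumes UTC calendar months)."""
    return now.astimezone(timezone.utc).strftime("%Y-%m")


@dataclass
class QuotaLimiter:
    default_quota: int = 10_000
    quotas: dict = field(default_factory=dict)    # api_key -> configured monthly quota
    counters: dict = field(default_factory=dict)  # (api_key, period) -> calls used

    def set_quota(self, api_key: str, quota: int) -> None:
        """Manual quota update, as a UI/API handler might call it."""
        self.quotas[api_key] = quota

    def check(self, api_key: str, now: datetime) -> int:
        """Return an HTTP status: 200 if the call is admitted, 429 if over quota."""
        key = (api_key, period_key(now))
        used = self.counters.get(key, 0)
        if used >= self.quotas.get(api_key, self.default_quota):
            return 429
        self.counters[key] = used + 1
        return 200
```

Because the counter is keyed by period, a new month starts from zero without an explicit reset job, and a quota change takes effect on the next call. The read-then-write in `check` is not atomic, so a distributed version would need an atomic increment in the shared store.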
Scale & reliability
- Assume high QPS and multiple stateless API servers; the limiter must work in a distributed environment.
- Discuss correctness goals (strict vs. eventually consistent) and trade-offs.
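
One possible way to explore the strict vs. eventually consistent trade-off is to let each API server lease a batch of calls from a central counter. This is only a sketch of one option: `CentralCounter`, `ServerLimiter`, and the `batch` parameter are hypothetical names, and the central counter here is an in-process stand-in for a shared store with an atomic reserve operation.

```python
class CentralCounter:
    """Stand-in for a shared store with an atomic reserve operation (hypothetical)."""

    def __init__(self, quota: int) -> None:
        self.quota = quota
        self.reserved = 0

    def reserve(self, n: int) -> int:
        """Grant up to n calls from the remaining quota; return how many were granted."""
        granted = max(0, min(n, self.quota - self.reserved))
        self.reserved += granted
        return granted


class ServerLimiter:
    """Per-server limiter that spends a locally held lease before asking for more."""

    def __init__(self, central: CentralCounter, batch: int) -> None:
        self.central = central
        self.batch = batch
        self.local = 0  # calls this server may still admit without a round trip

    def allow(self) -> bool:
        if self.local == 0:
            self.local = self.central.reserve(self.batch)
        if self.local == 0:
            return False  # nothing left to lease: caller responds with HTTP 429
        self.local -= 1
        return True
```

In this variant the quota is never exceeded, but a user can be rejected on one server while unused calls sit in another server's lease. With `batch=1` every request goes to the central counter (strict, more round trips); larger batches cut round trips at the cost of up to `batch - 1` stranded calls per server.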
Deliverables
- High-level architecture and request flow
- Data model/storage choices for counters and quota configuration
- How to handle monthly reset, time zones, retries, and idempotency
- Operational concerns: monitoring, failure modes, and abuse cases
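
For the retries and idempotency item, one common approach is to charge the quota once per client-supplied request ID, so that a retried request is not counted twice. The sketch below assumes clients send such an ID; `IdempotentCounter` and `admit` are hypothetical names, and the in-memory set stands in for a shared store whose entries would expire at some point.

```python
class IdempotentCounter:
    """Counts each request ID at most once against a fixed quota."""

    def __init__(self, quota: int) -> None:
        self.quota = quota
        self.used = 0
        self.charged = set()  # request IDs already counted against the quota

    def admit(self, request_id: str) -> bool:
        if request_id in self.charged:
            return True  # retry of an already-charged request: do not count again
        if self.used >= self.quota:
            return False  # over quota: caller responds with HTTP 429
        self.used += 1
        self.charged.add(request_id)
        return True
```

Only admitted IDs are remembered, so a request rejected at the limit is re-evaluated on retry and can succeed after a quota increase or the monthly reset.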