Design a production-grade rate limiter for notification sending.
The system receives requests for a user_id and must decide whether the notification should be sent. Each user belongs to one team, and each team belongs to one company. The service must enforce exact rolling 10-minute limits at three levels:
- max 3 accepted notifications per user
- max 10 accepted notifications per team
- max 20 accepted notifications per company
Discuss:
- the API and request flow
- how to store and query rate-limit state
- how to perform an atomic decision across user, team, and company scopes
- how to scale to many stateless application servers
- how to handle cache misses and changes in the user/team/company hierarchy
- hot keys, failures, clock skew, idempotency, and observability
Explain the trade-offs in your design.
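
To make the required semantics concrete, here is a minimal single-process sketch of the three-scope rolling-window check. It is a reference point, not a production design. The names `RollingLimiter`, `allow`, `LIMITS`, and `WINDOW_SECONDS` are hypothetical. The sketch assumes the caller supplies `team_id`, `company_id`, and a timestamp in seconds, and that an accepted notification stops counting exactly 600 seconds after it was accepted. A request is recorded only if all three scopes have room, so a rejection consumes no quota at any level.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 600  # rolling 10-minute window from the problem statement
LIMITS = {"user": 3, "team": 10, "company": 20}  # max accepted per scope


class RollingLimiter:
    """In-memory sliding-window log (hypothetical, single process only)."""

    def __init__(self):
        # One deque of accepted timestamps per (scope, id) key, oldest first.
        self._accepted = defaultdict(deque)

    def allow(self, user_id, team_id, company_id, now):
        """Return True and record the send if all three scopes have room."""
        keys = [("user", user_id), ("team", team_id), ("company", company_id)]
        cutoff = now - WINDOW_SECONDS

        # Evict timestamps that have left the window (assumed half-open:
        # an entry accepted at time t counts while now - t < WINDOW_SECONDS).
        for key in keys:
            log = self._accepted[key]
            while log and log[0] <= cutoff:
                log.popleft()

        # Check every scope before recording anything, so a rejected
        # request does not consume quota at any level.
        for scope, ident in keys:
            if len(self._accepted[(scope, ident)]) >= LIMITS[scope]:
                return False

        for key in keys:
            self._accepted[key].append(now)
        return True


if __name__ == "__main__":
    limiter = RollingLimiter()
    # Four requests from one user: the fourth exceeds the per-user limit.
    print([limiter.allow("u1", "t1", "c1", t) for t in (0, 1, 2, 3)])
```

On many stateless servers, the same check-then-record step would have to run atomically against a shared store. This sketch does not attempt that, nor does it address hierarchy changes, clock skew, or idempotency.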