Design a high-throughput distributed rate limiter
Company: Pinterest
Role: Software Engineer
Category: System Design
Difficulty: Hard
Interview Round: Onsite
Design a high-throughput distributed rate-limiting service that supports per-user and global limits, tolerates bursts, and enforces an approximate sliding window. Target 10M requests/second at peak across multiple regions. Specify the API, the choice of algorithms (token bucket, leaky bucket, sliding window), the data model, sharding and hot-key mitigation (e.g., consistent hashing, key splitting), storage choices (in-memory vs. Redis vs. custom), replication, and time coordination. Explain how you enforce limits on the request path, handle failures and partial outages, ensure fairness, and provide eventual or strong consistency where needed. Show how you would scale out: capacity-planning formulas, how many machines to add for a 2× traffic spike, autoscaling signals, and backpressure/throttling strategies.
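As a reference point for the burst-tolerance requirement, here is a minimal single-node token bucket sketch in Python. It is illustrative only and not a prescribed answer: the class name `TokenBucket`, its fields, and the explicit `now` argument are assumptions made for this example. A distributed design would still need to shard this state and coordinate time across nodes.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Single-node token bucket (hypothetical helper for illustration).

    Time is passed in explicitly so the logic is deterministic and testable.
    """
    rate: float          # tokens refilled per second (sustained limit)
    capacity: float      # maximum tokens held (burst allowance)
    tokens: float = 0.0  # tokens currently available
    last: float = 0.0    # timestamp of the last refill, in seconds

    def allow(self, now: float, cost: float = 1.0) -> bool:
        # Refill in proportion to elapsed time, capped at capacity.
        # Clamp negative elapsed time in case the clock moves backwards.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        # Admit the request only if enough tokens remain.
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Usage: a bucket that sustains 5 requests/second with bursts of up to 10.
bucket = TokenBucket(rate=5.0, capacity=10.0, tokens=10.0)
burst = [bucket.allow(now=0.0) for _ in range(11)]   # 11th request is rejected
later = [bucket.allow(now=1.0) for _ in range(6)]    # 5 tokens refilled after 1 s
```

The `capacity` field sets how large a burst is admitted, while `rate` bounds the long-run average. An approximate sliding window could be layered on separately, for example by weighting the previous fixed window's count against the current one.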
Quick Answer: This question evaluates distributed systems architecture, rate-limiting algorithms, data modeling, sharding and hot-key mitigation, consistency and fault-tolerance strategies, API design, and operational capacity planning within the System Design domain.
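For the capacity-planning part of the question, one common back-of-the-envelope formula is nodes = ceil(peak_rps / (per_node_rps × target_utilization)) plus spares. The Python sketch below applies it. The per-node throughput of 100,000 checks/second and the 60% target utilization are assumed figures chosen for illustration, not values from the question, and the function name `nodes_needed` is hypothetical.

```python
import math


def nodes_needed(peak_rps: float, per_node_rps: float,
                 target_utilization: float, spare_nodes: int = 0) -> int:
    """Estimate the node count for a given peak load (illustrative sketch).

    per_node_rps and target_utilization are assumptions that would come
    from load testing in practice.
    """
    # Usable throughput per node after leaving headroom.
    usable = per_node_rps * target_utilization
    return math.ceil(peak_rps / usable) + spare_nodes


# Assumed inputs: 100k checks/s per node, run at 60% utilization.
baseline = nodes_needed(10_000_000, 100_000, 0.6)   # 10M req/s peak
spike = nodes_needed(20_000_000, 100_000, 0.6)      # 2x traffic spike
extra = spike - baseline                            # machines to add
```

Under these assumed inputs the baseline works out to 167 nodes and the 2× spike to 334, so roughly one additional fleet's worth of machines. The real answer depends on measured per-node throughput and on how much headroom is kept for failover.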