PracHub

Design a distributed rate limiter

Last updated: Mar 29, 2026

Quick Overview

This question evaluates skills in distributed systems design, concurrency control, rate-limiting algorithms (token bucket and sliding window), distributed coordination and consistency, API design, and operational monitoring for scalability and fault tolerance.

Company: Google

Role: Software Engineer

Category: System Design

Difficulty: hard

Interview Round: Onsite

Design a rate limiter that keeps API calls at or below a configured QPS, with an optional burst allowance. Compare and implement two approaches (e.g., token bucket and sliding window), specifying data structures, time/space complexity, and correctness under concurrent access. Extend your design to a distributed setting across multiple application instances. Discuss coordination (a centralized store such as Redis vs. sharded counters), clock skew tolerance, atomicity, idempotency, failure modes (node loss, partial updates), and strategies to maintain consistency and fairness. Provide APIs, example configurations, and monitoring/alerting for saturation.

Sep 6, 2025, 12:00 AM

Design a Rate Limiter with Burst Allowance and Distributed Coordination

Context

You are designing a rate limiter for an API gateway that serves high-QPS traffic. The limiter should cap requests at a configured rate (QPS) with an optional burst allowance, operate correctly under concurrency, and scale across multiple application instances.
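
To make the rate and burst parameters concrete, here is a minimal single-process sketch of one common way to get this behavior, a token bucket. The class name `TokenBucket`, the `allow` method, and the injectable `clock` parameter are illustrative assumptions, not part of the question. Tokens refill continuously at `rate` per second up to `burst`, and a lock serializes the refill-then-consume step so concurrent callers cannot overspend.

```python
import threading
import time


class TokenBucket:
    """Hypothetical thread-safe token bucket (illustrative names only)."""

    def __init__(self, rate, burst, clock=time.monotonic):
        if rate <= 0 or burst < 1:
            raise ValueError("rate must be > 0 and burst >= 1")
        self.rate = float(rate)      # tokens added per second (the QPS cap)
        self.burst = float(burst)    # bucket capacity (largest allowed burst)
        self._clock = clock          # monotonic by default, so wall-clock jumps do not matter
        self._tokens = float(burst)  # start full so an initial burst is allowed
        self._last = clock()
        self._lock = threading.Lock()

    def allow(self, cost=1):
        """Return True and consume `cost` tokens if available, else False."""
        with self._lock:
            now = self._clock()
            elapsed = max(0.0, now - self._last)
            # Lazy refill: credit tokens for the time since the last call.
            self._tokens = min(self.burst, self._tokens + elapsed * self.rate)
            self._last = now
            if self._tokens >= cost:
                self._tokens -= cost
                return True
            return False
```

Each call does constant work and the state is two numbers per limited key, so time and space are both O(1) per key. With `TokenBucket(rate=100, burst=20)`, a caller can send up to 20 requests back to back and is then held to roughly 100 requests per second.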

Requirements

  1. Single-instance design:
    • Implement and compare two approaches: token bucket and sliding window.
    • Specify data structures, algorithms, and precise behavior (including burst allowance).
    • Provide time/space complexity.
    • Ensure correctness under concurrent access.
  2. Distributed design across multiple app instances:
    • Coordination strategies: centralized store (e.g., Redis) vs. sharded counters.
    • Address clock skew tolerance, atomicity, and idempotency.
    • Discuss failure modes (node loss, partial updates) and strategies for consistency and fairness.
  3. Developer-facing API:
    • Define clear APIs (e.g., Allow, Acquire) and example configurations.
  4. Operations:
    • Monitoring/alerting for saturation and error conditions.
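
For the sliding-window approach named in requirement 1, one possible single-process variant is a sliding window log. The sketch below is an assumption-laden illustration (the names `SlidingWindowLog`, `limit`, `window_seconds`, and `clock` are hypothetical): it keeps the timestamps of accepted requests in a deque and admits a request only if fewer than `limit` of them fall inside the trailing window.

```python
import threading
import time
from collections import deque


class SlidingWindowLog:
    """Hypothetical thread-safe sliding window log limiter (illustrative names only)."""

    def __init__(self, limit, window_seconds=1.0, clock=time.monotonic):
        if limit < 1 or window_seconds <= 0:
            raise ValueError("limit must be >= 1 and window_seconds > 0")
        self.limit = limit
        self.window = float(window_seconds)
        self._clock = clock
        self._events = deque()        # timestamps of accepted requests, oldest first
        self._lock = threading.Lock()

    def allow(self):
        """Return True if fewer than `limit` requests were accepted in the trailing window."""
        with self._lock:
            now = self._clock()
            cutoff = now - self.window
            # Evict timestamps that have aged out of the window.
            while self._events and self._events[0] <= cutoff:
                self._events.popleft()
            if len(self._events) < self.limit:
                self._events.append(now)
                return True
            return False
```

Each timestamp is appended once and evicted once, so a call is amortized O(1), while memory is O(limit) per key. Unlike a token bucket, this variant has no separate burst parameter: the whole `limit` may arrive at once, but never more than `limit` in any window. A sliding window counter (two adjacent fixed-window counts blended by overlap) trades this exactness for O(1) space.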

Provide code-like pseudocode where helpful.
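
For the centralized-store option in requirement 2, one possible sketch keeps token bucket state in a Redis hash and updates it inside a Lua script, which Redis executes atomically, so instances cannot interleave a read and a write. Everything named here is an assumption for illustration: the `allow` helper, the key layout, and a client object that exposes an `eval(script, numkeys, *keys_and_args)` method as redis-py clients do. The script reads the time from the Redis server via `TIME` rather than from the caller, which sidesteps clock skew between application instances; this assumes a Redis version where scripts may call `TIME` before writing (the default from Redis 5).

```python
# Lua executed atomically by Redis. KEYS[1] = bucket key;
# ARGV = rate (tokens/s), capacity (burst), cost (tokens for this request).
TOKEN_BUCKET_LUA = """
local rate = tonumber(ARGV[1])
local capacity = tonumber(ARGV[2])
local cost = tonumber(ARGV[3])
local t = redis.call('TIME')
local now = tonumber(t[1]) + tonumber(t[2]) / 1000000
local state = redis.call('HMGET', KEYS[1], 'tokens', 'ts')
local tokens = tonumber(state[1])
local ts = tonumber(state[2])
if tokens == nil or ts == nil then
  tokens = capacity
  ts = now
end
tokens = math.min(capacity, tokens + math.max(0, now - ts) * rate)
local allowed = 0
if tokens >= cost then
  tokens = tokens - cost
  allowed = 1
end
redis.call('HSET', KEYS[1], 'tokens', tostring(tokens), 'ts', tostring(now))
redis.call('PEXPIRE', KEYS[1], math.ceil(capacity / rate * 1000) + 1000)
return allowed
"""


def allow(client, key, rate, capacity, cost=1):
    """Hypothetical helper: True if the shared bucket at `key` admits the request.

    `client` is assumed to be a redis-py style object with
    eval(script, numkeys, *keys_and_args).
    """
    return client.eval(TOKEN_BUCKET_LUA, 1, key, rate, capacity, cost) == 1
```

With a real client this might be called as `allow(redis.Redis(), "rl:user:42", rate=100, capacity=20)`, where the key name is made up. The expiry lets idle buckets disappear once they would have refilled anyway. The sketch does not address what to do when Redis is unreachable (fail open vs. fail closed) or retried requests being charged twice; both would need an explicit policy.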
