PracHub

Design overload protection with load shedding

Last updated: Mar 29, 2026

Quick Overview

This question evaluates system design competencies for overload protection and resilience, covering admission control and rate limiting, queueing and prioritization, load shedding strategies, circuit breakers and backpressure, dependency isolation, and telemetry for SLO validation.



Company: TikTok

Role: Software Engineer

Category: System Design

Difficulty: hard

Interview Round: Technical Screen

Design a high-traffic service that maintains p99 latency SLOs under sudden spikes. Describe admission control, token-bucket rate limiting, priority queues, deadlines, and timeouts. Compare load shedding strategies (drop-new, drop-tail, random, deadline-aware) and when to apply each at the load balancer versus the application. Explain circuit breakers, backpressure to clients and queues, protecting critical dependencies, and the metrics/alerts you would track to validate effectiveness.


Jul 15, 2025, 12:00 AM

Design: Maintain p99 Latency SLOs During Sudden Traffic Spikes

Context

You are designing a user-facing, read-heavy HTTP/gRPC service that occasionally experiences sudden traffic spikes (for example, due to push notifications or viral content). The service must maintain a p99 latency SLO (e.g., 200 ms) and degrade gracefully under overload.

Assume a typical architecture: Clients → L7 load balancer/reverse proxy → stateless application instances → critical dependencies (cache, DB, search, feature store). Autoscaling cannot fully mask spikes that build within seconds, so the service must protect itself.
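
Because each instance has to protect itself, it needs a fixed concurrency cap and a bounded backlog. One common way to estimate both is Little's law (in-flight requests = arrival rate × time in system). The sketch below is illustrative only: the function names and the per-instance figures (500 requests/s, 40 ms mean service time, 20% headroom, 50 ms maximum queue wait) are assumptions, not values from the question.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding float rounding."""
    return -(-a // b)


def concurrency_cap(rps: int, service_ms: int, headroom_pct: int = 120) -> int:
    """Estimate the in-flight limit for one instance via Little's law.

    in-flight = arrival rate x mean time in system.
    headroom_pct=120 adds 20% slack above the steady-state estimate.
    """
    return ceil_div(rps * service_ms * headroom_pct, 1000 * 100)


def queue_budget(cap: int, service_ms: int, max_wait_ms: int) -> int:
    """Largest backlog that `cap` workers can drain within max_wait_ms."""
    # Each of the `cap` workers finishes one request per service_ms on average.
    return cap * max_wait_ms // service_ms


# Hypothetical per-instance figures, chosen only for illustration.
cap = concurrency_cap(rps=500, service_ms=40)
backlog = queue_budget(cap, service_ms=40, max_wait_ms=50)
print(cap, backlog)  # prints "24 30"
```

With these assumed numbers, a request that waits the full 50 ms and then takes the mean 40 ms still finishes well inside a 200 ms budget. The estimate uses mean service time, so tail latency in dependencies would eat into that margin and the cap should be validated under load.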

Tasks

  1. Admission Control and Rate Limiting
    • Describe how you would implement admission control at the load balancer and within the application.
    • Include token-bucket rate limiting (global and per-tenant/key), concurrency limits, and burst handling.
  2. Queueing, Priorities, Deadlines, and Timeouts
    • Design request queues with priority classes (e.g., P0 interactive, P1 best-effort) and small, bounded backlogs.
    • Explain how you propagate and enforce request deadlines and set timeouts to meet the p99 SLO.
  3. Load Shedding Strategies
    • Compare and contrast: drop-new, drop-tail, random (e.g., RED), and deadline-aware shedding.
    • Explain when each strategy is preferable, and where to apply it (load balancer vs. application).
  4. Circuit Breakers and Backpressure
    • Explain circuit breakers for service-to-service calls (open/half-open/closed, trip conditions, fallbacks).
    • Describe how you provide backpressure to clients (HTTP/gRPC) and to internal queues.
  5. Protecting Critical Dependencies
    • Show how you isolate and protect caches/DBs/search under overload (bulkheads, quotas, fallbacks, precomputed or cached responses).
  6. Metrics and Alerts
    • Specify the metrics, SLI/SLOs, and alerting you would use to validate the effectiveness of your design under spikes.
  7. Include brief numeric examples where helpful (e.g., choosing token-bucket parameters, concurrency caps, and queue budgets) and call out key trade-offs and pitfalls (retry storms, head-of-line blocking, etc.).
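
As a starting point for the token-bucket parameters mentioned above, here is a minimal sketch of a token bucket with a global limit and lazily created per-tenant buckets. The class names, the injectable clock, and the rates used in the usage note are assumptions made for illustration. A production limiter would also need locking (or a shared store) and eviction of idle keys.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Refills at `rate` tokens/second up to `burst` tokens."""

    def __init__(self, rate: float, burst: float,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate
        self.burst = burst
        self.clock = clock
        self.tokens = burst          # start full so an initial burst is allowed
        self.last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self.clock()
        # Refill for the elapsed time, never exceeding the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False                 # caller should reject, e.g. with HTTP 429


class PerKeyLimiter:
    """A global bucket plus one bucket per tenant/key (hypothetical helper)."""

    def __init__(self, global_bucket: TokenBucket, key_rate: float,
                 key_burst: float, clock: Callable[[], float] = time.monotonic):
        self.global_bucket = global_bucket
        self.key_rate = key_rate
        self.key_burst = key_burst
        self.clock = clock
        self.buckets: Dict[str, TokenBucket] = {}

    def allow(self, key: str) -> bool:
        bucket = self.buckets.get(key)
        if bucket is None:
            bucket = TokenBucket(self.key_rate, self.key_burst, self.clock)
            self.buckets[key] = bucket
        # Per-key check first, so a noisy tenant cannot drain the global budget.
        # Note: a per-key token is still spent if the global bucket then rejects.
        return bucket.allow() and self.global_bucket.allow()
```

For example, `TokenBucket(rate=100, burst=10)` admits at most 10 requests back to back and then roughly one more every 10 ms. The burst size bounds how much extra work a spike can push into the queues behind the limiter, so it is usually chosen together with the concurrency cap and queue budget rather than in isolation.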

