PracHub

Improve and measure service performance

Last updated: Mar 29, 2026

Quick Overview

This question evaluates performance engineering and observability skills: defining SLIs and SLOs, instrumenting and profiling a system, load testing, identifying bottlenecks, and weighing optimization trade-offs for a stateless HTTP microservice.

Company: TikTok

Role: Software Engineer

Category: System Design

Difficulty: hard

Interview Round: Technical Screen

How would you assess and improve a service’s performance under high load? Define the key SLIs/SLOs you would set (e.g., latency percentiles, throughput, error rate), describe how you would instrument and profile the system, outline a load-testing plan, identify likely bottlenecks (CPU, I/O, database, cache, network), and propose concrete optimizations with trade-offs.

Sep 6, 2025, 12:00 AM

Assessing and Improving a Service’s Performance Under High Load

Context

You are responsible for a user-facing, stateless HTTP microservice that handles read-heavy traffic with bursty patterns (e.g., push-notification spikes). The service depends on a cache (e.g., Redis) and a database (e.g., MySQL/Postgres), is deployed behind a load balancer, and auto-scales.

Define how you would assess and improve this service’s performance under high load. Address the following:

  1. SLIs/SLOs
  • Define the key Service Level Indicators (SLIs) and Service Level Objectives (SLOs), including latency percentiles, throughput, error rate, availability, and saturation.
  2. Instrumentation and Profiling
  • Describe how you would instrument the service (metrics, traces, logs) and how you would profile it (CPU, memory, locks, I/O) to find hotspots.
  3. Load-Testing Plan
  • Outline a realistic load-testing plan: workload modeling, test types (ramp, spike, soak, stress), environment setup, data representativeness, and pass/fail criteria.
  4. Likely Bottlenecks
  • Identify common bottlenecks under high load (CPU, I/O, database, cache, network, GC, locks, pools) and the symptoms you’d look for.
  5. Concrete Optimizations
  • Propose specific optimizations and discuss trade-offs (e.g., caching strategies, query/index tuning, batching, backpressure, retries, timeouts, concurrency control, network/protocol tuning, autoscaling).
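
To make the SLI/SLO item concrete, here is a minimal sketch of how latency percentiles, error rate, and throughput might be computed from a window of request records and checked against targets. The `Request` record, the nearest-rank percentile method, and the SLO thresholds (p99 under 300 ms, error rate under 0.1%) are illustrative assumptions, not values given in the question.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    latency_ms: float
    status: int  # HTTP status code


def percentile(sorted_values, p):
    """Nearest-rank percentile of an ascending list; p is in (0, 100]."""
    if not sorted_values:
        raise ValueError("no samples")
    rank = math.ceil(p * len(sorted_values) / 100)
    return sorted_values[max(rank, 1) - 1]


def compute_slis(requests, window_seconds):
    """Derive basic SLIs from the requests observed in one time window."""
    latencies = sorted(r.latency_ms for r in requests)
    errors = sum(1 for r in requests if r.status >= 500)  # server-side errors only
    return {
        "p50_ms": percentile(latencies, 50),
        "p99_ms": percentile(latencies, 99),
        "error_rate": errors / len(requests),
        "throughput_rps": len(requests) / window_seconds,
    }


# Hypothetical SLO targets (assumed for illustration): upper bounds per SLI.
SLO = {"p99_ms": 300.0, "error_rate": 0.001}


def slo_violations(slis, slo=SLO):
    """Return the names of SLIs that exceed their SLO upper bound."""
    return [name for name, limit in slo.items() if slis[name] > limit]
```

In practice these values would usually come from histogram metrics in a monitoring system rather than raw records; the sketch only shows the arithmetic behind the indicators.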

Make any minimal assumptions you need and be explicit about them.
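
One way to state capacity assumptions explicitly is a back-of-the-envelope estimate based on Little's Law (average requests in flight = arrival rate × average time in system). The sketch below is only an illustration: the function name, the peak rate, the mean latency, the per-instance concurrency limit, and the 30% headroom are all hypothetical inputs, not figures from the question.

```python
import math


def required_instances(peak_rps, mean_latency_ms, concurrency_per_instance, headroom=0.3):
    """Estimate instance count from Little's Law: L = lambda * W.

    headroom is the fraction of each instance's concurrency kept in reserve
    for bursts (an assumed safety margin).
    """
    in_flight = peak_rps * mean_latency_ms / 1000  # average concurrent requests
    usable = concurrency_per_instance * (1 - headroom)
    return math.ceil(in_flight / usable)


# Assumed inputs: 20,000 req/s at peak, 50 ms mean latency,
# 100 concurrent requests per instance, 30% headroom.
print(required_instances(20_000, 50, 100))
```

An estimate like this gives a starting point for sizing the load-test environment; measured saturation under test should replace the assumed per-instance concurrency.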
