PracHub

Design a scalable service and model its performance

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a candidate's skills in large-scale distributed system design, performance modeling, capacity planning, and operational reliability for latency-sensitive, multi-region key-value services.

  • hard
  • Anthropic
  • System Design
  • Machine Learning Engineer

Company: Anthropic

Role: Machine Learning Engineer

Category: System Design

Difficulty: hard

Interview Round: Onsite

Design a highly available, multi-region service that handles 50k peak QPS with a p95 latency under 100 ms. Specify API design, storage schema, caching strategy, consistency model, data partitioning, failure handling, and rollout/canary strategies. Perform back-of-the-envelope capacity planning: estimate read/write ratios, data size growth over 12 months, peak vs. average load, instance sizing, and network egress. Build a performance model to predict end-to-end latency under load: decompose service time, apply queueing approximations (e.g., Little’s Law), and identify the bottlenecks. Propose concrete mitigations (e.g., batching, async workflows, indexes, autoscaling, circuit breaking) and define SLOs, monitoring, and load-testing plans to validate your model.
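
As a rough illustration of the queueing approximation named above, Little's Law relates mean concurrency L, arrival rate λ, and mean time in system W. The 50k QPS figure comes from the problem statement; the 20 ms mean latency is an assumed value chosen only for the arithmetic (the stated 100 ms target is a p95, not a mean).

```latex
L = \lambda W
\qquad\Rightarrow\qquad
L = 50{,}000\ \text{req/s} \times 0.020\ \text{s} = 1{,}000\ \text{requests in flight}
```

Under that assumption, roughly 1,000 requests are in flight globally at peak, which gives a starting point for sizing connection pools and worker threads.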

Aug 14, 2025, 12:00 AM

System Design: Multi-Region, 50k QPS, p95 < 100 ms

Context

Design an online, read-heavy key-value service (for example, a user profile or feature lookup) used by latency-sensitive applications worldwide. Clients connect from multiple continents. The service must be highly available across multiple regions and maintain low tail latency.

Assume small payloads (1–5 KB) and ID-based access patterns. Strong read-after-write consistency is required within a region for a session; cross-region consistency can be eventual.
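
One common way to provide read-after-write consistency for a session is to remember the version of each key the session last wrote and to bypass any replica that has not caught up. The sketch below is a toy, in-memory illustration of that idea only. `Store`, `Session`, and the per-key integer versions are hypothetical names and simplifications, not part of the question or of any particular database.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """Toy versioned key-value store standing in for a primary or a replica."""
    data: dict = field(default_factory=dict)  # key -> (version, value)

    def put(self, key, value):
        # Bump the per-key version on every write and return the new version.
        version = self.data.get(key, (0, None))[0] + 1
        self.data[key] = (version, value)
        return version

    def get(self, key):
        # Missing keys are reported as version 0 with no value.
        return self.data.get(key, (0, None))


@dataclass
class Session:
    """Tracks the versions this session wrote so its reads never go backwards."""
    primary: Store
    replica: Store
    last_written: dict = field(default_factory=dict)  # key -> version

    def write(self, key, value):
        self.last_written[key] = self.primary.put(key, value)

    def read(self, key):
        version, value = self.replica.get(key)
        if version < self.last_written.get(key, 0):
            # Replica lags this session's own write: fall back to the primary.
            version, value = self.primary.get(key)
        return value


primary, replica = Store(), Store()
session = Session(primary, replica)
session.write("user:1", {"name": "Ada"})
own_read = session.read("user:1")                     # served from the primary
other_read = Session(primary, Store()).read("user:1")  # stale replica, no guarantee
```

Sessions that have not written a key read from the replica and may see stale data, which matches the eventual consistency allowed across regions.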

Requirements

  • Traffic target: 50k peak QPS (global), p95 latency under 100 ms.
  • Multi-region, active-active, highly available design with no dependency on any single region.
  • Include APIs, data model, caching, consistency, partitioning, failure handling, rollout/canary.
  • Do back-of-the-envelope capacity planning (reads vs writes, growth, peak vs average, instance sizing, egress).
  • Build a performance model to predict end-to-end latency under load (service time breakdown, queueing approximations such as Little’s Law), and identify bottlenecks.
  • Propose concrete mitigations and define SLOs, monitoring, and load-testing to validate the model.
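
The performance-model requirement above can be approached with a standard M/M/c queue (Poisson arrivals, exponential service times, c identical workers). The sketch below computes the Erlang C queueing probability, the mean response time, and a quantile of the queueing delay. Treating one instance as an M/M/c queue is itself an assumption, real service times are rarely exponential, and the function names are illustrative.

```python
import math


def erlang_c(servers: int, offered_load: float) -> float:
    """Probability that an arrival has to queue in an M/M/c system.

    offered_load is arrival_rate / service_rate (in Erlangs).
    Uses the Erlang B recursion to avoid large factorials.
    """
    if offered_load >= servers:
        raise ValueError("unstable: offered load must be below server count")
    b = 1.0
    for k in range(1, servers + 1):
        b = offered_load * b / (k + offered_load * b)
    rho = offered_load / servers
    return b / (1.0 - rho * (1.0 - b))


def mean_response_s(arrival_rate: float, service_rate: float, servers: int) -> float:
    """Mean time in system (queueing delay plus service) in seconds."""
    p_wait = erlang_c(servers, arrival_rate / service_rate)
    mean_wait = p_wait / (servers * service_rate - arrival_rate)
    return mean_wait + 1.0 / service_rate


def wait_quantile_s(arrival_rate: float, service_rate: float, servers: int,
                    q: float = 0.95) -> float:
    """q-quantile of the queueing delay only (service time excluded)."""
    p_wait = erlang_c(servers, arrival_rate / service_rate)
    tail = 1.0 - q
    if p_wait <= tail:
        return 0.0  # fewer than (1 - q) of requests queue at all
    # P(wait > t) = p_wait * exp(-(c * mu - lambda) * t), solved for t.
    return math.log(p_wait / tail) / (servers * service_rate - arrival_rate)


# Assumed per-instance numbers, for illustration only:
# 8 workers, 5 ms mean service time (200 req/s per worker), 1,000 req/s arriving.
example_mean = mean_response_s(arrival_rate=1000.0, service_rate=200.0, servers=8)
example_p95_wait = wait_quantile_s(arrival_rate=1000.0, service_rate=200.0, servers=8)
```

Sweeping `arrival_rate` toward `servers * service_rate` shows how sharply delay grows as utilization approaches 1, which is the usual argument for keeping headroom below saturation. Load tests would then check how far measured latency departs from this idealized model.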

Deliverables

  1. API design (CRUD, batch, idempotency, versioning, errors).
  2. Storage schema and indexing; partitioning strategy.
  3. Caching layers and invalidation strategy.
  4. Consistency model (regional and cross-region) and conflict resolution.
  5. Failure handling (zone, region, network partitions, thundering herd) and client resiliency.
  6. Rollout and canary strategies (schema and code).
  7. Capacity planning with numerical estimates: read/write ratio, data growth over 12 months, peak vs average, instance count and size, and network egress.
  8. Performance model with queueing approximations; identify bottlenecks.
  9. Mitigations (e.g., batching, async, indexes, autoscaling, circuit breaking).
  10. SLOs, monitoring, and load-testing plans to validate performance and availability.
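
For the capacity-planning deliverable, a small calculator makes the assumptions explicit and easy to vary. In the sketch below only the 50k peak QPS and the 1–5 KB payload range come from the problem statement. Every other default (read fraction, peak-to-average ratio, per-instance throughput, utilization target, region count, record growth, replication factor) is a placeholder assumption that should be replaced with measured or agreed values.

```python
import math
from dataclasses import dataclass


@dataclass
class Assumptions:
    peak_qps: float = 50_000               # from the problem statement
    avg_payload_kb: float = 3.0            # assumed midpoint of the 1-5 KB range
    read_fraction: float = 0.95            # assumed: read-heavy, 95:5
    peak_to_avg: float = 2.5               # assumed peak-to-average ratio
    per_instance_qps: float = 2_000        # assumed; measure with load tests
    target_utilization: float = 0.6        # assumed headroom at peak
    regions: int = 3                       # assumed
    new_records_per_day: float = 1_000_000  # assumed growth
    replication_factor: int = 3            # assumed copies of each record


def plan(a: Assumptions) -> dict:
    peak_reads = a.peak_qps * a.read_fraction
    peak_writes = a.peak_qps - peak_reads
    # Instances needed globally to serve peak at the target utilization.
    instances = math.ceil(a.peak_qps / (a.per_instance_qps * a.target_utilization))
    # Size each region so the fleet survives the loss of one region (N+1).
    surviving = max(a.regions - 1, 1)
    return {
        "peak_reads_qps": peak_reads,
        "peak_writes_qps": peak_writes,
        "avg_qps": a.peak_qps / a.peak_to_avg,
        # Response egress at peak, decimal units (1 MB = 1,000 KB).
        "read_egress_mb_s": peak_reads * a.avg_payload_kb / 1_000,
        "instances_global": instances,
        "instances_per_region": math.ceil(instances / surviving),
        # Raw data added in 12 months, including replicas (1 GB = 1,000,000 KB).
        "storage_growth_gb_12mo": (a.new_records_per_day * 365 * a.avg_payload_kb
                                   / 1_000_000 * a.replication_factor),
    }


estimate = plan(Assumptions())
```

With these placeholder defaults the estimate works out to about 47,500 reads/s and 2,500 writes/s at peak, 142.5 MB/s of read egress, and 42 instances globally (21 per region with one region of headroom). The figures exclude protocol overhead, indexes, and cross-region replication traffic.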
