
Design a prompt processing backend

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a candidate's ability to design a multi-tenant, reliable, and scalable backend for asynchronous LLM prompt processing, covering API design, job orchestration, model routing, prompt versioning, idempotency, retries/DLQ, result storage, observability, and non-functional concerns like cost control, security, and SLOs.


Company: Anthropic

Role: Software Engineer

Category: ML System Design

Difficulty: hard

Interview Round: Onsite

Design a background processing backend for large-language-model prompts. Clients submit prompts via an API and later poll or receive callbacks with results. Specify APIs, job queueing and prioritization, worker pools, model routing, prompt versioning, idempotency keys, retries and dead-letter queues, result storage, and observability. Address scaling, cost control, rate limiting, PII/security, and SLAs. Follow-up: support streaming partial outputs and cancellation.
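To make the submission API and idempotency-key requirement concrete, here is a minimal in-memory sketch of an idempotent submit path. The `JobStore` class, field names, and statuses are illustrative assumptions, not a real API; a production system would back this with a durable store keyed by (tenant, idempotency key).

```python
# Sketch of an idempotent prompt-submission path (names are illustrative).
import uuid


class JobStore:
    """In-memory stand-in for a durable job table keyed by (tenant, idempotency key)."""

    def __init__(self):
        self.by_idem_key = {}   # (tenant_id, idempotency_key) -> job_id
        self.jobs = {}          # job_id -> job record

    def submit(self, tenant_id, idempotency_key, prompt, priority="bulk"):
        key = (tenant_id, idempotency_key)
        # Duplicate submission: return the existing job instead of creating new work.
        if key in self.by_idem_key:
            return self.jobs[self.by_idem_key[key]], False
        job_id = str(uuid.uuid4())
        job = {"id": job_id, "tenant": tenant_id, "prompt": prompt,
               "priority": priority, "status": "QUEUED", "result": None}
        self.jobs[job_id] = job
        self.by_idem_key[key] = job_id
        return job, True


store = JobStore()
job1, created1 = store.submit("acme", "req-001", "Summarize this doc")
# Client retries the same request (network timeout, etc.) with the same key:
job2, created2 = store.submit("acme", "req-001", "Summarize this doc")
print(created1, created2, job1["id"] == job2["id"])  # True False True
```

The key design point is that idempotency is enforced at write time, before any work is enqueued, so client retries cannot create duplicate jobs or duplicate charges.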


Question reported: Jul 26, 2025

System Design: Background Processing Backend for LLM Prompts

Context

Design a multi-tenant backend that processes large language model (LLM) prompts asynchronously. Clients submit prompts via an API and later poll for status/results or receive callbacks via webhooks. The system must be reliable, scale with load, and enforce cost controls.
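The asynchronous flow above implies an explicit job lifecycle that both polling and webhook delivery report against. Below is one possible state machine; the state names and transitions are an assumption for illustration, not prescribed by the question.

```python
# Illustrative job lifecycle for the submit -> poll/webhook flow.
from enum import Enum


class JobState(Enum):
    QUEUED = "queued"
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"        # terminal, after retries are exhausted (job goes to DLQ)
    CANCELLED = "cancelled"


# Allowed transitions; anything else is rejected so status reporting stays consistent.
TRANSITIONS = {
    JobState.QUEUED: {JobState.RUNNING, JobState.CANCELLED},
    JobState.RUNNING: {JobState.SUCCEEDED, JobState.FAILED, JobState.CANCELLED,
                       JobState.QUEUED},  # back to QUEUED on a retryable failure
    JobState.SUCCEEDED: set(),
    JobState.FAILED: set(),
    JobState.CANCELLED: set(),
}


def transition(current, target):
    """Validate and apply a state transition; raise on illegal moves."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current} -> {target}")
    return target
```

Making terminal states (`SUCCEEDED`, `FAILED`, `CANCELLED`) absorbing prevents late worker writes from resurrecting a job the client already observed as finished.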

Requirements

  1. APIs
    • Submit prompts (with idempotency keys), poll job status, fetch results, register webhooks/callbacks.
  2. Job orchestration
    • Queueing, prioritization (e.g., realtime vs bulk), worker pools, retries, dead-letter queues (DLQ).
  3. Model routing
    • Route requests to appropriate model/provider based on policy (latency/cost/quality/capacity).
  4. Prompt versioning
    • Manage template versions and the exact prompt/model context used for reproducibility.
  5. Idempotency
    • Ensure duplicate submissions do not create duplicate work/charges.
  6. Retries and DLQ
    • Automatic retry with backoff; poison message handling.
  7. Result storage
    • Store inputs/outputs/metadata, enable polling and callback delivery; set retention policies.
  8. Observability
    • Metrics, logs, traces; per-tenant dashboards, alerting, audits.
  9. Non-functionals
    • Scaling and capacity planning, cost control, rate limiting, PII/security, and SLAs/SLOs.
  10. Follow-up
    • Support streaming partial outputs and cancellation of in-flight jobs.
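Requirement 6 (retries and DLQ) can be sketched as worker-side logic: retry transient failures with exponential backoff and full jitter, and dead-letter the job once attempts are exhausted. The constants and helper names here are assumptions for illustration; in a real deployment the delay would be applied by rescheduling the message, not by sleeping in the worker.

```python
# Sketch of worker-side retry with exponential backoff and dead-lettering.
import random

MAX_ATTEMPTS = 4
BASE_DELAY_S = 1.0


def backoff_delay(attempt):
    # Full jitter: uniform in [0, base * 2^attempt), capped at 60s.
    return random.uniform(0, min(60.0, BASE_DELAY_S * (2 ** attempt)))


def process_with_retries(job, call_model, dead_letter):
    """Run call_model(job); on repeated failure, park the job in the DLQ."""
    for attempt in range(MAX_ATTEMPTS):
        try:
            return call_model(job)
        except Exception as exc:
            if attempt == MAX_ATTEMPTS - 1:
                # Poison message: record the job and last error for inspection/replay.
                dead_letter.append({"job": job, "error": str(exc)})
                return None
            _delay = backoff_delay(attempt)  # in production: reschedule after _delay
    return None
```

Jittered backoff spreads retry storms out over time, and the DLQ keeps poison messages from blocking the queue while preserving them for debugging and replay.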

Describe the architecture, data flows, and key design choices. Provide concrete API designs and operational policies.
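One concrete design choice worth showing is policy-based model routing (requirement 3): pick the cheapest model that satisfies the job's quality floor and latency budget. The model names, latencies, and costs below are made up for illustration.

```python
# Hedged sketch of policy-based model routing (catalog values are invented).
CATALOG = [
    {"model": "fast-small", "p50_ms": 300,  "cost_per_1k_tok": 0.2, "quality": 1},
    {"model": "balanced",   "p50_ms": 900,  "cost_per_1k_tok": 1.0, "quality": 2},
    {"model": "frontier",   "p50_ms": 2500, "cost_per_1k_tok": 5.0, "quality": 3},
]


def route(job):
    """Pick the cheapest model meeting the job's quality floor and latency budget."""
    candidates = [m for m in CATALOG
                  if m["quality"] >= job.get("min_quality", 1)
                  and m["p50_ms"] <= job.get("latency_budget_ms", 10_000)]
    if not candidates:
        raise RuntimeError("no model satisfies routing policy")
    return min(candidates, key=lambda m: m["cost_per_1k_tok"])["model"]


print(route({"min_quality": 2, "latency_budget_ms": 1000}))  # balanced
print(route({}))                                             # fast-small
```

Keeping the routing policy as data (a catalog plus per-job constraints) lets operators retarget traffic for cost or capacity reasons without redeploying workers; a production router would also fold in live capacity and provider health signals.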

