System Design Task: Resilient Multi‑Provider LLM Client Library
Context
You are designing a client library used by backend services to call external Large Language Model (LLM) providers (e.g., OpenAI, Anthropic). The library must route requests across multiple providers to maximize availability, control cost, and meet latency SLOs.
Requirements
- Provider management
  - Provider registration and unregistration
  - Capability mapping: available models, max context length, supported features (streaming, function calling, JSON mode, etc.); see the registry sketch after this list
- Governance
  - Per-provider rate limiting, including per-API-key limits where applicable (see the token-bucket sketch after this list)
  - Quotas (per provider, per model, per tenant)
- Resilience
  - Health checks (active and passive)
  - Timeouts (connect, request, total)
  - Circuit breakers with fail-fast behavior and half-open probes (sketched after this list)
- Routing
  - Cost- and latency-aware load balancing (sketched after this list)
  - Retry and fallback across providers when one is degraded or down
- Observability
  - Metrics, structured logs, and distributed tracing
- Security
  - Secure key management (storage, rotation, scoping, redaction)
- Concurrency
  - Thread-safe and high-throughput; supports parallel requests and streaming (see the fan-out sketch after this list)
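To make the capability-mapping requirement concrete, here is a minimal sketch of a provider interface and a thread-safe registry. All names (`Provider`, `ModelCapability`, `ProviderRegistry`, the pricing fields) are illustrative assumptions, not a prescribed API.

```python
import threading
from dataclasses import dataclass
from enum import Enum, auto
from typing import Protocol


class Feature(Enum):
    STREAMING = auto()
    FUNCTION_CALLING = auto()
    JSON_MODE = auto()


@dataclass(frozen=True)
class ModelCapability:
    model: str
    max_context_tokens: int
    features: frozenset[Feature]
    usd_per_1k_input_tokens: float   # assumed pricing fields, used later for cost-aware routing
    usd_per_1k_output_tokens: float


class Provider(Protocol):
    """Illustrative provider interface; method names are assumptions."""
    name: str

    def capabilities(self) -> list[ModelCapability]: ...
    def complete(self, model: str, prompt: str, timeout_s: float) -> str: ...


class ProviderRegistry:
    """Thread-safe provider registration, unregistration, and capability lookup."""

    def __init__(self) -> None:
        self._lock = threading.RLock()
        self._providers: dict[str, Provider] = {}

    def register(self, provider: Provider) -> None:
        with self._lock:
            self._providers[provider.name] = provider

    def unregister(self, name: str) -> None:
        with self._lock:
            self._providers.pop(name, None)

    def providers_for(self, model: str, required: set[Feature]) -> list[Provider]:
        """Providers that serve `model` and support every feature in `required`."""
        with self._lock:
            return [
                p for p in self._providers.values()
                if any(c.model == model and required <= c.features
                       for c in p.capabilities())
            ]
```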
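Per-provider (or per-API-key) rate limiting is commonly implemented as a token bucket. The sketch below is one minimal, thread-safe version; `rate_per_s` and `burst` are assumed tuning parameters, and a real design would keep one bucket per (provider, key) pair.

```python
import threading
import time


class TokenBucket:
    """Simple token-bucket limiter; one instance per (provider, api_key)."""

    def __init__(self, rate_per_s: float, burst: int) -> None:
        self._rate = rate_per_s
        self._capacity = float(burst)
        self._tokens = float(burst)
        self._last = time.monotonic()
        self._lock = threading.Lock()

    def try_acquire(self, tokens: float = 1.0) -> bool:
        """Non-blocking: returns False when the caller should back off or queue."""
        with self._lock:
            now = time.monotonic()
            # Refill proportionally to elapsed time, capped at the burst size.
            self._tokens = min(self._capacity,
                               self._tokens + (now - self._last) * self._rate)
            self._last = now
            if self._tokens >= tokens:
                self._tokens -= tokens
                return True
            return False
```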
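A circuit breaker trips open after consecutive failures, fails fast while open, and lets a probe request through after a cooldown (the half-open state). The sketch below is a deliberately simplified state machine; the threshold, cooldown, and single-probe policy are assumptions a real design would refine per provider.

```python
import threading
import time


class CircuitBreaker:
    """Closed -> open on consecutive failures; half-open probe after a cooldown."""

    def __init__(self, failure_threshold: int = 5, reset_timeout_s: float = 30.0) -> None:
        self._failure_threshold = failure_threshold
        self._reset_timeout_s = reset_timeout_s
        self._failures = 0
        self._opened_at = None  # monotonic time when the circuit tripped, or None if closed
        self._lock = threading.Lock()

    def allow_request(self) -> bool:
        with self._lock:
            if self._opened_at is None:
                return True  # closed: traffic flows normally
            if time.monotonic() - self._opened_at >= self._reset_timeout_s:
                return True  # half-open: let a probe through
            return False     # open: fail fast without calling the provider

    def record_success(self) -> None:
        with self._lock:
            self._failures = 0
            self._opened_at = None  # probe succeeded: close the circuit

    def record_failure(self) -> None:
        with self._lock:
            self._failures += 1
            if self._failures >= self._failure_threshold:
                self._opened_at = time.monotonic()  # trip (or re-trip) the circuit open
```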
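For cost- and latency-aware routing with cross-provider fallback, one simple approach is to rank candidates by a blended score (for example, weighted price plus an observed latency estimate such as an EWMA or p95) and walk the list with jittered backoff. The `score` and `call` callables below are hypothetical stand-ins for the library's real scoring and transport layers.

```python
import random
import time
from typing import Callable, Iterable, TypeVar

P = TypeVar("P")  # provider handle
R = TypeVar("R")  # response type


def route_with_fallback(
    candidates: Iterable[P],
    score: Callable[[P], float],
    call: Callable[[P], R],
    max_attempts: int = 3,
) -> R:
    """Try candidates in score order (lower is better), falling back on failure."""
    ordered = sorted(candidates, key=score)
    last_error = None
    for attempt, provider in enumerate(ordered[:max_attempts]):
        try:
            return call(provider)
        except Exception as e:  # in practice: distinguish retryable from fatal errors
            last_error = e
            # Jittered exponential backoff before falling back to the next provider.
            time.sleep(min(1.0, 0.1 * 2 ** attempt) * random.random())
    raise RuntimeError("all providers failed") from last_error
```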
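For the concurrency requirement, a semaphore-capped async fan-out is one common pattern for issuing many requests in parallel without overwhelming a provider; `complete` here is an assumed async entry point into the library, not a defined API.

```python
import asyncio
from typing import Awaitable, Callable, Sequence


async def fan_out(
    prompts: Sequence[str],
    complete: Callable[[str], Awaitable[str]],
    max_concurrency: int = 8,
) -> list[str]:
    """Run many completions in parallel, capped at `max_concurrency` in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def one(prompt: str) -> str:
        async with semaphore:
            return await complete(prompt)

    return await asyncio.gather(*(one(p) for p in prompts))
```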
Deliverables
- Describe the interfaces and data structures
- Explain request routing, failure handling, and the concurrency strategy
- Include assumptions where needed