System Design: HTTP Aggregator With Deadlines, Resilience, and Observability
Context
Build an HTTP aggregator that fans out to three independent downstream services in parallel and returns a single consolidated JSON response. Assume the downstreams are:
- Service A: User Profile
- Service B: Recent Orders
- Service C: Recommendations
The aggregator must be production-grade with strong reliability, performance, and observability guarantees.
Requirements
- API design
  - Define the aggregator's request schema, response schema, and HTTP status codes.
  - Show how missing/erroneous sub-responses are represented in the final JSON.
- Concurrency and timeboxing
  - Call A, B, C in parallel.
  - Define per-call timeouts and a global deadline (e.g., 300 ms) so a slow service does not block the whole request.
  - If the global deadline is exceeded, cancel in-flight work and return a degraded but well-formed response.
- Resilience
  - Handle errors and partial failures with retries (exponential backoff + jitter), circuit breaking, and sensible fallbacks/defaults.
  - Ensure idempotency and avoid duplicating side effects on retries.
- Merging logic
  - Describe how to merge the payloads (A=user profile, B=recent orders, C=recommendations) into one response.
- Resource management and isolation
  - Discuss the concurrency model, thread safety, resource limits (connection pools), rate limiting, and bulkheading.
- Observability
  - Outline logging, metrics, and distributed tracing (including correlation IDs) for end-to-end visibility.
- Testing
  - Describe the testing plan (unit, integration), including timeouts, cancellations, retries, partial failures, and circuit breaking.
- Code and structure
  - Provide production-grade naming and code structure (modules/classes).
  - Include pseudocode or code in a language of your choice implementing the handler and fan-out/fan-in with cancellation and retries.
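
As a starting point for the fan-out/fan-in requirement, the sketch below shows one possible shape in Python with asyncio. It is a minimal illustration, not a reference solution. The response keys (`profile`, `orders`, `recommendations`, `errors`, `degraded`), the function names, and every constant except the 300 ms global deadline are assumptions made for the example. Retries are applied only because the three downstream calls are assumed to be idempotent reads. Circuit breaking, bulkheading, and tracing are left out to keep the sketch short.

```python
import asyncio
import random
from typing import Any, Awaitable, Callable, Dict

GLOBAL_DEADLINE_S = 0.300   # the 300 ms example from the requirements
PER_CALL_TIMEOUT_S = 0.120  # assumed per-attempt timeout
MAX_ATTEMPTS = 2            # assumed retry budget (1 call + 1 retry)
BASE_BACKOFF_S = 0.010      # assumed base for exponential backoff

# A downstream client: takes a user id, returns a decoded payload.
Fetch = Callable[[str], Awaitable[Any]]


async def call_with_retries(fetch: Fetch, user_id: str) -> Any:
    """Call an idempotent read with a per-attempt timeout and jittered backoff."""
    for attempt in range(MAX_ATTEMPTS):
        try:
            return await asyncio.wait_for(fetch(user_id), PER_CALL_TIMEOUT_S)
        except (asyncio.TimeoutError, ConnectionError):
            if attempt == MAX_ATTEMPTS - 1:
                raise  # retry budget exhausted; let the aggregator record it
            # Full jitter: sleep a random time up to base * 2^attempt.
            await asyncio.sleep(random.uniform(0, BASE_BACKOFF_S * 2 ** attempt))


async def aggregate(user_id: str, services: Dict[str, Fetch]) -> Dict[str, Any]:
    """Fan out to all services in parallel and merge whatever finished in time."""
    tasks = {
        name: asyncio.create_task(call_with_retries(fetch, user_id))
        for name, fetch in services.items()
    }

    # Fan-in: wait for every task, but never longer than the global deadline.
    _, pending = await asyncio.wait(set(tasks.values()), timeout=GLOBAL_DEADLINE_S)

    # Cancel in-flight work and wait for the cancellations to settle.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)

    body: Dict[str, Any] = {"user_id": user_id, "errors": {}}
    for name, task in tasks.items():
        if task in pending:
            body[name] = None
            body["errors"][name] = "deadline_exceeded"
        elif task.exception() is not None:
            body[name] = None
            body["errors"][name] = type(task.exception()).__name__
        else:
            body[name] = task.result()

    # The response stays well-formed; callers check this flag for partial data.
    body["degraded"] = bool(body["errors"])
    return body
```

An HTTP handler would call `await aggregate(user_id, {"profile": ..., "orders": ..., "recommendations": ...})` with real client functions and serialize the result as JSON. Whether a degraded body maps to 200 with `degraded: true` or to a different status code is one of the API design decisions the exercise asks for.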