Design an aggregator for multiple downstream services
Company: Walmart Labs
Role: Software Engineer
Category: System Design
Difficulty: easy
Interview Round: Onsite
Design a backend service for a large retailer (similar to Walmart) that must call **many downstream services** to fulfill a single user request.
When a client calls this "aggregator" service:
- It must fan out requests to several downstream services (for example: pricing, inventory, user profile, recommendations, shipping options, promotions).
- It needs to **wait for responses from all required downstream services** before sending a combined response back to the client.
Assume a microservices environment, potentially implemented with a framework like Spring Boot.
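To make the fan-out/fan-in requirement concrete, here is a minimal sketch in plain Java using `CompletableFuture`. It is one possible shape under stated assumptions, not a reference answer: `callDownstream`, the service names, and the string payloads are hypothetical stand-ins for real HTTP calls and response types.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class FanOutSketch {
    // Hypothetical stand-in for a downstream call; a real aggregator
    // would issue an HTTP or RPC request here.
    static String callDownstream(String service, String productId) {
        return service + ":" + productId;
    }

    // Fan out: start one async call per service on the given pool.
    // Fan in: block until every call has completed, then return the merged map.
    static Map<String, String> aggregate(List<String> services,
                                         String productId,
                                         ExecutorService pool) {
        Map<String, String> combined = new ConcurrentHashMap<>();
        CompletableFuture<?>[] calls = services.stream()
            .map(s -> CompletableFuture
                .supplyAsync(() -> callDownstream(s, productId), pool)
                .thenAccept(result -> combined.put(s, result)))
            .toArray(CompletableFuture[]::new);
        // allOf completes only when all calls have completed;
        // join rethrows if any of them failed.
        CompletableFuture.allOf(calls).join();
        return combined;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            System.out.println(
                aggregate(List.of("pricing", "inventory", "shipping"), "sku-1", pool));
        } finally {
            pool.shutdown();
        }
    }
}
```

Because the calls run in parallel, end-to-end latency in this sketch is bounded by the slowest downstream call rather than by the sum of all calls, which is the main reason to prefer parallel fan-out when the calls are independent.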
Discuss and design the system. In your answer, cover:
1. **High-level architecture**
   - How the client interacts with this aggregator service.
   - How the aggregator interacts with multiple downstream services.
   - Any supporting components (API gateway, cache, message broker, etc.).
2. **Request handling and concurrency**
   - How the aggregator should make calls to downstream services (synchronously vs. asynchronously; parallel vs. sequential).
   - How to aggregate all responses efficiently.
   - How to handle timeouts and retries.
3. **Failure handling and resilience**
   - What happens if one or more downstream services are slow, fail, or return errors.
   - How to avoid the aggregator becoming a bottleneck or single point of failure.
   - Use of patterns such as circuit breakers, bulkheads, and fallbacks.
4. **Performance and scalability**
   - How to ensure low latency for end users, even when many downstream calls are involved.
   - How to scale the aggregator service as traffic grows.
   - Possible use of caching and batching.
5. **API and data model considerations**
   - How you would structure the aggregator’s API and response schema.
   - How to version the API as downstream services evolve.
6. **Implementation considerations in Spring Boot (or similar)**
   - How you might use async/non-blocking patterns (e.g., thread pools, reactive programming) to implement fan-out/fan-in.
   - How you would configure timeouts, retries, and observability (logging, metrics, tracing).
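As an illustration of the timeout and fallback points above, the following is a minimal sketch using only the JDK (`CompletableFuture.orTimeout`, available since Java 9). The helper `withFallback`, the simulated `slowRecommendations` call, and the timeout values are assumptions for illustration; a production design might instead use a resilience library with circuit breakers and bulkheads.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

public class TimeoutFallbackSketch {
    // Run a downstream call with a deadline. If it times out or throws,
    // complete with the fallback value instead of failing the whole request.
    // Note: orTimeout does not interrupt the underlying task; it only
    // stops the caller from waiting on it.
    static CompletableFuture<String> withFallback(Supplier<String> call,
                                                  long timeoutMs,
                                                  String fallback) {
        return CompletableFuture.supplyAsync(call)
            .orTimeout(timeoutMs, TimeUnit.MILLISECONDS)
            .exceptionally(ex -> fallback);
    }

    // Hypothetical slow dependency, simulated with a sleep.
    static String slowRecommendations() {
        try {
            Thread.sleep(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "personalized";
    }

    public static void main(String[] args) {
        CompletableFuture<String> price =
            withFallback(() -> "19.99", 500, "price-unavailable");
        CompletableFuture<String> recs =
            withFallback(TimeoutFallbackSketch::slowRecommendations, 100, "bestsellers");
        // The fast call returns its real value; the slow one is replaced
        // by its fallback once the 100 ms deadline passes.
        System.out.println(price.join() + " / " + recs.join());
    }
}
```

A fallback like this suits optional data such as recommendations. For required data such as pricing, a candidate might instead propagate the failure and return an error or a partial response with an explicit status field, which is one of the trade-offs the question asks about.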
Provide a detailed, step-by-step design, calling out the key trade-offs and alternatives you consider.
Quick Answer: This question evaluates system-design and distributed-systems skills, focusing on microservice aggregation, fan-out/fan-in concurrency, fault tolerance, request orchestration, API and data-model design, and performance and scalability trade-offs.