PracHub

Build an API aggregator with concurrency and retries

Last updated: Mar 29, 2026

Quick Overview

This question evaluates skills in building resilient API aggregation services, including concurrent parallel calls with futures/promises, per-call and overall timeouts, configurable failure policies (WAIT_ALL vs FAIL_FAST), retry mechanics with capped exponential backoff and jitter, partial-failure handling, and observability via structured logs and metrics. It is commonly asked in system design interviews to probe practical implementation of concurrency and resilience patterns, trade-offs in failure policies and timeouts, and the ability to define clear interfaces and monitoring; category: System Design; level: practical application with architectural and conceptual considerations.

Company: DoorDash

Role: Software Engineer

Category: System Design

Difficulty: hard

Interview Round: Onsite

Build a service that exposes one endpoint which calls three external HTTP APIs in parallel, aggregates their responses, and returns a combined JSON result. Requirements: per-call timeouts and an overall request timeout; concurrency using futures/promises; a policy to wait for all vs fail-fast; retries with capped exponential backoff and jitter via a reusable RetryTemplate accepting a Callable; partial-failure handling and default values; structured logging, metrics, and clear code organization. Provide interface definitions, concurrency flow, and sample error-handling logic.
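
The retry requirement above can be sketched as follows. This is a minimal illustration, not a reference implementation: the constructor parameters (`maxAttempts`, `baseDelay`, `maxDelay`), the `backoffCapMillis` helper, and the choice of "full jitter" (sleeping a uniformly random time between 0 and the capped exponential delay) are assumptions, since the question only names a `RetryTemplate` that accepts a `Callable`.

```java
import java.time.Duration;
import java.util.concurrent.Callable;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical retry helper: capped exponential backoff with full jitter. */
final class RetryTemplate {
    private final int maxAttempts;
    private final Duration baseDelay;
    private final Duration maxDelay;

    RetryTemplate(int maxAttempts, Duration baseDelay, Duration maxDelay) {
        if (maxAttempts < 1) throw new IllegalArgumentException("maxAttempts must be >= 1");
        this.maxAttempts = maxAttempts;
        this.baseDelay = baseDelay;
        this.maxDelay = maxDelay;
    }

    /** Upper bound of the sleep before retry n (1-based): min(maxDelay, baseDelay * 2^(n-1)). */
    long backoffCapMillis(int retry) {
        int shift = Math.min(retry - 1, 30); // bound the shift so the multiplier stays sane
        long exponential = baseDelay.toMillis() * (1L << shift);
        return Math.min(maxDelay.toMillis(), exponential);
    }

    /** Runs the task, retrying on any exception except interruption, up to maxAttempts. */
    <T> T execute(Callable<T> task) throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // never retry an interrupt
                throw e;
            } catch (Exception e) {
                last = e;
                if (attempt == maxAttempts) break;
                long cap = backoffCapMillis(attempt);
                // Full jitter: uniform in [0, cap] to spread out synchronized retries.
                Thread.sleep(ThreadLocalRandom.current().nextLong(cap + 1));
            }
        }
        throw last;
    }
}

class RetryDemo {
    public static void main(String[] args) throws Exception {
        RetryTemplate retry = new RetryTemplate(4, Duration.ofMillis(10), Duration.ofMillis(50));
        AtomicInteger calls = new AtomicInteger();
        String result = retry.execute(() -> {
            if (calls.incrementAndGet() < 3) throw new IllegalStateException("transient failure");
            return "ok";
        });
        System.out.println(result + " after " + calls.get() + " attempts");
    }
}
```

This sketch retries on every exception. A production version would likely take a predicate that separates retryable failures (timeouts, 5xx responses) from non-retryable ones (4xx responses), and would stop retrying once the overall request deadline has passed.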

Related Interview Questions

  • Design a Food Rating System - DoorDash (medium)
  • Design a resilient bootstrap API - DoorDash (medium)
  • Design Real-Time Driver Pay Aggregation - DoorDash (hard)
  • Design Food Ratings and Driver Payouts - DoorDash (medium)
  • Design personalized restaurant search and recommendations - DoorDash (medium)
Jul 15, 2025, 12:00 AM

Build an Aggregation Service with Parallel Calls, Timeouts, Retries, and Observability

Context

You are designing a backend service that exposes a single HTTP endpoint. When called, the endpoint must call three external HTTP APIs in parallel, aggregate their responses, and return a combined JSON result. The service must be robust to timeouts and failures, and it must include proper retries, observability, and clear code organization.

Assume a typed language with futures/promises support (e.g., Java with CompletableFuture). You may choose reasonable defaults and make minimal assumptions if needed.
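
As one way to picture the parallel fan-out with `CompletableFuture`, the sketch below starts every upstream call at once, bounds each with a per-call timeout, and bounds the whole batch with an overall deadline. The names `FanOut`, `callAll`, and `rootName` are hypothetical, upstream calls are modeled as plain `Supplier<String>` values rather than real HTTP clients, and failures are flattened to `"ERROR:..."` strings only to keep the example short. It assumes Java 9 or later for `orTimeout`.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.*;
import java.util.function.Supplier;

final class FanOut {
    /** Runs every named call in parallel; each gets perCallMs, the whole batch gets overallMs. */
    static Map<String, String> callAll(Map<String, Supplier<String>> calls,
                                       long perCallMs, long overallMs, ExecutorService pool) {
        Map<String, CompletableFuture<String>> futures = new LinkedHashMap<>();
        calls.forEach((name, call) -> futures.put(name,
                CompletableFuture.supplyAsync(call, pool)
                        .orTimeout(perCallMs, TimeUnit.MILLISECONDS)   // per-call timeout
                        .exceptionally(ex -> "ERROR:" + rootName(ex))));
        CompletableFuture<Void> all =
                CompletableFuture.allOf(futures.values().toArray(new CompletableFuture<?>[0]));
        try {
            all.get(overallMs, TimeUnit.MILLISECONDS);                 // overall deadline
        } catch (TimeoutException | ExecutionException e) {
            // Deadline hit: unfinished calls are reported below.
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        Map<String, String> out = new LinkedHashMap<>();
        futures.forEach((name, f) ->
                out.put(name, f.isDone() ? f.join() : "ERROR:DeadlineExceeded"));
        return out;
    }

    /** Unwraps the CompletionException that supplyAsync adds around a thrown exception. */
    static String rootName(Throwable ex) {
        Throwable root = (ex instanceof CompletionException && ex.getCause() != null)
                ? ex.getCause() : ex;
        return root.getClass().getSimpleName();
    }
}

class FanOutDemo {
    /** Simulated slow upstream: sleeps, then returns the value. */
    static Supplier<String> slow(long ms, String value) {
        return () -> {
            try { Thread.sleep(ms); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            return value;
        };
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        Map<String, Supplier<String>> calls = new LinkedHashMap<>();
        calls.put("A", () -> "a-data");
        calls.put("B", slow(500, "b-data"));                            // exceeds the 100 ms per-call timeout
        calls.put("C", () -> { throw new IllegalStateException("boom"); });
        System.out.println(FanOut.callAll(calls, 100, 1000, pool));
        pool.shutdownNow();
    }
}
```

Note that `orTimeout` only completes the future exceptionally; it does not stop the underlying task. A real HTTP client should also carry its own request timeout so that the connection and thread are actually released.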

Requirements

  1. Endpoint
    • Expose one endpoint (e.g., GET /aggregate) that returns a combined JSON response from three upstream services: A, B, and C.
  2. Concurrency
    • Call the three upstream HTTP APIs in parallel using futures/promises.
  3. Timeouts
    • Per-call timeout for each upstream request.
    • Overall request timeout (deadline) for the whole aggregation request.
  4. Policy
    • Configurable policy to determine behavior:
      • WAIT_ALL: wait for all upstreams (up to the overall deadline), then return partial data with defaults if some fail.
      • FAIL_FAST: fail the overall request as soon as any upstream fails or times out.
  5. Retries
    • Implement retries with capped exponential backoff and jitter via a reusable RetryTemplate that accepts a Callable.
  6. Partial Failure Handling
    • When some upstreams fail, return partial data along with default values and error details.
  7. Observability
    • Structured logging with correlation IDs.
    • Metrics (latency, success/fail counts, timeouts, retries).
  8. Deliverables
    • Interface definitions for clients, retry template, and service layer.
    • Concurrency flow description.
    • Sample error-handling logic and example responses.
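
The policy and partial-failure requirements above might be handled along the following lines. This is a sketch under stated assumptions: `Policy`, `AggregateResult`, `Aggregator`, and `describe` are hypothetical names, upstream responses are plain strings, and FAIL_FAST is signalled by throwing `IllegalStateException` (a real service would probably map this to a specific error response). It assumes Java 9 or later for `CompletableFuture.failedFuture` and `Map.of`.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.*;

enum Policy { WAIT_ALL, FAIL_FAST }

/** Hypothetical response shape: merged data plus per-upstream error details. */
final class AggregateResult {
    final Map<String, String> data = new LinkedHashMap<>();
    final Map<String, String> errors = new LinkedHashMap<>();
}

final class Aggregator {
    static AggregateResult aggregate(Map<String, CompletableFuture<String>> upstreams,
                                     Map<String, String> defaults,
                                     Policy policy, long overallMs) {
        CompletableFuture<?>[] all = upstreams.values().toArray(new CompletableFuture<?>[0]);
        CompletableFuture<Void> done = CompletableFuture.allOf(all);
        if (policy == Policy.FAIL_FAST) {
            // Short-circuit: the first upstream failure completes `done` immediately.
            for (CompletableFuture<?> f : all) {
                f.whenComplete((v, ex) -> { if (ex != null) done.completeExceptionally(ex); });
            }
        }
        try {
            done.get(overallMs, TimeUnit.MILLISECONDS); // overall deadline
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while aggregating", e);
        } catch (ExecutionException | TimeoutException e) {
            if (policy == Policy.FAIL_FAST) {
                for (CompletableFuture<?> f : all) f.cancel(true); // abandon remaining work
                throw new IllegalStateException("aggregation failed fast: " + describe(e), e);
            }
            // WAIT_ALL: fall through and substitute defaults below.
        }
        AggregateResult result = new AggregateResult();
        upstreams.forEach((name, f) -> {
            if (f.isDone() && !f.isCompletedExceptionally()) {
                result.data.put(name, f.join());
            } else {
                result.data.put(name, defaults.get(name)); // default value for the failed upstream
                result.errors.put(name, f.isDone()
                        ? f.handle((v, ex) -> describe(ex)).join()
                        : "deadline exceeded");
                f.cancel(true);
            }
        });
        return result;
    }

    /** Strips CompletionException/ExecutionException wrappers and names the root failure. */
    static String describe(Throwable t) {
        while ((t instanceof CompletionException || t instanceof ExecutionException)
                && t.getCause() != null) {
            t = t.getCause();
        }
        return t.getClass().getSimpleName();
    }
}

class AggregatorDemo {
    public static void main(String[] args) {
        Map<String, CompletableFuture<String>> upstreams = new LinkedHashMap<>();
        upstreams.put("A", CompletableFuture.completedFuture("a-data"));
        upstreams.put("B", CompletableFuture.failedFuture(new IllegalStateException("boom")));
        upstreams.put("C", new CompletableFuture<>()); // never completes: simulates a hung upstream
        Map<String, String> defaults = Map.of("B", "b-default", "C", "c-default");
        AggregateResult r = Aggregator.aggregate(upstreams, defaults, Policy.WAIT_ALL, 50);
        System.out.println("data=" + r.data + " errors=" + r.errors);
    }
}
```

Under WAIT_ALL the caller receives every key, with defaults standing in for failed upstreams and the `errors` map explaining why. Under FAIL_FAST the same inputs raise an exception as soon as B's failure is observed, without waiting for C.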
