What difficulty level is this interview question?

This is a medium difficulty System Design question, commonly asked during Technical Screen rounds at Databricks.

What role is this question designed for?

This question is commonly asked for Software Engineer candidates at Databricks during technical interviews.

Design a Book Price Aggregator | Databricks Interview Question

Q: Design a Book Price Aggregator

This question evaluates skills in distributed systems design, fault tolerance, scalability, integration with external services, transactional consistency across asynchronous operations, and reliability patterns such as caching, request coalescing, and failure isolation.

Design a book purchasing marketplace where your service acts as an intermediary between customers and hundreds of partner bookstores. Your service holds no inventory of its own; for every request it queries the partner stores live, places the order on whichever store wins, and runs the customer's payment itself.

A customer submits a single purchase request containing:

An ISBN (the book to buy)
A bid price — the maximum price the customer is willing to pay
A payment method

On each request, your system must fan out asynchronous requests to hundreds of partner bookstores to discover which stores have the book in stock and at what price, then apply the following rules:

If at least one bookstore has inventory and the lowest available price is ≤ the customer's bid price, place an order with that bookstore (and charge the customer).
If bookstores have inventory but the lowest available price is higher than the bid price, return that lowest price to the customer (no order placed).
If no bookstore has inventory, notify the customer that the book is unavailable.
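
The three rules above reduce to a small pure function over whatever quotes came back. The following is a minimal sketch, not a prescribed solution: the `Quote` record and the decision names (`PLACE_ORDER`, `PRICE_TOO_HIGH`, `UNAVAILABLE`) are hypothetical, and prices are assumed to be integer cents to avoid float rounding.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Quote:
    store_id: str       # hypothetical partner identifier
    in_stock: bool
    price_cents: int    # integer cents (assumption), avoids float rounding


def decide(quotes: Iterable[Quote], bid_cents: int) -> Tuple[str, Optional[Quote]]:
    """Apply the three rules to the quotes that were collected."""
    available = [q for q in quotes if q.in_stock]
    if not available:
        return "UNAVAILABLE", None
    best = min(available, key=lambda q: q.price_cents)
    if best.price_cents <= bid_cents:
        return "PLACE_ORDER", best      # caller places the order and charges
    return "PRICE_TOO_HIGH", best       # caller returns best.price_cents
```

Keeping the rule free of I/O makes it easy to test separately from the fan-out and payment machinery.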

Walk through the architecture end to end, then go deep on the parts below. Treat each part as a distinct design concern; a strong session covers the read/aggregation path, the failure model, the scaling levers, and the money-correctness path with roughly equal rigor.

Constraints & Assumptions

State your own numbers explicitly and design against them; the interviewer cares more about consistent reasoning than any specific figure. Reasonable anchors to assume unless told otherwise:

Scale: hundreds of partner stores (assume ~500); a peak on the order of hundreds to thousands of purchase requests/sec. Note that naive fan-out multiplies these: request rate × store count is the number of outbound partner calls per second (for example, 1,000 requests/sec × 500 stores = 500,000 calls/sec), and that is the load you must drive down.
Latency: a customer cannot wait on the slowest of 500 stores. Assume a hard global deadline for a synchronous answer (state the target you pick, e.g. on the order of ~1 s) and design to act on partial results.
Workload skew: ISBN popularity is heavily skewed (a small set of bestsellers dominates traffic) — call out whether you exploit this.
Partner reality: individual stores will time out, return 5xx, rate-limit you, or return malformed data; price and stock change over time; some stores support idempotency keys on orders and some do not.
Money: placing an order and charging the customer are two separate external side effects with no shared transaction across your DB, the payment provider, and the bookstore.
Quantity: assume one book per request (qty 1) unless you call out multi-line carts as an extension.
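
To make the fan-out multiplication concrete, here is a back-of-envelope sketch. The request rate, the cache hit ratio, and the number of stores queried per request are illustrative assumptions, not figures from the problem statement, and the model is deliberately crude (it treats the hit ratio as uniform across partner lookups).

```python
def outbound_calls_per_sec(requests_per_sec: float, stores_queried: int,
                           cache_hit_ratio: float = 0.0) -> float:
    """Partner calls/sec when only cache misses reach the stores."""
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be in [0, 1]")
    return requests_per_sec * (1.0 - cache_hit_ratio) * stores_queried


# Every request hits every store, nothing cached.
naive = outbound_calls_per_sec(1_000, 500)
# Assumed: 75% of lookups served from cache, 50 ranked stores per request.
reduced = outbound_calls_per_sec(1_000, 50, 0.75)
print(naive, reduced)
```

Under these assumed numbers the load drops from 500,000 to 12,500 partner calls per second, which is the kind of reduction the caching and partner-selection levers are meant to deliver.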

Clarifying Questions to Ask

A strong candidate scopes the problem before drawing boxes. Good questions to raise with the interviewer:

Is the customer answer synchronous (block ~1 s for a result) or asynchronous (accept now, notify later)? How does the API shape change if both must be supported?
Is the quoted "lowest price" a hard contract (we are bound to honor whatever we quote) or best-effort? This decides how fresh quotes must be and whether price is re-confirmed at order time.
What is the acceptable global latency budget, and is it acceptable to answer from a partial search of stores when not all respond in time?
What are our contractual rate limits / quotas with partner stores (i.e. how hard can we hammer them before we get banned)?
Is it ever required to reserve scarce inventory before payment, or is querying live price/stock at decision time sufficient?
Single book per request, or do we need multi-item carts?

Part 1 — End-to-end architecture & API

Lay out the major components and the request lifecycle from "customer submits a request" to "terminal status returned." Define the request/response contract, including how the customer learns the outcome (ORDER_PLACED / price-too-high / unavailable / pending) and how a client retry is handled.

What This Part Should Cover

Component inventory & lifecycle: the major services named (gateway/intent → quote orchestrator → partner adapters → purchase workflow → notification) and a coherent request flow through them, not a flat box-and-arrow dump.
Two-path separation surfaced at the API: the read/aggregation path and the write/money path appear as distinct concerns in both the components and the contract.
Contract completeness: the request/response carries every terminal status, a re-bid price for the price-too-high case, and a coverage/metadata field so the UI can't overstate search completeness.
Retry semantics: a client-supplied idempotency key collapses a re-POST onto the same intent rather than starting a second workflow; sync-vs-async outcome delivery (poll/notify) is addressed.
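
One way to picture the contract and the retry rule is the sketch below. The field names, the status strings, and the `IntentStore` class are hypothetical; the in-memory dict stands in for a durable table with a unique index on the idempotency key.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class PurchaseResponse:
    intent_id: str
    status: str                               # PENDING | ORDER_PLACED | PRICE_TOO_HIGH | UNAVAILABLE
    lowest_price_cents: Optional[int] = None  # set for PRICE_TOO_HIGH so the client can re-bid
    stores_queried: int = 0                   # coverage metadata for the UI
    stores_responded: int = 0


class IntentStore:
    """In-memory stand-in for a table with a unique index on idempotency_key."""

    def __init__(self) -> None:
        self._by_key: Dict[str, PurchaseResponse] = {}

    def submit(self, idempotency_key: str, isbn: str, bid_cents: int) -> PurchaseResponse:
        # A real service would also persist isbn, bid, and payment method here.
        existing = self._by_key.get(idempotency_key)
        if existing is not None:
            return existing                   # a retried POST maps onto the same intent
        resp = PurchaseResponse(intent_id=str(uuid.uuid4()), status="PENDING")
        self._by_key[idempotency_key] = resp
        return resp
```

A client that times out and re-POSTs with the same key gets the same `intent_id` back and can poll it, rather than starting a second purchase workflow.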

Part 2 — Fan-out, deadlines & result aggregation

Describe how you query many unreliable stores in parallel and produce a decision under a hard global deadline. How do per-call timeouts relate to the global deadline? When do you stop collecting responses? How do you aggregate the responses into the order / price-too-high / unavailable decision, and how do you avoid claiming "unavailable" when stores simply didn't answer in time?

What This Part Should Cover

Deadline discipline: per-call timeouts strictly inside the global deadline, with explicit reasoning for why one slow store can never burn the whole budget.
Stop condition: a clear rule for when to stop collecting (deadline / all returned / justified early-stop), and an explicit early-stop-vs-wait tradeoff for "true minimum price."
Aggregation state: a running-min over valid, in-stock quotes plus coverage counters, and the mapping from aggregate outcome to the order / price-too-high / unavailable decision.
Partial-result honesty: "unavailable" qualified as "unavailable among responders," so incomplete coverage is never reported as a definitive no.
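
The points above can be sketched with `asyncio`. This is an illustration under assumptions rather than a reference design: the adapter signature `QuoteFn`, the `Aggregate` record, and the outcome strings are hypothetical, and the timeout values are left to the caller.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Dict, Optional, Tuple

# Hypothetical adapter signature: isbn -> (in_stock, price_cents).
QuoteFn = Callable[[str], Awaitable[Tuple[bool, int]]]


@dataclass
class Aggregate:
    queried: int = 0
    responded: int = 0
    best_price_cents: Optional[int] = None
    best_store: Optional[str] = None


async def fan_out(isbn: str, stores: Dict[str, QuoteFn],
                  per_call_timeout: float, global_deadline: float) -> Aggregate:
    agg = Aggregate(queried=len(stores))

    async def ask(store_id: str, fn: QuoteFn) -> None:
        try:
            in_stock, price = await asyncio.wait_for(fn(isbn), per_call_timeout)
        except Exception:
            return                       # timeout or partner error: no response counted
        agg.responded += 1
        if in_stock and (agg.best_price_cents is None or price < agg.best_price_cents):
            agg.best_price_cents, agg.best_store = price, store_id   # running minimum

    tasks = [asyncio.create_task(ask(s, fn)) for s, fn in stores.items()]
    if tasks:
        _, pending = await asyncio.wait(tasks, timeout=global_deadline)
        for t in pending:
            t.cancel()                   # stop collecting at the global deadline
        await asyncio.gather(*pending, return_exceptions=True)
    return agg


def outcome(agg: Aggregate, bid_cents: int) -> str:
    if agg.best_price_cents is not None:
        return "PLACE_ORDER" if agg.best_price_cents <= bid_cents else "PRICE_TOO_HIGH"
    # Only claim a definitive "no" when every queried store actually answered.
    return "UNAVAILABLE" if agg.responded == agg.queried else "UNAVAILABLE_AMONG_RESPONDERS"
```

The per-call timeout bounds each store individually, while the global deadline bounds the whole collection step, so a single hung partner costs at most its own slot.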

Part 3 — Tolerating downstream failures (timeouts, circuit breakers, bulkheads)

With hundreds of partners, several are unhealthy at any moment. Detail your layered defenses: per-call timeouts, when (if ever) to retry, and how circuit breakers and bulkheads keep one bad partner from degrading the whole request or exhausting your resources.

What This Part Should Cover

Layered defenses: per-call timeout, breaker, and bulkhead presented as distinct layers (call → partner → whole-system), not a single mechanism.
Breaker semantics: the closed → open → half-open lifecycle, the signals that trip it (timeout/error/429/latency), and the win of skipping an open partner instantly instead of eating a timeout.
Isolation & backpressure: per-partner pools so a hang is contained, plus adaptive concurrency / graceful degradation when a partner or internal queue saturates.
Disciplined retries: retries confined to idempotent, transient quote reads, bounded and jittered, never blind-retrying into an already-slow store.
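
The breaker lifecycle described above can be sketched as a small per-partner state machine. The class name, the consecutive-failure threshold, and the cool-down period are assumptions for illustration; production breakers often trip on error rate over a window instead.

```python
import time
from typing import Callable


class CircuitBreaker:
    """Per-partner breaker: CLOSED -> OPEN -> HALF_OPEN -> CLOSED."""

    def __init__(self, failure_threshold: int = 5, open_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.open_seconds = open_seconds
        self.clock = clock
        self.state = "CLOSED"
        self.failures = 0
        self.opened_at = 0.0

    def allow(self) -> bool:
        """Return True if a call may be sent to this partner right now."""
        if self.state == "OPEN":
            if self.clock() - self.opened_at < self.open_seconds:
                return False             # skip instantly instead of eating a timeout
            self.state = "HALF_OPEN"     # cool-down elapsed: let one probe through
            return True
        if self.state == "HALF_OPEN":
            return False                 # a probe is already in flight
        return True

    def record_success(self) -> None:
        self.state, self.failures = "CLOSED", 0

    def record_failure(self) -> None:
        """Call on timeout, 5xx, 429, or a latency-budget breach."""
        self.failures += 1
        if self.state == "HALF_OPEN" or self.failures >= self.failure_threshold:
            self.state, self.opened_at = "OPEN", self.clock()
```

The fan-out layer would call `allow()` before dispatching to a partner and report the result afterwards; a partner skipped by an open breaker counts as "did not respond" in the coverage metadata.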

Part 4 — Scaling: caching, TTLs, coalescing & thundering-herd protection

The naive fan-out (requests × stores) is not survivable and would get you rate-limited or banned. Explain how you drive that number down. Cover your cache key and TTL policy (including negative results and partner errors), request coalescing, thundering-herd protection on cache expiry, and how you decide which stores to query.

What This Part Should Cover

Quantified problem framing: the requests × stores fan-out stated as a number, then driven down with named levers (the candidate should know which lever dominates).
Cache & TTL policy: an (ISBN, partner_id) key, freshness-vs-load TTL reasoning, and explicit special-casing of negative results and partner-error signals.
Coalescing & herd defenses: single-flight on concurrent same-ISBN requests, plus stale-while-revalidate and TTL jitter to survive hot-key expiry.
Partner selection & quota respect: ranking candidates per ISBN instead of querying all 500, and multi-layer rate limiting that honors partner contractual quotas.
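
A minimal sketch of the cache-plus-coalescing idea follows. The `QuoteCache` class, its TTL values, and the convention that `None` means "no stock" (a negative result with a shorter TTL) are all assumptions; partner errors are simply not cached here, and stale-while-revalidate is omitted for brevity.

```python
import asyncio
import random
import time
from typing import Any, Awaitable, Callable, Dict, Tuple

Key = Tuple[str, str]   # (isbn, partner_id)


class QuoteCache:
    def __init__(self, ttl: float = 30.0, negative_ttl: float = 5.0, jitter: float = 0.2,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl, self.negative_ttl, self.jitter, self.clock = ttl, negative_ttl, jitter, clock
        self._entries: Dict[Key, Tuple[float, Any]] = {}   # key -> (expires_at, value)
        self._inflight: Dict[Key, "asyncio.Future[Any]"] = {}

    def _expiry(self, value: Any) -> float:
        base = self.negative_ttl if value is None else self.ttl
        # Jitter spreads expiries so hot keys do not all expire together.
        return self.clock() + base * (1.0 + random.uniform(-self.jitter, self.jitter))

    async def _load(self, key: Key, fetch: Callable[[], Awaitable[Any]]) -> Any:
        try:
            value = await fetch()
            self._entries[key] = (self._expiry(value), value)
            return value
        finally:
            self._inflight.pop(key, None)

    async def get(self, key: Key, fetch: Callable[[], Awaitable[Any]]) -> Any:
        hit = self._entries.get(key)
        if hit is not None and hit[0] > self.clock():
            return hit[1]
        task = self._inflight.get(key)
        if task is None:                 # single-flight: first caller starts the fetch
            task = asyncio.ensure_future(self._load(key, fetch))
            self._inflight[key] = task
        # shield: cancelling one waiter must not cancel the shared fetch.
        return await asyncio.shield(task)
```

With this shape, a burst of concurrent requests for the same bestseller at the same partner results in one outbound call, and the remaining callers wait on its result.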

Part 5 — Money correctness: order-vs-charge consistency, crash recovery & duplicate prevention

This is the hardest part. Placing the order (call to the store) and charging (call to the payment provider) are two independent external side effects with no shared transaction. Decide whether to order first or charge first, justify it, and explain how you recover from a mid-flight crash and prevent duplicate charges or duplicate orders.

What This Part Should Cover

Workflow durability: the purchase modeled as a persisted saga / state machine, with each transition written before the next external call so a dead worker resumes from a known state.
Justified ordering: an explicit order-vs-charge decision that makes "money gone, no book" unreachable by construction — typically authorize → order → capture, with charge-first and order-first rejected for stated reasons.
Idempotency & DB invariants: deterministic idempotency keys on each external call plus DB uniqueness / compare-and-swap transitions so retries and concurrent workers can't double-charge or double-order.
Crash recovery & the non-idempotent store: an outbox + reconciliation sweeper, and a concrete plan to discover whether an order landed (lookup by reference id) when the store offers no idempotency key.
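
The durability and duplicate-prevention points can be illustrated with a tiny state machine. This is a sketch under assumptions: the state names follow the authorize → order → capture ordering mentioned above, `IntentTable` is an in-memory stand-in for a database row guarded by a version column, and the key derivation is one possible scheme rather than a required one.

```python
import hashlib
from typing import Dict, Tuple

# Allowed transitions for the authorize -> order -> capture flow (assumed names).
TRANSITIONS = {
    "CREATED": {"AUTHORIZED", "FAILED"},
    "AUTHORIZED": {"ORDER_PLACED", "VOIDED"},
    "ORDER_PLACED": {"CAPTURED"},
}


def idempotency_key(intent_id: str, step: str) -> str:
    """Deterministic key: a retry of the same step always sends the same key."""
    return hashlib.sha256(f"{intent_id}:{step}".encode()).hexdigest()[:32]


class IntentTable:
    """In-memory stand-in for a DB row updated with compare-and-swap."""

    def __init__(self) -> None:
        self._rows: Dict[str, Tuple[str, int]] = {}   # intent_id -> (state, version)

    def create(self, intent_id: str) -> None:
        if intent_id in self._rows:
            raise KeyError("duplicate intent")        # models a unique constraint
        self._rows[intent_id] = ("CREATED", 0)

    def read(self, intent_id: str) -> Tuple[str, int]:
        return self._rows[intent_id]

    def transition(self, intent_id: str, expected_version: int, new_state: str) -> bool:
        """Like UPDATE ... WHERE version = expected; False means another worker won."""
        state, version = self._rows[intent_id]
        if version != expected_version or new_state not in TRANSITIONS.get(state, set()):
            return False
        self._rows[intent_id] = (new_state, version + 1)
        return True
```

A worker reads the row, performs the external call with the deterministic key for that step, then attempts the transition; if the transition returns `False`, another worker already advanced the intent and this one stops without repeating the side effect.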

What a Strong Answer Covers

These dimensions span all parts — the interviewer is listening for them across the whole session, not inside any single Part:

Requirements & scope: functional rules restated cleanly, plus explicit non-functional targets (latency budget, fault tolerance, money correctness, partner protection) and a rough capacity estimate that frames the fan-out problem.
Clean decomposition: a deliberate split between a latency-bounded, cache-heavy read/aggregation path and a strictly-correct write/money path, sustained consistently across every part.
Tradeoffs named out loud: what strains first, and the freshness-vs-load and latency-vs-completeness dials, with the candidate choosing a point rather than hand-waving.
Observability: the metrics and dashboards you would page on (fan-out size, cache/coalescing hit rate, per-partner health, breaker transitions, % partial results, payment/order success rates, stuck-intent reconciliation).

Follow-up Questions

How does the design change at 100x request volume, or when the partner count grows from hundreds to tens of thousands of stores? What breaks first, and which lever do you pull?
If the quoted price becomes a hard contractual obligation (you must honor whatever you quote), how do caching and the order step change?
A partner store does not support order idempotency keys and your worker crashes after dispatching an order but before recording it — how do you avoid a duplicate order without a guaranteed-exactly-once call?
Suppose inventory for some titles is genuinely scarce and must be reserved before payment. How do the consistency model and the order-vs-charge decision change?
