What difficulty level is this interview question?

This is an easy-difficulty System Design question, commonly asked during Technical Screen rounds at OpenAI.

What role is this question designed for?

This question is commonly asked for Software Engineer candidates at OpenAI during technical interviews.

Design a Payment System | OpenAI Interview Question

Q: Design a Payment System

This system design question evaluates a candidate's ability to architect a reliable backend payment flow, including idempotent request handling, transactional data modeling, and reconciliation with external processors. It is commonly asked to assess practical distributed systems skills around failure handling, consistency, and scalability under real-world conditions, testing applied architectural reasoning rather than pure theory.

Design a Payment System

Design the backend payment system for an online platform that charges customers for goods and services (for example, a marketplace checkout or a usage-based SaaS/API billing product). Customers pay with credit and debit cards, and the actual money movement is performed by external Payment Service Providers (PSPs) such as Stripe or Adyen rather than by your system directly.

Your system is responsible for orchestrating a payment from "customer clicks Pay" through to a confirmed, recorded transaction: calling the PSP, recording an authoritative internal record of every charge, guaranteeing that a customer is never double-charged on retries, supporting refunds, and reconciling your records against what the PSP reports actually happened. Walk through the end-to-end design: the high-level architecture, the core data model, the read and write paths, how you handle failures and retries, and how the system scales.
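
The "never double-charged on retries" requirement can be illustrated with a minimal sketch. It assumes the client sends an idempotency key with every payment request and that the PSP also accepts such a key; `FakePSP`, `PaymentService`, and the in-memory dictionaries are hypothetical stand-ins for a real PSP client and a durable database, not part of the original question.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class FakePSP:
    """Hypothetical stand-in for an external PSP that honors idempotency keys."""
    charges: dict = field(default_factory=dict)  # idempotency_key -> psp_charge_id

    def charge(self, idempotency_key: str, amount_minor: int, currency: str) -> str:
        # Seeing the same key again returns the original charge, not a new one.
        if idempotency_key not in self.charges:
            self.charges[idempotency_key] = f"ch_{uuid.uuid4().hex[:8]}"
        return self.charges[idempotency_key]


@dataclass
class PaymentService:
    psp: FakePSP
    payments: dict = field(default_factory=dict)  # idempotency_key -> record

    def pay(self, idempotency_key: str, amount_minor: int, currency: str) -> dict:
        record = self.payments.get(idempotency_key)
        if record is None:
            # Record the intent BEFORE calling the PSP so a crash leaves a trace.
            record = {"amount_minor": amount_minor, "currency": currency,
                      "status": "PENDING", "psp_charge_id": None}
            self.payments[idempotency_key] = record
        elif (record["amount_minor"], record["currency"]) != (amount_minor, currency):
            raise ValueError("idempotency key reused with different parameters")
        elif record["status"] == "SUCCEEDED":
            return record  # retry of a finished payment: replay the stored result
        # First attempt, or retry of a PENDING payment: the same key goes to the PSP.
        record["psp_charge_id"] = self.psp.charge(idempotency_key, amount_minor, currency)
        record["status"] = "SUCCEEDED"
        return record


service = PaymentService(FakePSP())
first = service.pay("order-42", 1999, "USD")
retry = service.pay("order-42", 1999, "USD")  # e.g. the client timed out and retried
```

In this sketch the retry returns the stored record and the PSP sees exactly one charge. A production design would back `payments` with a table that has a unique constraint on the idempotency key.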

Constraints & Assumptions

State your own numbers, but a reasonable scoping is:

~10M customers; peak ~1,000 payment requests/second (with bursty spikes, e.g., a sale or billing run), averaging far lower.
Money is moved by one or more external PSPs over HTTPS; PSP calls have p99 latency in the hundreds of milliseconds and can time out.
Correctness dominates latency: a payment may take a second or two, but it must never double-charge and must never lose a successful charge.
Multiple currencies; refunds (full and partial) are required; chargebacks/disputes exist but can be handled out of band.
The system must produce an auditable financial record and be able to reconcile with PSP settlement reports.
Out of scope (call this out): card-number storage and PCI scope (delegated to the PSP via tokenization), fraud scoring, and tax computation.
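
As one possible reading of the refund and audit constraints above, the sketch below keeps an append-only ledger in integer minor units (for example cents) and rejects any refund that exceeds what is still refundable. The names `LedgerEntry` and `Ledger` are illustrative assumptions, and a real system would persist entries in a database rather than a Python list.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LedgerEntry:
    payment_id: str
    kind: str          # "CAPTURE" or "REFUND"
    amount_minor: int  # integer minor units; floats are avoided for money
    currency: str


@dataclass
class Ledger:
    entries: list = field(default_factory=list)  # append-only: never updated in place

    def _entries(self, payment_id: str, kind: str) -> list:
        return [e for e in self.entries
                if e.payment_id == payment_id and e.kind == kind]

    def capture(self, payment_id: str, amount_minor: int, currency: str) -> None:
        self.entries.append(LedgerEntry(payment_id, "CAPTURE", amount_minor, currency))

    def refundable(self, payment_id: str) -> int:
        captured = sum(e.amount_minor for e in self._entries(payment_id, "CAPTURE"))
        refunded = sum(e.amount_minor for e in self._entries(payment_id, "REFUND"))
        return captured - refunded

    def refund(self, payment_id: str, amount_minor: int, currency: str) -> None:
        captures = self._entries(payment_id, "CAPTURE")
        if not captures or captures[0].currency != currency:
            raise ValueError("no capture for this payment in this currency")
        if amount_minor <= 0 or amount_minor > self.refundable(payment_id):
            raise ValueError("refund must be positive and within the refundable balance")
        self.entries.append(LedgerEntry(payment_id, "REFUND", amount_minor, currency))


ledger = Ledger()
ledger.capture("pay_1", 5000, "EUR")
ledger.refund("pay_1", 2000, "EUR")   # partial refund
ledger.refund("pay_1", 3000, "EUR")   # remainder, making it a full refund
```

Because refunds are new entries rather than edits to the capture, the history stays auditable and can be compared line by line against PSP settlement reports.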

Clarifying Questions to Ask

What are we processing — one-time checkouts, recurring subscriptions, usage-based metered billing, or marketplace payouts to third parties? Each changes the data model and money-movement direction.
Do we integrate a single PSP or must we route across several (for redundancy, cost, or geographic coverage)?
What is the consistency/latency expectation at checkout — must the user see a final "paid" result synchronously, or is an async "we'll confirm shortly" acceptable?
Who owns card data and PCI compliance? Are we tokenizing through the PSP so raw PANs never touch our servers?
What are the refund and dispute requirements, and do we need a customer-facing or finance-facing ledger/reporting view?
What regulatory/audit constraints apply (immutability of records, data residency, retention)?

Follow-up Questions

A PSP call times out and you never get a webhook. Walk through exactly how your system converges to the correct state and how long that takes. What does the customer see in the meantime?
How do you guarantee exactly-once effect on the customer's card given that the network gives you at-least-once delivery on both your retries and the PSP's webhooks?
How would you extend the design to route across multiple PSPs (failover when one is down, or cost-based routing) without breaking idempotency or the ledger?
How do you support recurring/subscription billing and dunning (retrying failed renewals) on top of this core?
How do you handle a chargeback/dispute weeks after the original payment, and how is that reflected in the ledger?
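
The first two follow-ups (a timed-out PSP call with no webhook, and at-least-once webhook delivery) can be explored with a small sketch. It assumes a timed-out payment is parked in an `UNKNOWN` state instead of being marked failed, that a background job polls the PSP for unresolved payments, and that each webhook carries a unique event id. `PaymentStore` and the `lookup` callable are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field

TERMINAL = {"SUCCEEDED", "FAILED"}


@dataclass
class PaymentStore:
    status: dict = field(default_factory=dict)     # payment_id -> status
    seen_events: set = field(default_factory=set)  # webhook event ids already processed

    def mark_unknown(self, payment_id: str) -> None:
        # A timeout means the outcome is unknown, not that the charge failed.
        self.status[payment_id] = "UNKNOWN"

    def _settle(self, payment_id: str, outcome: str) -> None:
        # Terminal states are never overwritten; a conflicting late signal
        # would be flagged for review in a real system.
        if self.status.get(payment_id) not in TERMINAL:
            self.status[payment_id] = outcome

    def apply_webhook(self, event_id: str, payment_id: str, outcome: str) -> bool:
        """Return False for a duplicate delivery, True if the event was processed."""
        if event_id in self.seen_events:
            return False
        self.seen_events.add(event_id)
        self._settle(payment_id, outcome)
        return True

    def reconcile(self, lookup) -> None:
        """Ask the PSP (via the hypothetical `lookup`) about unresolved payments."""
        for payment_id, status in list(self.status.items()):
            if status not in TERMINAL:
                outcome = lookup(payment_id)  # "SUCCEEDED", "FAILED", or None
                if outcome in TERMINAL:
                    self._settle(payment_id, outcome)


store = PaymentStore()
store.mark_unknown("pay_1")                    # PSP call timed out
store.reconcile(lambda pid: None)              # PSP has no answer yet
store.reconcile(lambda pid: "SUCCEEDED")       # a later poll resolves it
duplicate = store.apply_webhook("evt_1", "pay_1", "SUCCEEDED")
replayed = store.apply_webhook("evt_1", "pay_1", "SUCCEEDED")
```

Under these assumptions the time to converge is bounded by the polling interval of the reconciliation job, and the customer would see a "processing" state until the payment leaves `UNKNOWN`.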
