System Design: Scheduled Payment Service
Context
You are designing a backend service that allows end users to:
- Schedule a payment for a future date/time.
- Query the status of any scheduled payment.
- Cancel a payment that has not yet executed.
Assume payments are processed through an external payment gateway, and the system must work reliably at scale across multiple regions.
Requirements
Design the system and cover the following areas:
- APIs
  - Specify REST and/or gRPC APIs for:
    - Schedule a payment
    - Query payment status
    - Cancel a pending payment
  - Include request/response schema, idempotency, and error handling.
- Data Model and Indexing
  - Propose tables/collections and key fields.
  - Show indexes required for scheduling queries, user lookups, and idempotency.
- Scheduling and Execution
  - Describe how jobs are scheduled and executed (e.g., scheduler, message queue, worker fleet).
  - Explain the flow from “scheduled” to “executed”, including the role of delay queues and worker coordination.
- Idempotency and Deduplication
  - Prevent duplicate creation of the same scheduled payment.
  - Ensure exactly-once semantics for payment submission to the gateway.
- Time Semantics
  - Ordering and time accuracy requirements.
  - Time zones, daylight saving transitions, and clock drift handling.
- Failure Handling
  - Retries with exponential backoff and jitter.
  - Dead-letter queues and operational procedures.
- Consistency and Transactions
  - Guarantees when interacting with the payment gateway.
  - State transitions and transactional boundaries.
- Security and Compliance
  - PCI considerations, encryption, and PII handling.
  - AuthN/Z for APIs and secrets management.
- Scale, Partitioning, and High Availability
  - Sharding/partitioning strategy and horizontal scaling of components.
  - Leader election, regional failover, and quorum concerns.
- Monitoring, Auditing, and Alerting
  - Metrics, logs, traces; audit trail design; alerting thresholds.
- Disaster Recovery
  - Backups, RPO/RTO targets, and regional recovery strategy.
- SLAs and Capacity Estimates
  - Availability, latency, accuracy SLAs.
  - Back-of-envelope capacity and cost drivers.
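
As one illustration of the level of detail expected under Failure Handling, the following is a minimal Python sketch of retries with exponential backoff and full jitter, ending in a dead-letter outcome once retries are exhausted. The names backoff_delays, submit_with_retries, and TransientGatewayError, and the default base/cap values, are assumptions made for this example only; they are not part of any real gateway SDK, and a real design would need to justify its own limits.

```python
import random
import time


class TransientGatewayError(Exception):
    """Hypothetical marker for gateway failures that are safe to retry."""


def backoff_delays(base_s=1.0, cap_s=300.0, max_retries=6, rng=None):
    """Yield one wait time in seconds per retry, using full jitter."""
    rng = rng or random.Random()
    for attempt in range(max_retries):
        # Exponential ceiling: base * 2^attempt, clamped to the cap.
        ceiling = min(cap_s, base_s * (2 ** attempt))
        # Full jitter: draw uniformly from [0, ceiling] so that many
        # workers retrying at once do not hit the gateway in lockstep.
        yield rng.uniform(0.0, ceiling)


def submit_with_retries(submit, delays, sleep=time.sleep):
    """Call submit() until it succeeds or the delays run out.

    Assumption: submit() sends the same idempotency key on every attempt,
    so a retry after a timeout cannot charge the user twice.
    Returns ("succeeded", result) or ("dead_letter", last_error).
    """
    delay_iter = iter(delays)
    while True:
        try:
            return ("succeeded", submit())
        except TransientGatewayError as exc:
            last_error = exc
        delay = next(delay_iter, None)
        if delay is None:
            # Retries exhausted: hand the payment to the dead-letter path.
            return ("dead_letter", last_error)
        sleep(delay)
```

A worker might call submit_with_retries(lambda: gateway_call(payment), backoff_delays()), where gateway_call is a placeholder for the actual submission. Non-transient errors (for example, a declined card) are deliberately not caught here, since retrying them would not help.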