System Design: Multi-Channel Notification Platform
Context
Design a production-grade notification platform that sends messages via email, SMS, and push for both transactional and campaign use cases. The system must support high scale, multi-tenancy, compliance, and multi-region high availability.
Functional Requirements
- Public APIs to publish notifications (immediate) and schedule future sends.
- Preference and subscription management per user and per channel.
- Idempotency and deduplication to avoid duplicate sends.
- Per-user and per-channel rate limiting and quotas (also per-tenant).
- Retries with exponential backoff and jitter; dead-letter handling.
- Scheduling: immediate and future (one-time and recurring) sends.
- Fan-out to millions of users (campaigns) with audience targeting.
- Provider selection and failover across multiple vendors (email/SMS/push).
- Delivery status tracking and webhook ingestion from providers.
- GDPR/CCPA-compliant opt-outs, consent, suppression, and deletion.
- Multi-tenant isolation (data, quotas, rate limits, RBAC, keys).
- Observability: metrics, traces, logs, and audit trails.
- Multi-region, highly available, low-latency operations.
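As an illustration of the retry requirement above, here is a minimal sketch of exponential backoff with "full jitter" and a dead-letter hand-off. Everything named here is hypothetical and not part of the brief: the `TransientError` class, the `send` and `dead_letter` callables, and the defaults (`base=0.5` seconds, `cap=60.0` seconds, `max_attempts=5`). A real worker would more likely re-enqueue the message with a delay than sleep in-process.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for retryable provider failures (timeouts, 5xx)."""


def backoff_ceiling(attempt, base=0.5, cap=60.0):
    """Upper bound, in seconds, on the delay after failed attempt `attempt` (0-based)."""
    return min(cap, base * (2 ** attempt))


def send_with_retries(send, message, max_attempts=5, base=0.5, cap=60.0,
                      sleep=time.sleep, rng=random, dead_letter=None):
    """Call send(message), retrying on TransientError with full jitter.

    Returns True on success. Once max_attempts attempts have failed, the
    message is passed to dead_letter (if provided) and False is returned.
    """
    for attempt in range(max_attempts):
        try:
            send(message)
            return True
        except TransientError:
            if attempt == max_attempts - 1:
                break  # retries exhausted; fall through to dead-lettering
            # Full jitter: choose a delay uniformly in [0, ceiling] so that
            # many failing senders do not retry in lockstep.
            sleep(rng.uniform(0.0, backoff_ceiling(attempt, base, cap)))
    if dead_letter is not None:
        dead_letter(message)
    return False
```

The `sleep` and `rng` parameters are injectable so the retry schedule can be exercised in tests without real waiting. Non-retryable errors (for example, an invalid recipient) are deliberately not caught and would propagate to the caller.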
Non-Functional Requirements to Address
- High-level architecture and data model.
- Storage, queues/streams, workers and scaling choices.
- Ordering guarantees where required.
- Message semantics (exactly-once vs at-least-once) and idempotency.
- Sharding strategy and backpressure handling.
- Capacity estimates and SLAs/SLOs.
- Failure modes and load-testing strategies.
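To make the message-semantics item concrete, one common approach (an assumption here, not something the brief prescribes) is to accept at-least-once delivery from the queue and make the send step idempotent. The sketch below is illustrative only: the fields used to build the key, the `IdempotentSender` class, and the in-memory set are all hypothetical. A production system would need a shared store with an atomic put-if-absent operation and an expiry.

```python
import hashlib
import json


def idempotency_key(tenant_id, user_id, channel, template_id, client_key):
    """Derive a stable key for one logical send.

    The choice of fields is an assumption; JSON encoding keeps the field
    boundaries unambiguous before hashing.
    """
    raw = json.dumps([tenant_id, user_id, channel, template_id, client_key])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class IdempotentSender:
    """Wraps a send function so redelivered messages are skipped.

    The in-memory set is for illustration only and is neither durable nor
    shared across workers.
    """

    def __init__(self, send):
        self._send = send
        self._seen = set()

    def handle(self, key, message):
        """Send the message unless its key was already processed.

        Returns True if the message was sent, False if it was a duplicate.
        """
        if key in self._seen:
            return False  # duplicate delivery: skip the side effect
        self._send(message)
        # Recorded only after a successful send, so a failed send can be
        # retried. A crash between these two steps can still produce a
        # duplicate, which is why this is not true exactly-once.
        self._seen.add(key)
        return True
```

A short usage example: calling `handle` twice with the same key sends once.

```python
sent = []
sender = IdempotentSender(sent.append)
key = idempotency_key("tenant-1", "user-1", "email", "welcome", "req-1")
sender.handle(key, "hello")  # sent
sender.handle(key, "hello")  # skipped as a duplicate
```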