System Design: Multi-Channel Notifications (Email, SMS, Push, In‑App)
Context
Design a notification platform that reliably delivers messages across multiple channels (email, SMS, mobile push, in‑app) while honoring user preferences and regulatory constraints. The system should scale to 10M notifications/day, keep p99 processing latency under 2 seconds (measured from request intake to provider acceptance), support multi‑region reliability, and provide strong operational observability.
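As a quick sanity check on the scale target, the daily volume can be converted to a per-second rate. The burst factor k = 10 below is an assumption chosen for illustration; the brief does not specify a peak-to-average ratio.

```latex
\[
\bar{r} = \frac{10^{7}\ \text{notifications}}{86{,}400\ \text{s}} \approx 116\ \text{notifications/s},
\qquad
r_{\text{peak}} = k\,\bar{r} \approx 1{,}157\ \text{notifications/s} \quad (k = 10,\ \text{assumed})
\]
```

The average rate is modest, so under this assumption sizing is likely to be driven by bursts and by provider rate limits rather than by steady-state load.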
Requirements
- APIs and Contracts
  - Event production and consumption APIs (idempotent) for transactional and batch use cases.
  - User preference APIs: per‑channel, per‑category, quiet hours (with timezone), daily caps.
  - Templating and localization: variables, safe rendering, locale fallback.
  - Idempotency and de‑duplication keys with TTLs.
  - Scheduling (future send), retries with exponential backoff and jitter.
  - Rate limiting: global, per‑channel, per‑recipient throttling.
  - Delivery status tracking and unified state model.
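To make the "retries with exponential backoff and jitter" item concrete, here is a minimal sketch of one common variant, often called full jitter. The function name `backoff_delay` and the defaults (`base`, `cap`) are illustrative assumptions, not values taken from the brief.

```python
import random
from typing import Optional


def backoff_delay(
    attempt: int,
    base: float = 0.5,      # assumed first-retry ceiling, in seconds
    cap: float = 60.0,      # assumed maximum delay, in seconds
    rng: Optional[random.Random] = None,
) -> float:
    """Return a delay in seconds for a 0-based retry attempt (full jitter)."""
    if attempt < 0:
        raise ValueError("attempt must be >= 0")
    source = rng if rng is not None else random
    # Clamp the exponent so very large attempt counts cannot overflow a float.
    exponent = min(attempt, 32)
    # The exponential ceiling doubles per attempt and is capped at `cap`.
    ceiling = min(cap, base * (2 ** exponent))
    # Full jitter: draw uniformly in [0, ceiling] so retries from many
    # workers do not synchronize into a retry storm.
    return source.uniform(0.0, ceiling)
```

A worker would sleep for `backoff_delay(attempt)` (or reschedule the message with that delay) before retrying a failed provider call, and hand the message to a dead-letter queue once a maximum attempt count is reached.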
- Architecture & Components
  - Producer, dispatcher/orchestrator, queues/streams, worker pools, provider adapters, metadata stores, rate limiter, scheduler, status tracker, analytics sink.
- Data Models
  - Notification request, user preferences, templates/variants, experiments, rate limits, delivery attempts/events, suppression lists.
- Delivery Semantics
  - Exactly‑once vs at‑least‑once trade‑offs and where to enforce idempotency/dedupe.
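One way to approach the trade-off above is to accept at-least-once delivery from the queue and suppress duplicates at the consumer using an idempotency key with a TTL. The sketch below uses an in-memory dictionary as a stand-in for a shared store; `DedupeStore`, `first_seen`, and `handle` are hypothetical names introduced only for illustration.

```python
import time
from typing import Callable, Dict


class DedupeStore:
    """In-memory stand-in for a shared store of idempotency keys with a TTL."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._expires_at: Dict[str, float] = {}

    def first_seen(self, key: str) -> bool:
        """Record the key and return True if it is new or expired, else False."""
        now = self._clock()
        expiry = self._expires_at.get(key)
        if expiry is not None and expiry > now:
            return False  # still inside the TTL window: treat as a duplicate
        self._expires_at[key] = now + self._ttl
        return True


def handle(store: DedupeStore, key: str, send: Callable[[], None]) -> str:
    """Send at most once per key within the TTL window."""
    if not store.first_seen(key):
        return "duplicate"
    send()
    return "sent"
```

Note that this sketch records the key before sending, so a crash between the two steps would drop the message. A real design would need to choose between that risk and the opposite ordering, which can produce a duplicate send instead.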
- Reliability & Scale
  - Targets: 10M/day, p99 < 2s to provider, burst handling, multi‑region (active/active or active/passive), failure handling, circuit breaking, DLQs.
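The circuit-breaking item can be illustrated with a small per-provider breaker: it opens after a run of consecutive failures and allows probe requests again after a cooldown. The class name, the thresholds, and the simplified half-open behavior (any call after the cooldown is allowed until a result is recorded) are assumptions made for this sketch.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Consecutive-failure circuit breaker for one provider adapter."""

    def __init__(self, failure_threshold: int = 3,
                 cooldown_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._threshold = failure_threshold
        self._cooldown = cooldown_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None  # None means the circuit is closed

    def allow(self) -> bool:
        """Return True if a request may be sent to the provider right now."""
        if self._opened_at is None:
            return True
        # Half-open: once the cooldown has elapsed, let probe requests through.
        return self._clock() - self._opened_at >= self._cooldown

    def record_success(self) -> None:
        """A successful call closes the circuit and resets the failure count."""
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        """Count a failure; open (or re-open) the circuit at the threshold."""
        self._failures += 1
        if self._failures >= self._threshold:
            self._opened_at = self._clock()
```

When `allow()` returns False, a worker could fail over to another provider for the same channel or requeue the message with a delay instead of calling the unhealthy provider.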
- Monitoring & Compliance
  - Metrics, logs, traces, alerting; unsubscribe handling; GDPR/CCPA/consent/data‑retention.
- Evolution Plan
  - Experimentation (A/B), prioritization policies, cost controls and provider routing.
Provide a design with concrete APIs (REST or gRPC), schemas, component interactions, and operational playbooks. Call out assumptions where needed.
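As one illustration of the "cost controls and provider routing" item in the Evolution Plan, a router could pick the cheapest healthy provider for a channel. The `Provider` fields, the provider names, and the prices below are all hypothetical; a real policy would likely also weigh deliverability, regional coverage, and rate limits.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Provider:
    name: str                 # hypothetical provider identifier
    channel: str              # e.g. "sms", "email", "push"
    cost_per_message: float   # assumed unit cost in an arbitrary currency
    healthy: bool = True      # e.g. fed by health checks or a circuit breaker


def pick_provider(providers: Iterable[Provider], channel: str) -> Optional[Provider]:
    """Return the cheapest healthy provider for the channel, or None if none qualify."""
    candidates = [p for p in providers if p.channel == channel and p.healthy]
    return min(candidates, key=lambda p: p.cost_per_message, default=None)


# Usage with made-up providers: the cheaper SMS provider is unhealthy,
# so routing falls back to the more expensive one.
providers = [
    Provider("sms-a", "sms", 0.004, healthy=False),
    Provider("sms-b", "sms", 0.007),
    Provider("mail-a", "email", 0.0001),
]
chosen = pick_provider(providers, "sms")
```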