System Design: Low-Latency, Multi-Channel Notification Platform
You are asked to design a scalable, reliable notification system that can send messages to millions of users with low latency across multiple channels (email, SMS, push).
Assume a large, consumer-facing product operating globally with both transactional (real-time) and bulk/marketing use cases.
Cover the following:
- Requirements Gathering
  - Functional, non-functional, traffic assumptions, SLAs/latency targets, compliance.
- High-Level Architecture
  - Core components and data flow for real-time, scheduled, and bulk sends.
- Data Model
  - Key entities (templates, preferences, messages, attempts, providers, etc.).
- API Design
  - Producer APIs, admin APIs, idempotency, status callbacks/webhooks.
- Message Prioritization
  - Priority levels, queueing, fairness, rate limits, quotas.
- Deduplication
  - Idempotency keys, content-based dedup, time windows.
- Retries and Failure Handling
  - Backoff, dead-letter queues, poison-pill handling, fallback channels, circuit breaking.
- Scaling Strategies
  - Partitioning, horizontal scaling, multi-region, autoscaling triggers.
- Monitoring and Alerting
  - SLIs/SLOs, metrics, logs, traces, runbooks.
- Cost Considerations
  - Unit economics by channel, routing, batching, budgets, frequency caps.
State reasonable assumptions where needed and explain trade-offs.
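
To illustrate the level of detail that the Deduplication and Retries items invite, here is a minimal Python sketch, not a reference answer. Everything in it is an assumption made for illustration: the names `DedupWindow`, `content_key`, and `backoff_delay`, the in-memory dictionary (a real design would more likely use a shared store with key expiry), the fixed window measured from first acceptance, and the 1 s base / 60 s cap for backoff.

```python
import hashlib
import json
import random
from typing import Callable, Dict, Optional


class DedupWindow:
    """Hypothetical in-memory dedup: a key counts as a duplicate if it was
    accepted less than `window_s` seconds ago (window starts at first accept)."""

    def __init__(self, window_s: float, clock: Callable[[], float]):
        self.window_s = window_s
        self.clock = clock  # injected so the window can be tested deterministically
        self._seen: Dict[str, float] = {}  # key -> time it was first accepted

    def accept(self, key: str) -> bool:
        now = self.clock()
        # Prune expired keys on every call (O(n); acceptable for a sketch only).
        self._seen = {k: t for k, t in self._seen.items() if now - t < self.window_s}
        if key in self._seen:
            return False  # duplicate inside the window: drop it
        self._seen[key] = now
        return True


def content_key(user_id: str, channel: str, body: str) -> str:
    """Content-based key for producers that supply no idempotency key.
    JSON-encoding the fields avoids ambiguity between field boundaries."""
    payload = json.dumps([user_id, channel, body]).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def backoff_delay(attempt: int, base_s: float = 1.0, cap_s: float = 60.0,
                  rng: Optional[random.Random] = None) -> float:
    """Exponential backoff with full jitter: a delay drawn uniformly from
    [0, min(cap_s, base_s * 2**attempt)], with attempt counted from 0."""
    rng = rng or random.Random()
    return rng.uniform(0.0, min(cap_s, base_s * (2 ** attempt)))


if __name__ == "__main__":
    fake_now = [0.0]
    dedup = DedupWindow(window_s=300, clock=lambda: fake_now[0])
    key = content_key("user-1", "email", "Your code is 1234")
    print(dedup.accept(key))   # first send is accepted
    print(dedup.accept(key))   # immediate repeat is rejected
    fake_now[0] = 300.0
    print(dedup.accept(key))   # accepted again once the window has elapsed
    print(backoff_delay(3))    # some delay between 0 and 8 seconds
```

A producer-supplied idempotency key can be passed to `accept` directly in place of `content_key`. The trade-off a full answer would discuss is window length: a longer window suppresses more duplicates but also blocks legitimate repeats, such as a re-requested one-time code.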