Scenario
Design a log collection system that gathers logs/events from services running in multiple edge data centers and delivers them to a central platform.
Requirements
- Multi-region/edge ingestion: Logs originate from many edge data centers.
- Secure transport: Logs must be encrypted in transit while being collected and forwarded.
- (Typically expected) Secure storage: Consider encryption at rest and access control.
- The initial prompt may be underspecified; you should clarify:
  - Is this primarily real-time streaming (e.g., alerting) or batch analytics (warehouse/data lake), or both?
  - Expected throughput (events/sec, peak factors), average event size, and retention period.
  - SLA/SLOs: end-to-end latency, durability (acceptable loss), and availability.
  - Compliance constraints (PII, GDPR/CCPA, data residency).
  - Query patterns (search by trace id, full-text search, aggregations).
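
Once throughput, event size, and retention are pinned down, a back-of-the-envelope estimate helps size the cross-DC links and central storage. The sketch below is one way to do that arithmetic. Every number in it (100,000 events/sec, 3x peak factor, 500-byte events, 30-day retention, 5:1 compression) is an illustrative assumption rather than part of the prompt, and the names `LoadAssumptions`, `peak_bandwidth_mbps`, and `retained_storage_tb` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoadAssumptions:
    """Inputs to clarify with the interviewer; all values are assumptions."""
    events_per_sec: float           # average rate summed over all edge sites
    peak_factor: float              # peak rate divided by average rate
    avg_event_bytes: int            # average serialized event size
    retention_days: int             # how long the central store keeps data
    compression_ratio: float = 1.0  # raw size / compressed size (1.0 = none)


def peak_bandwidth_mbps(a: LoadAssumptions) -> float:
    """Peak cross-DC bandwidth in megabits per second, after compression."""
    bytes_per_sec = (
        a.events_per_sec * a.peak_factor * a.avg_event_bytes / a.compression_ratio
    )
    return bytes_per_sec * 8 / 1e6


def retained_storage_tb(a: LoadAssumptions) -> float:
    """Storage for one replica over the retention window, in TB (1e12 bytes)."""
    bytes_per_day = a.events_per_sec * a.avg_event_bytes * 86_400 / a.compression_ratio
    return bytes_per_day * a.retention_days / 1e12


# Illustrative numbers only; replace with the answers to the questions above.
example = LoadAssumptions(
    events_per_sec=100_000,
    peak_factor=3.0,
    avg_event_bytes=500,
    retention_days=30,
    compression_ratio=5.0,
)

if __name__ == "__main__":
    print(f"peak bandwidth: {peak_bandwidth_mbps(example):.0f} Mbps")   # about 240 Mbps
    print(f"retained storage: {retained_storage_tb(example):.2f} TB")  # about 25.92 TB
```

Replication, indexing overhead, and per-site headroom would be multiplied on top of these figures; they are left out to keep the estimate easy to check by hand.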
Deliverable
Propose an end-to-end architecture including:
- Edge collection/agent design
- Cross-DC transport
- Central ingestion, buffering, storage
- Encryption/key management approach
- Failure handling, backpressure, and observability
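
To make a few of these deliverables concrete, here is a minimal sketch of two pieces of an edge agent: a client TLS context that enforces certificate verification, and a bounded in-memory buffer that signals backpressure and exposes counters for observability. It is an illustration under assumptions, not a prescribed design: `EdgeBuffer`, `make_client_tls_context`, and the `send` callback are hypothetical names, and a real agent would likely spill to disk and add retry backoff, which are omitted here.

```python
import ssl
from collections import deque
from itertools import islice
from typing import Any, Callable, List


def make_client_tls_context() -> ssl.SSLContext:
    """Client-side context: verifies the server certificate and hostname."""
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse older protocol versions
    # For mutual TLS the agent would also call ctx.load_cert_chain(...) with its
    # own certificate and key; those paths are deployment-specific and omitted.
    return ctx


class EdgeBuffer:
    """Bounded FIFO buffer between local log producers and the cross-DC sender."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._queue: deque = deque()
        self.rejected = 0  # events refused because the buffer was full
        self.sent = 0      # events acknowledged by the upstream

    def offer(self, event: Any) -> bool:
        """Enqueue an event; False is the backpressure signal to the producer."""
        if len(self._queue) >= self.capacity:
            self.rejected += 1
            return False
        self._queue.append(event)
        return True

    def flush(self, send: Callable[[List[Any]], bool], batch_size: int = 100) -> int:
        """Send batches in order; a batch is removed only after send() returns True.

        This gives at-least-once delivery: a failed batch stays queued for the
        next flush, so the receiver must tolerate duplicates.
        """
        delivered = 0
        while self._queue:
            batch = list(islice(self._queue, batch_size))
            if not send(batch):
                break  # upstream unavailable: keep the data and stop for now
            for _ in batch:
                self._queue.popleft()
            delivered += len(batch)
        self.sent += delivered
        return delivered

    def depth(self) -> int:
        """Current queue length, suitable for export as a gauge metric."""
        return len(self._queue)
```

In this sketch a producer that gets `False` from `offer` can slow down, sample, or write to a local spool, which is the design decision the prompt asks you to justify. `depth()`, `rejected`, and `sent` are the kind of signals an observability pipeline would scrape per edge site.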