This question evaluates a candidate's ability to design scalable, low-latency systems for real-time log processing and analytics. It falls under the System Design domain and tests competencies in ingestion APIs, indexing and storage tiering, query and aggregation strategies, deduplication, handling of out-of-order events, and partitioning for scalability. It is commonly asked to assess reasoning about trade-offs among throughput, latency, correctness, and operational complexity, with an emphasis on practical architecture and operational considerations rather than algorithmic or coding-level detail.
You are designing a real-time log-processing system from scratch. Logs are emitted by many services as structured events (e.g., JSON) with fields such as timestamp, level, component, host, request_id, and message. The system must ingest events at high throughput, index and store logs for retrieval, and support low-latency analytical queries.
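To make the event shape concrete, here is a minimal Python sketch of one such structured event and how an ingestion endpoint might parse it. The field names come from the problem statement. Everything else is an assumption made for illustration: the `LogEvent` class, the `parse_event` and `dedup_key` helpers, ISO 8601 timestamps with an explicit offset, and the choice of fields hashed for deduplication. It is one possible starting point for discussion, not a prescribed answer.

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class LogEvent:
    """One structured log event, using the fields named in the problem."""
    timestamp: datetime
    level: str
    component: str
    host: str
    request_id: str
    message: str


def parse_event(raw: str) -> LogEvent:
    """Parse one JSON log line (hypothetical helper).

    Assumes the timestamp is ISO 8601; naive timestamps are treated as UTC.
    """
    doc = json.loads(raw)
    ts = datetime.fromisoformat(doc["timestamp"])
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=timezone.utc)
    return LogEvent(
        timestamp=ts.astimezone(timezone.utc),  # normalize to UTC for indexing
        level=doc["level"].upper(),
        component=doc["component"],
        host=doc["host"],
        request_id=doc["request_id"],
        message=doc["message"],
    )


def dedup_key(event: LogEvent) -> str:
    """Content hash used to detect duplicate deliveries (assumed scheme)."""
    payload = "|".join(
        [event.host, event.request_id, event.timestamp.isoformat(), event.message]
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Normalizing timestamps to UTC at ingestion keeps later handling of out-of-order events comparable across hosts, and a content-derived key lets a retried delivery of the same event be recognized without coordination between producers. Whether those fields are sufficient for deduplication depends on the producers and is worth stating as an explicit assumption in an answer.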
Assume:
Design a system that provides:
Specify: