Design a log filtering and analytics service
Company: Amazon
Role: Software Engineer
Category: System Design
Difficulty: hard
Interview Round: Technical Screen
Design a log processing service that ingests application logs and supports:
(1) filtering logs by attributes such as service, level, and substring/pattern;
(2) returning counts of error-level logs over a specified time window; and
(3) building an hourly histogram for a specific log pattern or ID.
Specify ingestion flow, storage and indexing (e.g., time-series partitioning, inverted indexes), and a query API such as filter(query), countErrors(window), and histogramByHour(query, window). Discuss handling late/duplicated events, scalability and partitioning, retention, correctness vs. latency trade-offs, schema design with example fields, and complexity analysis for common queries.
Quick Answer: This question evaluates understanding of distributed systems, real-time data ingestion and enrichment, indexing and storage tiering, query API design, and trade-offs between correctness, latency, and scalability in log analytics.
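The three query operations named in the prompt can be illustrated with a minimal single-process sketch. It is not a reference answer. It assumes each event carries a unique event_id (used to drop duplicates) and an event-time timestamp in epoch seconds (used to pick the hourly partition, so late arrivals still land in the correct bucket). The LogEvent fields, the LogStore class, and the method signatures are hypothetical names chosen for illustration; a real design would distribute the partitions across nodes and add an inverted index over message tokens instead of scanning.

```python
from collections import defaultdict
from dataclasses import dataclass

HOUR = 3600  # seconds per hourly partition


@dataclass(frozen=True)
class LogEvent:
    # Hypothetical example schema; field names are assumptions.
    event_id: str   # unique ID, used for deduplication
    ts: int         # event time, epoch seconds
    service: str
    level: str      # e.g. "INFO", "ERROR"
    message: str


class LogStore:
    """In-memory stand-in for time-partitioned log storage."""

    def __init__(self):
        self.seen_ids = set()              # IDs already ingested
        self.by_hour = defaultdict(list)   # hour bucket start -> events

    def ingest(self, event):
        """Store an event once; return False if it is a duplicate."""
        if event.event_id in self.seen_ids:
            return False
        self.seen_ids.add(event.event_id)
        # Partition by event time, not arrival time, so late events
        # are counted in the hour they actually occurred.
        self.by_hour[event.ts - event.ts % HOUR].append(event)
        return True

    def _scan(self, start, end):
        """Yield events with start <= ts < end, touching only the
        hourly partitions that overlap the window."""
        bucket = start - start % HOUR
        while bucket < end:
            for e in self.by_hour.get(bucket, ()):
                if start <= e.ts < end:
                    yield e
            bucket += HOUR

    def filter(self, start, end, service=None, level=None, substring=None):
        return [
            e for e in self._scan(start, end)
            if (service is None or e.service == service)
            and (level is None or e.level == level)
            and (substring is None or substring in e.message)
        ]

    def count_errors(self, start, end):
        return sum(1 for e in self._scan(start, end) if e.level == "ERROR")

    def histogram_by_hour(self, start, end, substring):
        hist = defaultdict(int)
        for e in self._scan(start, end):
            if substring in e.message:
                hist[e.ts - e.ts % HOUR] += 1
        return dict(hist)


store = LogStore()
store.ingest(LogEvent("a", 10, "api", "ERROR", "timeout calling db"))
store.ingest(LogEvent("a", 10, "api", "ERROR", "timeout calling db"))  # duplicate, dropped
store.ingest(LogEvent("b", 3700, "api", "INFO", "ok"))
store.ingest(LogEvent("c", 3800, "auth", "ERROR", "timeout on token"))
store.ingest(LogEvent("d", 20, "api", "ERROR", "timeout again"))       # arrives late
print(store.count_errors(0, 7200))                  # 3
print(store.histogram_by_hour(0, 7200, "timeout"))  # {0: 2, 3600: 1}
```

In this sketch every query costs time proportional to the number of events in the partitions that overlap the window, since it scans them linearly. Keeping per-hour error counters at ingest time would make count_errors proportional to the number of hours in the window instead, at the cost of having to adjust those counters when late events arrive.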