PracHub

Design a log filtering and analytics service

Last updated: Jun 18, 2026

Quick Overview

An Amazon software engineer system design question: design a log filtering and analytics service that ingests high-volume application logs and supports attribute/substring filtering, error counts over a time window, and hourly histograms by pattern or ID. It tests ingestion API design, dual search + OLAP storage with hourly rollups, schema and indexing, late/duplicate-event handling, partitioning and retention, and correctness-vs-latency trade-offs with complexity analysis.


Company: Amazon

Role: Software Engineer

Category: System Design

Difficulty: hard

Interview Round: Technical Screen



Sep 6, 2025, 12:00 AM
Question

Design a log-processing service that ingests application logs at scale and supports the following capabilities:

  1. Filter logs by attributes: for example, service/component, level, host, and a substring or regex pattern on the message, scoped to a time range. Expose this as filter(query).
  2. Count error-level logs over a time window: return the number of ERROR (or higher) logs over a specified window, with optional predicates. Expose this as countErrors(window).
  3. Build an hourly histogram for a specific log pattern, message predicate, or log ID over a window, returning a count per hour bucket. Expose this as histogramByHour(query, window).
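
As a starting point, the semantics of the three calls can be pinned down with a small in-memory reference model before choosing any storage engine. The sketch below is only an illustration under stated assumptions: the `LogEvent` and `Query` field names, the severity ordering in `LEVELS`, and the half-open `[start, end)` window are all hypothetical choices, not part of the question. A real service would push these predicates into an index instead of scanning a list.

```python
import re
from collections import Counter
from dataclasses import dataclass
from typing import Optional

HOUR = 3600
# Assumed severity ordering; "ERROR or higher" means rank >= LEVELS["ERROR"].
LEVELS = {"DEBUG": 10, "INFO": 20, "WARN": 30, "ERROR": 40, "FATAL": 50}


@dataclass(frozen=True)
class LogEvent:
    log_id: str
    ts: int        # event time, epoch seconds
    service: str
    level: str
    host: str
    message: str


@dataclass(frozen=True)
class Query:
    start: int     # inclusive, epoch seconds
    end: int       # exclusive, epoch seconds
    service: Optional[str] = None
    min_level: Optional[str] = None
    host: Optional[str] = None
    substring: Optional[str] = None
    regex: Optional[str] = None
    log_id: Optional[str] = None


def matches(e: LogEvent, q: Query) -> bool:
    """Apply the time range first, then every predicate that is set."""
    if not (q.start <= e.ts < q.end):
        return False
    if q.service is not None and e.service != q.service:
        return False
    if q.host is not None and e.host != q.host:
        return False
    if q.log_id is not None and e.log_id != q.log_id:
        return False
    if q.min_level is not None and LEVELS[e.level] < LEVELS[q.min_level]:
        return False
    if q.substring is not None and q.substring not in e.message:
        return False
    if q.regex is not None and re.search(q.regex, e.message) is None:
        return False
    return True


def filter_logs(events, q: Query):
    return [e for e in events if matches(e, q)]


def count_errors(events, start: int, end: int, **predicates) -> int:
    q = Query(start=start, end=end, min_level="ERROR", **predicates)
    return sum(1 for e in events if matches(e, q))


def histogram_by_hour(events, q: Query) -> dict:
    """Count per hour bucket (keyed by bucket start), including empty buckets."""
    counts = Counter(e.ts - e.ts % HOUR for e in events if matches(e, q))
    first = q.start - q.start % HOUR
    return {b: counts.get(b, 0) for b in range(first, q.end, HOUR)}
```

Here the window is folded into the query object for brevity; every call is a linear scan, which is the baseline that indexes and rollups are meant to improve on.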

In your design, specify:

  • The ingestion API and flow (transport, batching, validation, enrichment, idempotency).
  • Storage and indexing choices — time-series/OLAP partitioning, inverted indexes for substring/regex, and storage tiering (hot/warm/cold).
  • The query API and how each call (filter, countErrors, histogramByHour) is planned and routed.
  • Schema design with example fields.
  • Handling of late and duplicated / out-of-order events (watermarks, deduplication).
  • Aggregation strategies (on-write rollups vs on-read aggregation, caching).
  • Scalability, partitioning, and retention.
  • Correctness vs latency / performance trade-offs.
  • Complexity analysis for the common queries.
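
Several of these points (idempotent ingestion, late and out-of-order events, on-write rollups) interact, so a small model can help when reasoning about them together. The following is a minimal sketch, not a prescribed design: the class name `HourlyRollup`, the `allowed_lateness` parameter, the in-memory `seen_ids` set, and the rule "watermark = max event time seen minus allowed lateness" are all assumptions made for illustration. A production system would bound the dedup state (for example with a TTL) and route too-late events to a correction path.

```python
from collections import defaultdict

HOUR = 3600


class HourlyRollup:
    """On-write hourly counters with ID-based dedup and a simple watermark."""

    def __init__(self, allowed_lateness: int = 2 * HOUR):
        self.allowed_lateness = allowed_lateness
        self.seen_ids = set()   # unbounded here; assumed to be TTL-bounded in practice
        # hour bucket start -> (service, level) -> count
        self.counts = defaultdict(lambda: defaultdict(int))
        self.max_event_ts = 0
        self.too_late = []      # events behind the watermark, kept for a later backfill

    @property
    def watermark(self) -> int:
        return self.max_event_ts - self.allowed_lateness

    def ingest(self, log_id: str, ts: int, service: str, level: str) -> str:
        if log_id in self.seen_ids:
            return "duplicate"  # retries and replays do not double count
        self.seen_ids.add(log_id)
        if ts < self.watermark:
            self.too_late.append((log_id, ts, service, level))
            return "late"
        self.max_event_ts = max(self.max_event_ts, ts)
        self.counts[ts - ts % HOUR][(service, level)] += 1
        return "accepted"

    def count(self, start: int, end: int, level: str, service=None) -> int:
        """Sum rollup buckets in [start, end); both are assumed hour-aligned."""
        total = 0
        for bucket in range(start, end, HOUR):
            for (svc, lvl), n in self.counts.get(bucket, {}).items():
                if lvl == level and (service is None or svc == service):
                    total += n
        return total
```

With this layout a count over a window reads one small map per hour bucket instead of every raw event, at the cost of answering only the dimensions that were rolled up (here service and level). Events that arrive out of order but within the allowed lateness still land in their correct bucket.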

Provide complexity estimates (big-O and practical latency) for the common queries.


© 2026 PracHub. All rights reserved.