
Design a Log Collection System

Last updated: Jun 5, 2026

Quick Overview

This question evaluates proficiency in designing large-scale distributed logging and observability systems, including ingestion pipelines, storage and indexing strategies, query architectures, reliability and scaling trade-offs, and operational concerns like retention, alerting, and access control.

Company: Amazon

Role: Software Engineer

Category: System Design

Difficulty: medium

Interview Round: Technical Screen

Design a scalable log collection system for a company running many services across thousands of machines. The system should collect application and infrastructure logs, tolerate temporary network or downstream failures, support near-real-time search and filtering by service, host, timestamp, severity, and request ID, and provide retention, alerting, access control, and operational monitoring. Describe the architecture, data model, ingestion pipeline, storage and indexing strategy, query path, reliability guarantees, scaling approach, and major trade-offs.
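One way to make the requirements above concrete is to sketch the record shape and the failure-tolerant agent they imply. The following is a minimal sketch, not a reference answer. It assumes a flat record carrying the five filter fields named in the question, and a host-side agent that buffers in memory and retries when the downstream call fails. The names `LogRecord`, `BufferingAgent`, `search`, and the `send` callback are hypothetical and chosen only for illustration; a real design would likely use a disk-backed buffer and an indexed store rather than a linear scan.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Iterable, List, Optional


@dataclass(frozen=True)
class LogRecord:
    """Hypothetical record shape: one field per filter named in the question."""
    service: str
    host: str
    timestamp: float  # assumed: seconds since the Unix epoch
    severity: str
    request_id: str
    message: str


class BufferingAgent:
    """Illustrative host-side agent that keeps records until a send succeeds."""

    def __init__(self, send: Callable[[List[LogRecord]], None],
                 capacity: int = 10_000) -> None:
        self._send = send
        # Bounded buffer: when full, the oldest record is dropped so that a
        # long outage cannot exhaust memory on the host.
        self._buffer: Deque[LogRecord] = deque(maxlen=capacity)

    def emit(self, record: LogRecord) -> None:
        self._buffer.append(record)

    def flush(self) -> int:
        """Try to deliver everything buffered; return the number delivered."""
        if not self._buffer:
            return 0
        batch = list(self._buffer)
        try:
            self._send(batch)
        except ConnectionError:
            return 0  # keep the records and retry on the next flush
        self._buffer.clear()
        return len(batch)


def search(records: Iterable[LogRecord], *,
           service: Optional[str] = None,
           host: Optional[str] = None,
           severity: Optional[str] = None,
           request_id: Optional[str] = None,
           start: Optional[float] = None,
           end: Optional[float] = None) -> List[LogRecord]:
    """Linear-scan filter over the five fields; time range is [start, end)."""
    def matches(r: LogRecord) -> bool:
        return ((service is None or r.service == service)
                and (host is None or r.host == host)
                and (severity is None or r.severity == severity)
                and (request_id is None or r.request_id == request_id)
                and (start is None or r.timestamp >= start)
                and (end is None or r.timestamp < end))
    return [r for r in records if matches(r)]
```

In this sketch a failed `flush` leaves the buffer untouched, so a later retry resends the same batch. That gives at-least-once delivery at the cost of possible duplicates if the downstream accepted the batch but the acknowledgement was lost, which is one of the trade-offs the question asks candidates to discuss.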

May 22, 2026, 12:00 AM


