PracHub

Design a Real-Time Monitoring System

Last updated: Apr 12, 2026

Quick Overview

This question evaluates a candidate's system design and distributed-systems skills, focusing on scalable time-series ingestion, storage, querying, alerting, retention policies, and fault tolerance when monitoring large production fleets. Category: System Design. Domain: observability and time-series data.

  • easy
  • DoorDash
  • System Design
  • Software Engineer

Company: DoorDash

Role: Software Engineer

Category: System Design

Difficulty: easy

Interview Round: Onsite

Jan 4, 2026, 12:00 AM

Design a real-time monitoring system for a large production environment.

The system should:

  • collect time-series metrics such as CPU, memory, request count, latency, and error rate from agents running on about 100,000 hosts,
  • support near-real-time dashboards for engineers,
  • evaluate alert rules and send notifications quickly when thresholds are breached,
  • store high-resolution recent data and lower-resolution historical data for long-term retention,
  • remain reliable during traffic spikes and partial infrastructure failures.
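
The storage and alerting requirements above can be made concrete with a small sketch. This is only an illustration under stated assumptions, not a reference answer: the names `downsample` and `ThresholdRule`, the choice of averaging as the rollup function, and the "every sample in the window breaches" firing condition are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Tuple

# Assumed representation: (unix_timestamp_seconds, value)
Sample = Tuple[int, float]


def downsample(samples: List[Sample], bucket_seconds: int) -> List[Sample]:
    """Average raw samples into fixed-width buckets keyed by bucket start time."""
    buckets: Dict[int, List[float]] = {}
    for ts, value in samples:
        start = ts - (ts % bucket_seconds)  # align to the bucket boundary
        buckets.setdefault(start, []).append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


@dataclass
class ThresholdRule:
    """Hypothetical rule: fires when every sample in the trailing window
    exceeds the threshold. An empty window never fires."""
    metric: str
    threshold: float
    window_seconds: int

    def is_firing(self, samples: List[Sample], now: int) -> bool:
        recent = [v for ts, v in samples if now - self.window_seconds < ts <= now]
        return bool(recent) and all(v > self.threshold for v in recent)


if __name__ == "__main__":
    raw = [(0, 10.0), (10, 20.0), (60, 30.0), (70, 50.0)]
    print(downsample(raw, 60))  # one averaged point per 60-second bucket

    rule = ThresholdRule(metric="cpu", threshold=90.0, window_seconds=30)
    print(rule.is_firing([(100, 95.0), (110, 97.0), (120, 99.0)], now=120))
```

Requiring the whole window to breach is one way to avoid paging on a single noisy sample; a real design would also need to decide how missing data and late-arriving samples are treated.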

Discuss the requirements, APIs or data model, ingestion pipeline, storage design, query path, alerting architecture, scaling strategy, retention policy, and fault tolerance.
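
As a starting point for the data model and scaling discussion, here is a minimal sketch. Everything beyond the 100,000-host figure is an assumption introduced for illustration: the `MetricSample` fields, the series-key format, the hash-based shard choice, and the example values of 100 metrics per host reported every 10 seconds.

```python
import hashlib
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MetricSample:
    """Hypothetical wire/data model for one data point."""
    name: str                                # e.g. "cpu_usage"
    host: str                                # reporting host
    timestamp: int                           # unix seconds
    value: float
    tags: Tuple[Tuple[str, str], ...] = ()   # extra dimensions

    def series_key(self) -> str:
        """Stable identifier for the time series this sample belongs to."""
        labels = [("host", self.host), *sorted(self.tags)]
        inner = ",".join(f"{k}={v}" for k, v in labels)
        return f"{self.name}{{{inner}}}"


def shard_for(series_key: str, num_shards: int) -> int:
    """Map a series to a shard with a hash that is stable across processes."""
    digest = hashlib.md5(series_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def samples_per_second(hosts: int, metrics_per_host: int, interval_seconds: int) -> float:
    """Back-of-envelope ingest rate for capacity planning."""
    return hosts * metrics_per_host / interval_seconds


if __name__ == "__main__":
    s = MetricSample("cpu_usage", "host-1", 1_700_000_000, 0.5, (("region", "us-west"),))
    print(s.series_key())
    print(shard_for(s.series_key(), 64))
    # Assumed load: 100 metrics per host, reported every 10 seconds.
    print(samples_per_second(100_000, 100, 10))
```

Under those assumed numbers the fleet would produce about one million samples per second, which is the kind of estimate that motivates partitioning ingestion and storage by series key.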
