Design a metrics monitoring system

Last updated: Mar 29, 2026

Quick Overview

This question tests system design and distributed-systems skills through a large-scale time-series monitoring problem. It covers ingestion pipelines, compression and tiered storage, a query language and alerting, multi-tenant isolation, and operational resilience.

Company: Current

Role: Software Engineer

Category: System Design

Difficulty: hard

Interview Round: Technical Screen

Design a metrics monitoring system. It should collect numeric metrics (counters, gauges, histograms) with labels/tags from many services, support both pull and push ingestion at high throughput with backpressure, and store time series efficiently using compression and retention tiers (hot vs. cold storage). It should provide a query language for aggregations, downsampling, and label filtering, and generate alerts on thresholds and SLOs with silencing, deduplication, and routing. The design must ensure high availability, horizontal scalability, and multi-tenant isolation, control cardinality growth, enforce quotas, and expose dashboards and APIs. Discuss the data model, sharding, indexing, write/read paths, failure handling, and consistency choices.

Jul 31, 2025, 12:00 AM

System Design: Metrics Monitoring Platform

Context

Design a cloud‑native, multi‑tenant metrics monitoring system for internal services. The system must support counters, gauges, and histograms with labels/tags, ingest via pull and push, provide a query language, alerting, dashboards, and strong operational characteristics (HA, scale, quotas, isolation).

You may assume an illustrative scale (adjust as needed):

  • Aggregate ingest: up to 10M samples/sec across tenants.
  • Retention: 7 days hot, 12 months cold.
  • Query SLO: p99 < 2s for 6h range queries.
  • Availability target: 99.9%.
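
To make the scale figures above concrete, here is a rough back-of-envelope sizing sketch. The ingest rate and retention windows come from the list above; `BYTES_PER_SAMPLE` and `DOWNSAMPLE_FACTOR` are assumptions chosen for illustration (real compression ratios depend on the encoding and the data), and replication overhead is deliberately left out.

```python
# Back-of-envelope storage sizing for the illustrative scale.
INGEST_SAMPLES_PER_SEC = 10_000_000  # from the stated scale
HOT_RETENTION_DAYS = 7               # from the stated scale
COLD_RETENTION_DAYS = 365            # "12 months", approximated as 365 days
BYTES_PER_SAMPLE = 2.0               # ASSUMPTION: average size after compression
DOWNSAMPLE_FACTOR = 20               # ASSUMPTION: e.g. 15 s raw points rolled up to 5 min

SECONDS_PER_DAY = 86_400


def tier_bytes(samples_per_sec: float, days: int, bytes_per_sample: float) -> float:
    """Bytes needed to hold `days` of data at a steady sample rate."""
    return samples_per_sec * SECONDS_PER_DAY * days * bytes_per_sample


samples_per_day = INGEST_SAMPLES_PER_SEC * SECONDS_PER_DAY

# Hot tier keeps raw samples for the full hot window.
hot_bytes = tier_bytes(INGEST_SAMPLES_PER_SEC, HOT_RETENTION_DAYS, BYTES_PER_SAMPLE)

# Cold tier is assumed to hold only downsampled data.
cold_bytes = tier_bytes(
    INGEST_SAMPLES_PER_SEC / DOWNSAMPLE_FACTOR, COLD_RETENTION_DAYS, BYTES_PER_SAMPLE
)

print(f"samples/day: {samples_per_day:.3e}")
print(f"hot tier:    {hot_bytes / 1e12:.1f} TB (before replication)")
print(f"cold tier:   {cold_bytes / 1e12:.1f} TB (before replication)")
```

Under these assumptions both tiers land in the tens of terabytes before replication, which suggests that index size and query fan-out, rather than raw bytes, are likely to be the harder constraints.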

Requirements

  1. Ingestion
    • Collect numeric metrics (counter/gauge/histogram) with labels/tags.
    • Support pull (scraping endpoints) and push ingestion.
    • Handle high throughput and provide backpressure.
  2. Storage
    • Efficient time‑series storage with compression.
    • Retention tiers: hot vs. cold storage; support downsampling.
  3. Query
    • Provide a query language for aggregations, label filtering, and downsampling.
    • Support federated queries across hot/cold tiers.
  4. Alerting
    • Threshold and SLO‑based alerts; silencing, deduplication, routing.
  5. Operations
    • High availability, horizontal scalability, and multi‑tenant isolation.
    • Control cardinality growth; enforce quotas and rate limits.
    • Expose dashboards and APIs.
  6. Architecture Deep Dives
    • Discuss data model, sharding, indexing.
    • Detail write/read paths, failure handling, and consistency choices.
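
One possible starting point for the data model, sharding, and indexing discussion is sketched below. It assumes a series is identified by tenant, metric name, and a sorted label set, that shards are chosen by hashing that identity, and that label filtering is served by an inverted index of postings. All names here (`SeriesKey`, `fingerprint`, `shard_for`, `LabelIndex`) are hypothetical, and the `__name__` pseudo-label is borrowed from Prometheus convention.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SeriesKey:
    tenant: str
    metric: str
    labels: tuple  # sorted (name, value) pairs, so label order never matters


def make_series_key(tenant: str, metric: str, labels: dict) -> SeriesKey:
    return SeriesKey(tenant, metric, tuple(sorted(labels.items())))


def fingerprint(key: SeriesKey) -> int:
    """Stable 64-bit series ID derived from tenant, metric, and labels."""
    h = hashlib.sha256()
    for part in (key.tenant, key.metric, *(f"{k}={v}" for k, v in key.labels)):
        h.update(part.encode("utf-8"))
        h.update(b"\x00")  # separator so adjacent parts cannot run together
    return int.from_bytes(h.digest()[:8], "big")


def shard_for(key: SeriesKey, num_shards: int) -> int:
    # Simplification: plain modulo. Resharding would move most series;
    # a hash ring is the usual alternative.
    return fingerprint(key) % num_shards


class LabelIndex:
    """Inverted index: (tenant, label name, label value) -> set of series IDs."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, key: SeriesKey) -> int:
        fp = fingerprint(key)
        self._postings[(key.tenant, "__name__", key.metric)].add(fp)
        for name, value in key.labels:
            self._postings[(key.tenant, name, value)].add(fp)
        return fp

    def select(self, tenant: str, metric: str, matchers: dict) -> set:
        """Equality matchers only: intersect the postings lists."""
        result = set(self._postings.get((tenant, "__name__", metric), set()))
        for name, value in matchers.items():
            result &= self._postings.get((tenant, name, value), set())
        return result
```

Including the tenant in both the fingerprint and every postings key is one simple way to keep tenants from seeing or colliding with each other's series. Regex matchers, time-bounded index blocks, and replication are outside this sketch.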

Deliverables

  • End‑to‑end architecture proposal with components and data flow.
  • Rationale and trade‑offs for key design choices.
  • Guardrails for cardinality, quotas, and backpressure.
  • Failure scenarios and recovery strategies.
  • API surface and operability plan (dashboards, SLOs).
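
The guardrails deliverable could be illustrated with a per-tenant admission check like the one below. It is a minimal sketch, assuming a token bucket for the ingest rate quota and a hard cap on active series for cardinality control. The class names, return strings, and limits are all hypothetical, and a real system would also expire idle series and share state across ingesters.

```python
import time


class TokenBucket:
    """Refills at `rate` tokens per second, holding at most `burst` tokens."""

    def __init__(self, rate: float, burst: float, now: float = None):
        self.rate = rate
        self.burst = burst
        self.tokens = burst
        self.last = time.monotonic() if now is None else now

    def try_acquire(self, n: float, now: float = None) -> bool:
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if n <= self.tokens:
            self.tokens -= n
            return True
        return False


class TenantGuard:
    """Per-tenant admission: cardinality cap first, then rate quota."""

    def __init__(self, samples_per_sec: float, burst: float, max_series: int,
                 now: float = None):
        self.bucket = TokenBucket(samples_per_sec, burst, now)
        self.max_series = max_series
        self.series = set()  # active series IDs seen for this tenant

    def admit(self, series_id, n_samples: int, now: float = None) -> str:
        # New series beyond the cap are rejected outright, not queued.
        if series_id not in self.series and len(self.series) >= self.max_series:
            return "reject_cardinality"
        # Over-quota writes are throttled; a push API might map this to
        # HTTP 429 so that clients back off and retry.
        if not self.bucket.try_acquire(n_samples, now):
            return "throttle"
        self.series.add(series_id)
        return "accept"


guard = TenantGuard(samples_per_sec=10, burst=10, max_series=2, now=0.0)
results = [
    guard.admit("a", 5, now=0.0),
    guard.admit("b", 5, now=0.0),
    guard.admit("a", 1, now=0.0),   # bucket is empty at this instant
    guard.admit("c", 1, now=1.0),   # third distinct series exceeds the cap
    guard.admit("a", 10, now=1.0),  # bucket has refilled after one second
]
print(results)
```

Checking cardinality before spending tokens means a rejected new series does not consume the tenant's rate budget. Returning a distinct result for each failure mode lets the caller decide whether to drop, retry later, or surface a quota error.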
