Explain SLI/SLO/SLA and design monitoring

This question evaluates a candidate's competency in service reliability and observability, specifically understanding and distinguishing SLIs, SLOs, and SLAs, planning error budgets, and designing monitoring and alerting for a production web API.

Question

SLI vs SLO vs SLA for a Web API; Error Budgets; Monitoring and Alerting Design

Context: You are designing reliability goals and on-call policies for a production web API that serves JSON over HTTPS. Traffic is a mix of GET and POST requests. You need to define what you measure (SLIs), the targets you set for those measurements (SLOs), and the contractual promise to customers (SLA); plan an error budget for a quarterly SLO; and design monitoring and alerting that minimizes alert fatigue.

Tasks

1. Define and contrast SLI, SLO, and SLA. Give concrete SLI examples for:
   - Availability (success rate)
   - Latency (e.g., request duration under a threshold)
2. Given a quarterly target SLO, define a reasonable error budget, and show how you would apportion and track its consumption over time.
3. Design a monitoring and alerting system that minimizes alert fatigue:
   - Choose which signals to alert on and why
   - Set alert thresholds relative to SLOs
   - Aggregate and deduplicate alerts
   - Apply multi-window/multi-burn-rate policies
   - Define escalation, silencing, and runbook practices
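
As a starting point for the error-budget and burn-rate tasks, the arithmetic can be sketched in a few lines of Python. This is a minimal illustration, not a reference answer: the 99.9% target, the 90-day quarter, the "2% of budget in one hour" paging rule, and every name below (`Window`, `burn_threshold`, `should_page`) are assumptions chosen for the example. The same functions apply to an availability SLI (good = non-5xx responses) or a latency SLI (good = responses under the threshold).

```python
from dataclasses import dataclass


@dataclass
class Window:
    """Request counts over one lookback window (hypothetical shape)."""
    total: int  # all valid requests seen in the window
    good: int   # requests that met the SLI criterion


def sli(window: Window) -> float:
    """Fraction of good requests; an empty window counts as fully good."""
    return window.good / window.total if window.total else 1.0


def error_budget(slo_target: float, expected_requests: int) -> float:
    """Number of bad requests the SLO tolerates over the whole period."""
    return (1.0 - slo_target) * expected_requests


def burn_rate(window: Window, slo_target: float) -> float:
    """Observed error rate divided by the allowed error rate.

    A value of 1.0 spends the budget exactly over the SLO period.
    Requires slo_target < 1.0.
    """
    return (1.0 - sli(window)) / (1.0 - slo_target)


def burn_threshold(budget_fraction: float, period_hours: float,
                   window_hours: float) -> float:
    """Burn rate that consumes budget_fraction of the budget in window_hours."""
    return budget_fraction * period_hours / window_hours


def should_page(long_w: Window, short_w: Window,
                slo_target: float, threshold: float) -> bool:
    """Multi-window rule: page only if both windows exceed the threshold.

    The long window shows the burn is significant; the short window shows
    it is still happening, so the alert clears soon after recovery.
    """
    return (burn_rate(long_w, slo_target) >= threshold
            and burn_rate(short_w, slo_target) >= threshold)


# Assumed values for illustration only.
SLO_TARGET = 0.999
QUARTER_HOURS = 90 * 24

# Page if 2% of the quarterly budget would be spent within one hour.
page_threshold = burn_threshold(0.02, QUARTER_HOURS, 1)

one_hour = Window(total=100_000, good=95_000)     # 5% errors over 1 h
five_minutes = Window(total=10_000, good=9_400)   # 6% errors over 5 min
print(should_page(one_hour, five_minutes, SLO_TARGET, page_threshold))
```

Under these assumptions a 99.9% target over 90 days leaves roughly 129.6 minutes of full downtime per quarter. Deriving thresholds from a budget fraction keeps each alert tied to the SLO instead of to an arbitrary error-rate number. A slower pairing, such as a smaller burn rate over a multi-hour window, could route to a ticket instead of a page.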
