You are on-call for a production web service. Users report the website is down (timeouts/5xx). You are told the incident turns out to involve a full disk on one or more machines.
Walk through how you would:
Assume a typical setup (load balancer → web/app tier → DB/cache; logs/metrics available). Explain what commands/signals you would look at, what hypotheses you would test, and what pitfalls you’d avoid.
Login required