System Design: End-to-End Web Request Latency
Context
You are designing a user-facing web experience that fetches HTML/JSON from an origin and additional static assets from a CDN. Users are global (desktop and mobile). The goal is to reduce end-to-end latency (from user action to usable content), with a target of p95 TTFB < 300 ms for the primary HTML/JSON endpoint and a fast, stable render.
Task
Analyze a single web request end-to-end and do the following:
- Break down latency into client-side, network, and server-side components.
- Identify likely bottlenecks and how you would measure them.
- Propose concrete optimizations for each layer (client, network/CDN, server, data store), including expected impact.
- Provide a small numeric example that computes the current latency and demonstrates how your optimizations reduce it.
- Outline an instrumentation and rollout plan (validation, guardrails, and success criteria).
Assume HTTP/2 or HTTP/3 is available, TLS is required, and the primary response is cacheable for 60 seconds. Keep your analysis focused and practical for a technical phone screen.
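To illustrate the kind of numeric example the task asks for, here is a minimal sketch that sums the sequential phases before the first response byte. Every number in it is a hypothetical placeholder, not a measurement, and the function name `ttfb_ms` and its parameters are made up for this illustration. It models a single request only; a real answer would reason about the p95 of a distribution.

```python
# All values are hypothetical placeholders in milliseconds, not measurements.
def ttfb_ms(dns, tcp, tls, request_rtt, server):
    """Sum the sequential phases that precede the first response byte."""
    return dns + tcp + tls + request_rtt + server

# Assumed baseline: cold connection straight to a distant origin
# (80 ms RTT, one round trip each for TCP and the TLS 1.3 handshake).
baseline = ttfb_ms(dns=40, tcp=80, tls=80, request_rtt=80, server=150)

# Assumed optimized case: reused connection to a nearby CDN edge
# (20 ms RTT) that serves the response from its 60-second cache.
optimized = ttfb_ms(dns=0, tcp=0, tls=0, request_rtt=20, server=5)

print(f"baseline TTFB:  {baseline} ms")
print(f"optimized TTFB: {optimized} ms")
```

Under these assumed inputs the baseline exceeds the 300 ms target and the edge-cached, warm-connection case falls well below it. The useful part is the structure of the breakdown, which shows which phases each optimization removes.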