Estimation Challenge: YouTube Daily Data Streamed
Estimate how much data YouTube streams to users worldwide in a typical 24-hour period. Walk through assumptions, calculations, sanity checks, and a final point estimate with a plausible range.
Constraints & Assumptions
- Treat this as a Fermi estimate; the goal is structured reasoning, not exact public reporting.
- State each assumption before using it.
- Use daily active users, average watch time, average bitrate, and overhead as the main drivers.
- Distinguish data delivered to end users from internal CDN or backbone traffic.
Clarifying Questions to Ask
- Should I estimate all YouTube surfaces, including Shorts, TV, music videos, ads, and livestreams?
- Are offline downloads included on the day they are downloaded?
- Should I count only user-delivered bytes or also internal replication and CDN fill traffic?
- Should I use decimal exabytes or binary exbibytes?
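The last question changes the reported number by roughly 13 percent, so it is worth settling early. A minimal sketch of the two conversions follows; the function and constant names are illustrative, not part of the prompt.

```python
# Decimal (SI) and binary (IEC) definitions of the two units.
BYTES_PER_EXABYTE = 10**18   # 1 EB
BYTES_PER_EXBIBYTE = 2**60   # 1 EiB


def to_exabytes(num_bytes: float) -> float:
    """Convert a byte count to decimal exabytes (EB)."""
    return num_bytes / BYTES_PER_EXABYTE


def to_exbibytes(num_bytes: float) -> float:
    """Convert a byte count to binary exbibytes (EiB)."""
    return num_bytes / BYTES_PER_EXBIBYTE


one_eb = 10**18
print(f"1 EB = {to_exbibytes(one_eb):.4f} EiB")  # prints: 1 EB = 0.8674 EiB
```

For a Fermi estimate the difference is well inside the error bars, but stating the unit avoids confusion when comparing against other figures.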
Part 1 - Build the Model
Define the formula and the core variables.
What This Part Should Cover
- Daily active users.
- Average watch time per active user.
- Weighted average bitrate across resolution, device, codec, and content type.
- Protocol, retransmission, thumbnail, and ad overhead if included.
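One way to combine these drivers into a single formula is sketched below. The symbol names are assumptions chosen for this example: U is daily active users, H is average watch hours per active user per day, B is the weighted average bitrate in bits per second, o is overhead as a fraction of payload, and D is bytes delivered per day.

```latex
D = U \times H \times 3600 \times \frac{B}{8} \times (1 + o)
```

The factor 3600 converts hours to seconds and the division by 8 converts bits to bytes. Because the model is a pure product, a given percentage error in any single driver carries through to D as the same percentage error.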
Part 2 - Calculate a Base Case
Use reasonable assumptions to compute a central estimate.
What This Part Should Cover
- Unit conversions from users to hours, seconds, bits, bytes, terabytes, petabytes, and exabytes.
- A transparent arithmetic path that is easy to audit.
- A clear point estimate.
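A base case along these lines could be computed as in the sketch below. Every input value is an illustrative assumption made for this example, not a figure reported by YouTube, and the names are hypothetical.

```python
# All inputs are illustrative assumptions, not reported YouTube figures.
DAILY_ACTIVE_USERS = 2.0e9     # assumed daily active users
WATCH_HOURS_PER_USER = 1.0     # assumed average watch hours per user per day
AVG_BITRATE_BPS = 2.0e6        # assumed weighted average bitrate, bits/second
OVERHEAD_FRACTION = 0.10       # assumed protocol, thumbnail, and ad overhead


def daily_bytes(users, hours_per_user, bitrate_bps, overhead):
    """Bytes delivered to end users per day under the stated assumptions."""
    watch_seconds = users * hours_per_user * 3600   # hours -> seconds
    payload_bits = watch_seconds * bitrate_bps      # seconds -> bits
    payload_bytes = payload_bits / 8                # bits -> bytes
    return payload_bytes * (1 + overhead)           # add overhead


total = daily_bytes(
    DAILY_ACTIVE_USERS, WATCH_HOURS_PER_USER, AVG_BITRATE_BPS, OVERHEAD_FRACTION
)
print(f"{total / 1e12:,.0f} TB/day")
print(f"{total / 1e15:,.0f} PB/day")
print(f"{total / 1e18:.2f} EB/day")
```

With these assumed inputs the point estimate comes out at roughly 2 EB per day. Keeping each conversion on its own line makes the arithmetic easy to audit step by step.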
Part 3 - Sensitivity and Sanity Checks
Bracket the estimate with low and high cases, then sanity-check the result.
What This Part Should Cover
- Sensitivity to users, watch time, and bitrate.
- Comparison to known-scale intuition such as billions of watch hours and video dominating consumer internet traffic.
- Discussion of adaptive bitrate, mobile versus TV, Shorts, 4K, ads, caching, and downloads.
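The bracketing step can be sketched by running the same product model over low, base, and high scenarios. The scenario values below are hypothetical inputs chosen only to illustrate the mechanics, and the 10 percent overhead default is likewise an assumption.

```python
def daily_exabytes(users, hours_per_user, bitrate_bps, overhead=0.10):
    """Decimal exabytes delivered per day; overhead default is an assumption."""
    total_bytes = users * hours_per_user * 3600 * bitrate_bps / 8 * (1 + overhead)
    return total_bytes / 1e18


# Hypothetical scenarios: (daily active users, hours per user, bitrate in bits/s).
SCENARIOS = {
    "low": (1.5e9, 0.75, 1.5e6),
    "base": (2.0e9, 1.0, 2.0e6),
    "high": (2.5e9, 1.5, 3.5e6),
}

results = {name: daily_exabytes(*inputs) for name, inputs in SCENARIOS.items()}

for name, exabytes in results.items():
    users, hours, bitrate = SCENARIOS[name]
    watch_hours = users * hours  # sanity check against total-watch-hours intuition
    print(f"{name:>4}: {watch_hours / 1e9:.2f}B watch hours -> {exabytes:.2f} EB/day")
```

Under these assumed inputs the range runs from roughly 0.8 to 6.5 EB per day. Printing total watch hours alongside each case gives a quick order-of-magnitude check on the user and watch-time assumptions before bitrate enters the picture.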
What a Strong Answer Covers
- Clean assumptions and correct unit handling.
- A plausible range, not false precision.
- Explicit treatment of what is counted and what is excluded.
- Sanity checks that show whether the answer is in the right order of magnitude.
Follow-up Questions
- Which assumption matters most?
- How would the estimate change if TV watch time doubled?
- How would AV1 adoption change the answer?
- Does CDN caching reduce the number you report?
- What data would you ask YouTube's analytics team for to improve the estimate?
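The TV follow-up is easiest to reason about with a per-device version of the model. The sketch below uses a hypothetical device split; the hours and bitrates are assumptions for illustration only, and overhead is left out to keep the comparison simple.

```python
# Hypothetical device split: (daily watch hours, average bitrate in bits/s).
SEGMENTS = {
    "mobile": (1.2e9, 1.2e6),
    "desktop": (0.3e9, 2.5e6),
    "tv": (0.5e9, 4.5e6),
}


def segmented_daily_bytes(segments):
    """Sum payload bytes per day across device segments (no overhead)."""
    return sum(hours * 3600 * bps / 8 for hours, bps in segments.values())


baseline = segmented_daily_bytes(SEGMENTS)

# Double TV watch hours while leaving the other segments unchanged.
doubled_tv = dict(SEGMENTS)
tv_hours, tv_bps = SEGMENTS["tv"]
doubled_tv["tv"] = (2 * tv_hours, tv_bps)
scenario = segmented_daily_bytes(doubled_tv)

print(f"baseline:   {baseline / 1e18:.2f} EB/day")
print(f"TV doubled: {scenario / 1e18:.2f} EB/day")
print(f"ratio:      {scenario / baseline:.2f}x")
```

In this sketch TV accounts for a quarter of watch hours but about half of the bytes, so doubling TV watch time raises the total by about 50 percent. The same structure can be reused for the AV1 question by lowering the assumed bitrate of the segments expected to adopt it.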