This is a classic system design question that tests your ability to handle real-time data at massive scale. Live streaming sits at the intersection of storage, networking, and distributed systems — and interviewers love it because there are so many interesting trade-offs to discuss.
Estimated time: 45 minutes
Before you touch a single architecture box, you need to nail down what "live streaming" actually means for this system. Ask your interviewer clarifying questions — it shows maturity and prevents you from designing the wrong thing.
The key insight here is that live streaming is fundamentally a write-once, read-millions problem. One broadcaster produces content that millions consume simultaneously.
Let's get a feel for the numbers. This helps you make informed decisions about architecture later.
Daily active users: 50 million. Concurrent viewers at peak: 10 million. Concurrent broadcasters at peak: 100,000. Average stream duration: 2 hours. Average viewers per stream: 100 (but top streams have millions).
Video is a bandwidth monster. 1080p runs at 6 Mbps, 720p at 3 Mbps, 480p at 1.5 Mbps, 360p at 0.8 Mbps.
Ingest bandwidth (broadcaster to our servers): 100,000 broadcasters x 6 Mbps = 600 Gbps ingest.
Egress bandwidth (our servers to viewers): 10 million viewers x 3 Mbps (average) = 30 Tbps egress.
That 30 Tbps number is exactly why CDNs exist. No single data center can push that much traffic.
If we record all streams: 100,000 streams x 2 hours x 6 Mbps = 540 TB per day of raw video. After transcoding to multiple qualities, roughly 2x that = ~1 PB/day. With 30-day retention for VOD: ~30 PB.
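If you want to sanity-check those numbers on paper (or in the interview), the arithmetic is simple enough to script. Here's a quick sketch using the assumptions above, with decimal units throughout:

```python
# Back-of-envelope estimates for bandwidth and storage.
# All inputs are the assumptions stated above, not measured values.

BROADCASTERS = 100_000      # concurrent broadcasters at peak
VIEWERS = 10_000_000        # concurrent viewers at peak
INGEST_MBPS = 6             # each broadcaster pushes 1080p at 6 Mbps
AVG_VIEW_MBPS = 3           # viewers average roughly 720p after adaptive bitrate
STREAM_HOURS = 2            # average stream duration
RETENTION_DAYS = 30         # VOD retention window

ingest_gbps = BROADCASTERS * INGEST_MBPS / 1_000        # 600 Gbps
egress_tbps = VIEWERS * AVG_VIEW_MBPS / 1_000_000       # 30 Tbps

# 6 Mbps = 0.75 MB/s; 2 hours = 7,200 s -> ~5.4 GB of raw video per stream.
raw_tb_per_day = BROADCASTERS * STREAM_HOURS * 3600 * (INGEST_MBPS / 8) / 1e6   # ~540 TB/day
# 2x for the extra renditions ≈ 1 PB/day; 30-day retention ≈ 30 PB.
stored_pb = raw_tb_per_day * 2 * RETENTION_DAYS / 1_000

print(f"ingest:  {ingest_gbps:.0f} Gbps")
print(f"egress:  {egress_tbps:.0f} Tbps")
print(f"raw:     {raw_tb_per_day:.0f} TB/day")
print(f"storage: {stored_pb:.0f} PB for {RETENTION_DAYS}-day VOD retention")
```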
The system is a pipeline: video flows from the broadcaster through several stages before reaching viewers.
The stages: RTMP ingest, chunking into 2-6 second segments, transcoding into multiple renditions, and delivery to viewers via HLS adaptive bitrate streaming.
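To make the delivery format concrete: HLS publishes a master playlist that advertises every rendition, and players switch between them at segment boundaries based on measured throughput. A minimal sketch that generates such a playlist for the bitrates assumed earlier (the stream IDs and paths are illustrative):

```python
# Illustrative HLS master playlist for the renditions assumed above.
# Players pick a variant from this list and can switch at segment
# boundaries -- that is what makes the bitrate "adaptive".

RENDITIONS = [
    # (name, bandwidth in bits/s, resolution)
    ("1080p", 6_000_000, "1920x1080"),
    ("720p",  3_000_000, "1280x720"),
    ("480p",  1_500_000, "854x480"),
    ("360p",    800_000, "640x360"),
]

def master_playlist(stream_id: str) -> str:
    lines = ["#EXTM3U", "#EXT-X-VERSION:3"]
    for name, bandwidth, resolution in RENDITIONS:
        lines.append(f"#EXT-X-STREAM-INF:BANDWIDTH={bandwidth},RESOLUTION={resolution}")
        lines.append(f"/live/{stream_id}/{name}/index.m3u8")  # per-rendition media playlist
    return "\n".join(lines) + "\n"

print(master_playlist("stream-123"))
```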
Delivery at scale relies on a tiered cache architecture with a shield (mid-tier) layer between origin and edge, plus a multi-CDN strategy for redundancy and global coverage.
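One way to see why the shield layer matters: every edge miss funnels through a shared mid-tier, so the origin sees roughly one fetch per segment instead of one per edge location. A rough sketch, with hypothetical in-memory caches standing in for real CDN tiers:

```python
# Sketch of a tiered cache lookup: edge -> shield -> origin.
# Cache and the origin callable are hypothetical stand-ins for real CDN
# components; the point is that the shield collapses many edge misses
# into a single origin fetch per segment.

class Cache:
    def __init__(self, name: str):
        self.name = name
        self.store: dict[str, bytes] = {}

    def get(self, key: str) -> bytes | None:
        return self.store.get(key)

    def put(self, key: str, value: bytes) -> None:
        self.store[key] = value

def fetch_segment(key: str, edge: Cache, shield: Cache, origin) -> bytes:
    # 1. Edge cache: serves the vast majority of viewers in-region.
    if (segment := edge.get(key)) is not None:
        return segment
    # 2. Shield / mid-tier: one shared layer in front of the origin.
    if (segment := shield.get(key)) is not None:
        edge.put(key, segment)
        return segment
    # 3. Origin: only the first request for a segment ever reaches here.
    segment = origin(key)
    shield.put(key, segment)
    edge.put(key, segment)
    return segment

edge, shield = Cache("edge-fra"), Cache("shield-eu")
seg = fetch_segment("stream-123/720p/seg_0042.ts", edge, shield, origin=lambda k: b"\x00" * 188)
```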
Chat runs over WebSocket connections, with fan-out strategies that scale from direct delivery to message sampling as viewer counts grow.
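One way to implement that scaling behavior: below a viewer threshold, broadcast every message to every connection; above it, deliver each message to a random subset so per-connection throughput stays bounded while chat still feels alive. A minimal sketch, with made-up threshold and sample rate:

```python
import random

# Sketch of viewer-count-aware chat fan-out. The 10,000-viewer threshold
# and 10% sample rate are illustrative, not tuned numbers.

SAMPLING_THRESHOLD = 10_000
SAMPLE_RATE = 0.10

def fan_out(message: str, connections: list, viewer_count: int) -> None:
    if viewer_count < SAMPLING_THRESHOLD:
        # Small rooms: every viewer sees every message.
        targets = connections
    else:
        # Huge rooms: each message reaches a random subset, so over time each
        # viewer sees a sampled slice of chat and per-connection load is bounded.
        k = max(1, int(len(connections) * SAMPLE_RATE))
        targets = random.sample(connections, k)
    for conn in targets:
        conn.send(message)  # assumes a WebSocket-like object exposing .send()
```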
VOD (recorded playback) is handled by an asynchronous pipeline that stitches segments, re-transcodes, generates thumbnails, and stores the output with lifecycle policies.
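Because none of this sits on the live path, the VOD work maps naturally onto an async task chain where each stage can run and be retried independently. A rough sketch of the stages (the function bodies are placeholders, not real transcoding):

```python
import asyncio

# Sketch of the VOD pipeline as an asynchronous task chain. Each stage stands
# in for a real job (ffmpeg, object-store writes, etc.); the structure is the
# point: stages run off the live path and can be retried independently.

async def stitch_segments(stream_id: str) -> str:
    # Concatenate the live segments into a single recording.
    return f"{stream_id}/recording.mp4"

async def transcode_renditions(recording: str) -> list[str]:
    # Re-transcode the recording into the VOD quality ladder.
    return [f"{recording}.{q}" for q in ("1080p", "720p", "480p", "360p")]

async def generate_thumbnails(recording: str) -> list[str]:
    return [f"{recording}.thumb.{i}.jpg" for i in range(3)]

async def store_with_lifecycle(artifacts: list[str], retention_days: int = 30) -> None:
    # Write to object storage and tag with a lifecycle policy, e.g. delete or
    # move to cold storage after the retention window.
    ...

async def process_vod(stream_id: str) -> None:
    recording = await stitch_segments(stream_id)
    renditions, thumbs = await asyncio.gather(
        transcode_renditions(recording),
        generate_thumbnails(recording),
    )
    await store_with_lifecycle(renditions + thumbs)

asyncio.run(process_vod("stream-123"))
```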
Key trade-offs: latency vs quality, consistency vs availability, cost vs performance. The interviewer wants to see you navigate a complex system with many moving parts and make reasonable decisions at each layer.