System Design: High-Volume Network I/O Backend (Files and Streaming)
Context
Design a backend service that supports millions of users uploading and downloading large files (hundreds of MB to multi-GB) and/or consuming streaming data. The service must be Internet-facing, production-grade, and cost-efficient.
Assume this is a 45–60 minute technical screen. You may make reasonable assumptions; state them explicitly.
Requirements
- Functional requirements
  - Upload large files with resumable/multipart support, pause/resume, checksums, and content-type validation.
  - Download files efficiently with range requests, partial reads, and CDN acceleration.
  - Optional: Support a streaming mode (e.g., live video/audio segments) using standard protocols.
  - Metadata management: create/list/get object metadata; versioning; soft delete; lifecycle policies.
  - Access control: per-tenant, per-user permissions; audit logging.
- Non-functional requirements
  - High availability across zones; durability appropriate for long-term storage.
  - Horizontal scalability to millions of users; low tail latency for control-plane APIs.
  - Cost-aware architecture (storage, egress, CDN, compute).
  - Security and privacy by default.
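To make the resumable/multipart upload requirement concrete, here is a minimal sketch of how a client might split an object into parts, checksum each part, and work out which parts to resend after a pause. The 8 MiB part size, the SHA-256 checksum, and the names `Part`, `plan_parts`, `part_checksum`, and `missing_parts` are illustrative assumptions, not part of the prompt.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass

# Assumption: 8 MiB parts. A real service would tune this against object size limits.
PART_SIZE = 8 * 1024 * 1024


@dataclass(frozen=True)
class Part:
    number: int  # 1-based part number
    offset: int  # byte offset of the part within the object
    length: int  # number of bytes in the part


def plan_parts(total_size: int, part_size: int = PART_SIZE) -> list[Part]:
    """Split an object of total_size bytes into consecutive fixed-size parts."""
    if total_size < 0 or part_size <= 0:
        raise ValueError("total_size must be >= 0 and part_size must be > 0")
    parts = []
    offset = 0
    number = 1
    while offset < total_size:
        length = min(part_size, total_size - offset)
        parts.append(Part(number, offset, length))
        offset += length
        number += 1
    return parts


def part_checksum(data: bytes) -> str:
    """Hex SHA-256 of one part (the checksum algorithm is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def missing_parts(
    plan: list[Part],
    acknowledged: dict[int, str],
    local_checksums: dict[int, str],
) -> list[int]:
    """Part numbers to (re)send: never acknowledged, or acknowledged with a
    checksum that differs from the one computed locally."""
    return [
        p.number
        for p in plan
        if acknowledged.get(p.number) != local_checksums[p.number]
    ]
```

On resume, the client would fetch the server's acknowledged part checksums, call `missing_parts`, and upload only those parts. Because each part is addressed by its number, resending a part is naturally idempotent.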
-
Deliverables
- Clearly state assumptions and SLOs.
- Define functional and non-functional requirements.
- Propose external APIs (endpoints, request/response) and core data models.
- Capacity estimates and back-of-the-envelope calculations.
- High-level architecture: load balancing, stateless services, storage, caching, queues, CDN.
- Strategies for horizontal scaling, performance optimizations, and cost controls.
- Security and privacy: authentication, authorization, encryption in transit/at rest, rate limiting, multi-tenant isolation.
- Failure handling: retries, idempotency, consistency choices, disaster recovery.
- Deployment, observability (logging/metrics/tracing), incident response, rollback.
- Corner cases (e.g., partial uploads, duplicate requests, slow clients, network partitions) and how to test them.
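For the capacity-estimate deliverable, a back-of-the-envelope calculation might look like the sketch below. Every input (daily active users, uploads per user, average object size, download-to-upload ratio, peak factor) is a hypothetical assumption that a candidate would state explicitly. The function `estimate_capacity` and its field names are illustrative only.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass(frozen=True)
class CapacityEstimate:
    daily_ingest_bytes: int    # new bytes stored per day
    yearly_growth_bytes: int   # raw storage growth per year, before replication
    avg_ingress_gbps: float    # average upload bandwidth
    avg_egress_gbps: float     # average download bandwidth (origin plus CDN)
    peak_egress_gbps: float    # egress at the assumed peak factor


def estimate_capacity(
    daily_active_users: int,
    uploads_per_user_per_day: int,
    avg_object_bytes: int,
    download_to_upload_ratio: float,
    peak_factor: float,
) -> CapacityEstimate:
    """Derive rough storage and bandwidth figures from stated assumptions."""
    daily_ingest = daily_active_users * uploads_per_user_per_day * avg_object_bytes
    # bytes/day -> bits/second -> gigabits/second
    avg_ingress_gbps = daily_ingest * 8 / SECONDS_PER_DAY / 1e9
    avg_egress_gbps = avg_ingress_gbps * download_to_upload_ratio
    return CapacityEstimate(
        daily_ingest_bytes=daily_ingest,
        yearly_growth_bytes=daily_ingest * 365,
        avg_ingress_gbps=avg_ingress_gbps,
        avg_egress_gbps=avg_egress_gbps,
        peak_egress_gbps=avg_egress_gbps * peak_factor,
    )


if __name__ == "__main__":
    # Hypothetical inputs: 1M daily uploaders, one 1 GB object each,
    # 10 downloads per upload, 3x peak-to-average traffic.
    print(estimate_capacity(1_000_000, 1, 10**9, 10.0, 3.0))
```

In an interview, the point of such a calculation is less the exact numbers than showing which assumption dominates. In this example egress bandwidth, not storage, is likely the first cost driver, which motivates the CDN.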