Design scalable media storage and delivery
Company: Meta
Role: Machine Learning Engineer
Category: System Design
Difficulty: hard
Interview Round: Onsite
Design a system to store and deliver user-generated photos and short videos at global scale.
Functional requirements: upload (single and multipart), read/download, delete, list, thumbnail and transcode generation, simple search by owner/metadata, signed access URLs, and per-user quotas.
Non-functional (“below the line”) topics to address explicitly: durability targets (e.g., 11 9s), availability/SLOs, multi-region data layout and replication strategy, consistency model for metadata vs. objects, CDN integration and cache invalidation, encryption in transit/at rest and key management, access control and authZ, abuse prevention and rate limiting, cost model and lifecycle policies (tiering, retention, cold storage, deletion), background processing semantics (idempotency/exactly-once), metadata schema and indexing, observability (metrics, tracing, audit logs), backfill/migrations, disaster recovery and regional failover, performance (p95/p99 latency targets), and testing/rollout plans.
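Of the background-processing semantics listed above, "exactly-once" is usually realized as at-least-once delivery plus deduplication by idempotency key. A toy sketch of that pattern (the job fields and the in-memory `processed_keys` set are illustrative; a real worker would commit the key to a durable store atomically with the side effect):

```python
# At-least-once queue delivery made effectively-once via an idempotency key.
processed_keys: set[str] = set()  # stand-in for a durable dedup table
results: list[str] = []           # stand-in for the actual side effect (e.g., a thumbnail write)

def handle_job(job: dict) -> None:
    key = job["idempotency_key"]
    if key in processed_keys:
        return  # duplicate delivery: side effect already applied, skip
    results.append(f"thumbnail:{job['object_id']}")
    processed_keys.add(key)  # in production, commit key + result in one transaction

# Simulate a redelivered message from the queue.
for job in [{"idempotency_key": "k1", "object_id": "o1"},
            {"idempotency_key": "k1", "object_id": "o1"},  # duplicate
            {"idempotency_key": "k2", "object_id": "o2"}]:
    handle_job(job)
```

The interview follow-up is usually where the dedup record lives (same database as the output, with what TTL) and what happens if the worker crashes between the side effect and the commit.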
Deliverables: an API sketch, a component diagram, storage choices (object store vs. block/file store, metadata store), and capacity planning with rough numbers.
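For the capacity-planning deliverable, a back-of-envelope calculation is expected. A sketch under assumed inputs (every number below is illustrative, not a Meta figure):

```python
# Back-of-envelope capacity estimate; all inputs are assumptions for illustration.
DAU = 500_000_000            # daily active users
uploads_per_dau = 2          # media objects uploaded per active user per day
avg_object_mb = 3            # blended photo/short-video size after transcode
replication_factor = 3       # full copies; erasure coding would cut this to ~1.5x
read_write_ratio = 100       # reads per write (delivery-heavy workload)

daily_objects = DAU * uploads_per_dau                      # 1e9 objects/day
daily_raw_tb = daily_objects * avg_object_mb / 1_000_000   # MB -> TB
daily_stored_tb = daily_raw_tb * replication_factor
write_qps = daily_objects / 86_400                         # average, not peak
read_qps = write_qps * read_write_ratio
```

With these inputs: ~1B objects/day, ~3 PB/day of raw media (~9 PB/day stored at 3x), ~12K average write QPS and ~1.2M average read QPS; a diurnal peak factor of 2-3x on top of the averages is a common refinement.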
Quick Answer: This question evaluates whether an engineer can design a globally distributed, highly scalable media storage and delivery system. It covers object storage and metadata modeling, CDN integration, replication and consistency strategies, security and access control, background processing, observability, quotas, and cost/performance trade-offs. It is commonly asked in system design interviews to assess reasoning about durability, availability, latency, and cost trade-offs, and it demands both high-level architectural thinking and practical implementation detail. Category: System Design; level of abstraction: conceptual architecture combined with practical application considerations.