System Design: Google Photos–like Service (Web + Mobile)
Context
Design a large-scale consumer media service that ingests, stores, indexes, and serves billions of photos and videos across mobile and desktop. Users expect near-instant uploads, fast viewing, powerful search, easy sharing, and seamless sync across devices with strong privacy.
Assume a modern cloud environment with globally distributed users.
Scale Assumptions (to ground trade-offs)
- Users: 100M MAU, 25M DAU.
- Uploads per DAU: ~5 photos/day, 0.1 videos/day → ~125M photos/day, 2.5M videos/day.
- Average sizes: photo 4 MB, video 50 MB.
- New data/day: ~0.5 PB photos + ~125 TB videos ≈ ~0.625 PB/day.
- Peak ingest: 3× daily average during regional evenings.
- Read-heavy: 10× more reads than writes; p95 photo load < 200 ms from cache.
You may adjust numbers slightly if needed; justify trade-offs.
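The figures above can be checked, and extended to rough request and bandwidth rates, with a few lines of arithmetic. The sketch below is only a back-of-envelope aid: it assumes decimal units (1 TB = 1e6 MB) and spreads uploads evenly over the day before applying the 3× peak factor. The function name `estimate_ingest` and the output keys are illustrative, not part of the brief.

```python
# Back-of-envelope ingest math from the stated scale assumptions.
# Assumption of this sketch: decimal units (1 TB = 1e6 MB, 1 PB = 1e3 TB).

DAU = 25_000_000
PHOTOS_PER_DAU = 5
VIDEOS_PER_DAU = 0.1
PHOTO_MB = 4
VIDEO_MB = 50
PEAK_FACTOR = 3
SECONDS_PER_DAY = 86_400


def estimate_ingest() -> dict:
    """Return daily volume plus average and peak ingest rates."""
    photos = DAU * PHOTOS_PER_DAU
    videos = DAU * VIDEOS_PER_DAU
    photo_tb = photos * PHOTO_MB / 1e6
    video_tb = videos * VIDEO_MB / 1e6
    total_tb = photo_tb + video_tb
    avg_uploads_per_s = (photos + videos) / SECONDS_PER_DAY
    # TB/day -> GB/day -> Gbit/day -> Gbit/s
    avg_gbit_per_s = total_tb * 1e3 * 8 / SECONDS_PER_DAY
    return {
        "photos_per_day": photos,
        "videos_per_day": videos,
        "photo_tb_per_day": photo_tb,
        "video_tb_per_day": video_tb,
        "total_pb_per_day": total_tb / 1e3,
        "avg_uploads_per_s": avg_uploads_per_s,
        "peak_uploads_per_s": avg_uploads_per_s * PEAK_FACTOR,
        "avg_gbit_per_s": avg_gbit_per_s,
        "peak_gbit_per_s": avg_gbit_per_s * PEAK_FACTOR,
    }


if __name__ == "__main__":
    for name, value in estimate_ingest().items():
        print(f"{name}: {value:,.3f}")
```

Under these assumptions the average works out to roughly 1,500 uploads/s and about 58 Gbit/s of ingress, with peaks at three times that. Changing any constant shows how sensitive the storage and bandwidth budget is to that assumption.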
Requirements
- Client upload workflows
  - Mobile: background/resumable uploads, low battery/data usage, offline queue, idempotency.
  - Desktop: bulk sync, folder watch, conflict resolution.
- Media ingestion pipeline
  - Edge upload, resumable sessions, virus/abuse scanning, metadata extraction (EXIF), deduplication via content hashes.
- Storage architecture
  - Object storage for originals and variants; hot/cold tiers; lifecycle policies; encryption.
- Delivery
  - CDN-backed delivery for images/thumbnails/video streaming; signed URLs; cache strategies.
- Sharing & permissions
  - Private-by-default; per-item/album ACLs; shareable links; link expiration; collaborative albums.
- Search & indexing
  - Filter by time/location/EXIF; text tags; face/object recognition; similarity search; privacy-preserving per-user indexes.
- Thumbnail/transforms
  - Multi-size thumbnails; server-side video transcoding (HLS/DASH); on-demand vs precompute trade-offs.
- Sync across devices
  - Delta sync; read-your-writes; notifications; conflict resolution.
- Privacy & compliance
  - E2E transport security; encryption at rest; data residency; GDPR/CCPA deletion; audit logging; safe content handling.
- Disaster recovery
  - Multi-region durability; RPO/RTO targets; backups; runbooks.
- Cost awareness
  - High-level capacity planning and monthly cost estimation; cost-reduction levers.
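To make the "deduplication via content hashes" and upload idempotency requirements concrete, here is a minimal in-memory sketch. It assumes SHA-256 as the content hash and scopes the dedup index per user, so a retry or re-upload by the same user returns the existing media ID while different users never learn whether someone else holds the same file. `MediaStore`, its fields, and the `m<N>` ID format are hypothetical names for illustration; a real system would back this with object storage and a metadata database.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class MediaStore:
    """Toy stand-in for object storage plus a per-user metadata index."""

    blobs: Dict[str, bytes] = field(default_factory=dict)  # content hash -> bytes
    user_index: Dict[Tuple[str, str], str] = field(default_factory=dict)  # (user, hash) -> media ID
    _next_id: int = 1

    def upload(self, user_id: str, data: bytes) -> Tuple[str, bool]:
        """Store data for user_id; return (media_id, created).

        Re-uploading identical bytes for the same user is idempotent:
        the existing media ID comes back and created is False.
        """
        digest = hashlib.sha256(data).hexdigest()
        key = (user_id, digest)
        if key in self.user_index:
            return self.user_index[key], False

        # The blob is stored once per distinct content hash, even when
        # several users reference it through their own media IDs.
        self.blobs.setdefault(digest, data)

        media_id = f"m{self._next_id}"
        self._next_id += 1
        self.user_index[key] = media_id
        return media_id, True


if __name__ == "__main__":
    store = MediaStore()
    print(store.upload("alice", b"photo-bytes"))  # first upload creates a record
    print(store.upload("alice", b"photo-bytes"))  # retry returns the same ID
    print(store.upload("bob", b"photo-bytes"))    # new record, shared blob
```

Whether to share blobs across users at all is a design choice: it saves storage but complicates deletion and encryption with per-user keys, which is one of the trade-offs the design should address.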
Deliverables
- High-level architecture and key components.
- API design (upload, list, get, search, share), idempotency, auth.
- Data model (logical schema) for users, media, albums, ACLs, indexes.
- Partitioning/sharding strategies.
- Consistency vs availability choices per workflow.
- Reasonable cost estimate and key optimizations.
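As one illustration of the level of detail an API design answer might reach for share links and signed delivery URLs, the sketch below signs a path with an expiry time using HMAC-SHA256 and verifies it on the serving side. This is only a sketch under stated assumptions: the `expires` and `sig` query parameters, the `SECRET` key, and the `sign_url`/`verify_url` helpers are hypothetical, and real CDNs each define their own signed-URL formats.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"demo-secret-not-for-production"  # hypothetical signing key


def _signature(path: str, expires: int) -> str:
    """HMAC-SHA256 over the path and expiry, hex-encoded."""
    msg = f"{path}?expires={expires}".encode()
    return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()


def sign_url(path: str, ttl_seconds: int, now: Optional[float] = None) -> str:
    """Return path with expires and sig query parameters appended."""
    current = time.time() if now is None else now
    expires = int(current + ttl_seconds)
    query = urlencode({"expires": expires, "sig": _signature(path, expires)})
    return f"{path}?{query}"


def verify_url(url: str, now: Optional[float] = None) -> bool:
    """Accept the URL only if it is unexpired and the signature matches."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, IndexError, ValueError):
        return False
    current = time.time() if now is None else now
    if current > expires:
        return False
    expected = _signature(parsed.path, expires)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(sig.encode(), expected.encode())


if __name__ == "__main__":
    url = sign_url("/media/m1/thumb", ttl_seconds=60)
    print(url, verify_url(url))
```

Because the signature covers both the path and the expiry, changing either invalidates the link. Revoking a link before it expires needs extra state (for example a per-link ID checked against the ACL store), which the answer should weigh against CDN cacheability.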