System Design: Cloud Document Storage and Sharing Service
Context
Design the backend for a large-scale cloud document storage and sharing service (similar to Google Drive). Assume web and mobile clients, a global user base, and a multi-AZ deployment within a single region to start, with a path to multi-region active-active.
Requirements
Cover the following end-to-end aspects:
- Functional requirements
  - Upload and download files
  - Folder hierarchy (create/move/rename)
  - Metadata and search (name, content, tags)
  - Sharing and ACLs (users, groups, link-based)
  - Versioning (restore, diff-friendly)
  - Trash and restore (soft delete, purge after retention)
- APIs
  - REST or gRPC endpoints for core operations
  - Chunked/resumable upload API
  - Permission management APIs
  - Search API
- Data model
  - Items (files/folders), versions, ACLs, shares, activities
  - Indices to support listing, search, and permission checks
- Storage architecture
  - Object store for blobs
  - Metadata store for items/ACLs/versions
  - Indexing/search system for content/name queries
- Consistency model
  - Read-after-write guarantees, eventual for search
  - Transactional semantics for rename/move/share within a user’s tree
- Partitioning and replication
  - Sharding strategy for metadata and search
  - Multi-AZ replication; path to multi-region
- Upload chunking and resumability
  - Initiation, part upload, completion, resume after failures
- Virus scanning and thumbnails
  - Asynchronous pipeline post-upload
- Security
  - Encryption in transit and at rest (key management, envelope encryption)
  - Permission checks (inherited ACLs, link tokens)
  - Rate limiting and abuse controls
- Monitoring and observability
- Scalability, availability, and cost considerations
- Rough capacity estimates and a high-level diagram
Provide concrete assumptions where needed, and justify key trade-offs.
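
As an illustration of the level of detail expected for the "Upload chunking and resumability" item, the following is a minimal sketch of the server-side bookkeeping for initiation, part upload, resume, and completion. It is not a prescribed answer. The names `UploadSession`, `put_part`, `missing_parts`, `complete`, and `split_into_parts` are hypothetical, state is kept in memory, and only part digests are recorded; a real design would persist session state in the metadata store and write part bytes to the object store.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class UploadSession:
    """Hypothetical server-side state for one resumable upload."""
    total_parts: int
    # part number -> SHA-256 hex digest of that part's bytes
    parts: dict[int, str] = field(default_factory=dict)

    def put_part(self, number: int, data: bytes) -> str:
        """Record a part idempotently; re-sending a part overwrites its entry."""
        if not 0 <= number < self.total_parts:
            raise ValueError(f"part {number} out of range")
        digest = hashlib.sha256(data).hexdigest()
        self.parts[number] = digest
        return digest

    def missing_parts(self) -> list[int]:
        """Parts still to be sent; a resume call could return this list."""
        return [n for n in range(self.total_parts) if n not in self.parts]

    def complete(self) -> str:
        """Finalize only when every part is present.

        Returns a digest computed over the ordered part digests.
        """
        missing = self.missing_parts()
        if missing:
            raise RuntimeError(f"cannot complete, missing parts: {missing}")
        combined = "".join(self.parts[n] for n in range(self.total_parts))
        return hashlib.sha256(combined.encode()).hexdigest()


def split_into_parts(blob: bytes, part_size: int) -> list[bytes]:
    """Client-side helper: cut a blob into fixed-size parts (last may be shorter)."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    return [blob[i:i + part_size] for i in range(0, len(blob), part_size)]
```

A client that loses its connection mid-upload would ask the session for `missing_parts()` and send only those parts before calling `complete()`. Because `put_part` is idempotent, retrying a part whose acknowledgement was lost is safe.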