System Design: Production-Grade File Storage Service
Problem
Design a production-grade file storage service with the following APIs, semantics, and constraints.
APIs
- addFile(String path)
  - Creates any missing folders and stores a file at a path like "path/to/somewhere/file.txt".
- list(String path)
  - Returns the immediate children (files and folders) of the directory at the given path.
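To make the two calls concrete, here is a minimal in-memory sketch of their basic behavior. It is only an illustration under stated assumptions: the class name `InMemoryNamespace`, the trailing "/" used to mark folders in listings, and the choice to throw on a missing directory are all hypothetical, and capacity limits, renaming, persistence, and multi-tenancy are deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.TreeMap;

/** Hypothetical single-process model of the namespace; not a production design. */
final class InMemoryNamespace {
    private static final class Node {
        final boolean isDirectory;
        final Map<String, Node> children = new TreeMap<>(); // sorted for stable listings
        Node(boolean isDirectory) { this.isDirectory = isDirectory; }
    }

    private final Node root = new Node(true);

    // Splits "a/b/c" into segments, ignoring empty segments from extra slashes.
    private static List<String> split(String path) {
        List<String> parts = new ArrayList<>();
        for (String part : path.split("/")) {
            if (!part.isEmpty()) parts.add(part);
        }
        return parts;
    }

    /** Creates any missing folders, then records the file as the last segment. */
    synchronized void addFile(String path) {
        List<String> parts = split(path);
        if (parts.isEmpty()) throw new IllegalArgumentException("path has no file name");
        Node current = root;
        for (int i = 0; i < parts.size() - 1; i++) {
            Node next = current.children.computeIfAbsent(parts.get(i), k -> new Node(true));
            if (!next.isDirectory) {
                throw new IllegalArgumentException(parts.get(i) + " is a file, not a folder");
            }
            current = next;
        }
        // Assumption: an existing entry with the same name is left untouched here.
        current.children.putIfAbsent(parts.get(parts.size() - 1), new Node(false));
    }

    /** Returns the immediate children only; folders carry a trailing "/". */
    synchronized List<String> list(String path) {
        Node current = root;
        for (String part : split(path)) {
            current = current.children.get(part);
            if (current == null || !current.isDirectory) {
                throw new NoSuchElementException("not a directory: " + path);
            }
        }
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Node> entry : current.children.entrySet()) {
            names.add(entry.getValue().isDirectory ? entry.getKey() + "/" : entry.getKey());
        }
        return names;
    }
}
```

After `addFile("path/to/somewhere/file.txt")`, listing the root returns only `path/`, and listing `path/to/somewhere` returns only `file.txt`; nothing deeper than one level is ever returned.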
Constraints and Semantics
- Per-directory capacity limit: each directory can contain at most 5 entries (files + folders). Attempts to exceed this limit must be atomically rejected.
- Name collision handling: duplicate file names in the same directory are auto-renamed using OS-style suffixes, e.g., base (1).ext, base (2).ext. Inputs that already contain such suffixes must be handled correctly (do not double-suffix; accept if free; otherwise continue numbering).
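The two rules above can be sketched for a single directory as follows. This is one possible reading, not a reference solution: `DirectorySketch`, `resolveName`, and the use of a Java monitor lock are assumptions made for illustration. In a real service the same check-then-insert step would have to run inside whatever atomic unit the metadata store offers (for example a transaction scoped to the parent directory). Treating only the text after the last dot as the extension is also an assumption.

```java
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical in-memory stand-in for the entries of one directory. */
final class DirectorySketch {
    static final int MAX_ENTRIES = 5; // per-directory limit from the constraints

    // Matches a stem that already ends in an OS-style counter, e.g. "base (2)".
    private static final Pattern COUNTER = Pattern.compile("(.+) \\((\\d{1,9})\\)");

    private final Set<String> names = new HashSet<>();

    /**
     * Reserves a name, or returns empty if the directory is full.
     * The capacity check, name resolution, and insert share one lock, so two
     * concurrent callers cannot both take the last free slot.
     */
    synchronized Optional<String> add(String requested) {
        if (names.size() >= MAX_ENTRIES) {
            return Optional.empty(); // rejected; nothing was modified
        }
        String name = resolveName(requested, names);
        names.add(name);
        return Optional.of(name);
    }

    /** Returns the requested name if free, else the next free "base (n).ext". */
    static String resolveName(String requested, Set<String> taken) {
        if (!taken.contains(requested)) {
            return requested; // also covers inputs like "base (3).ext" that are free
        }
        int dot = requested.lastIndexOf('.');
        String stem = dot > 0 ? requested.substring(0, dot) : requested;
        String ext = dot > 0 ? requested.substring(dot) : "";

        String base = stem;
        int counter = 0;
        Matcher m = COUNTER.matcher(stem);
        if (m.matches()) {
            // Input already carries a counter: continue from it, do not double-suffix.
            base = m.group(1);
            counter = Integer.parseInt(m.group(2));
        }
        String candidate;
        do {
            counter++;
            candidate = base + " (" + counter + ")" + ext;
        } while (taken.contains(candidate));
        return candidate;
    }
}
```

With this sketch, adding "report.txt" twice yields "report.txt" and then "report (1).txt"; adding "report (1).txt" afterwards yields "report (2).txt" rather than "report (1) (1).txt"; and a sixth add to the same directory returns an empty result without changing its contents.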
What to Describe
- Overall architecture: API layer, metadata service, content store.
- Metadata schema and store choice (relational vs NoSQL).
- How to enforce the per-directory capacity limit and renaming atomically under concurrent requests.
- Transaction boundaries and idempotency.
- Content-addressed vs location-addressed storage and how file bytes are stored (e.g., object storage references).
- Handling large files (streaming/resumable uploads).
- Consistency model and failure/rollback.
- Scalability (partitioning keys, sharding, caching).
- Observability and rate/quota enforcement.
- Data lifecycle (retention, deletion, versioning).
- Security (authn/authz, path traversal protection, encryption in transit/at rest).
- Define key SLIs/SLOs.
Assume a single logical namespace with a root directory and multi-tenant users. You may add minimal endpoints (e.g., upload sessions) to make large-file handling realistic.
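Because every request addresses one rooted namespace, a design will likely need to canonicalize incoming paths before any lookup, which is also where path traversal protection can live. The sketch below shows one possible approach. `PathNormalizer` and its rules (collapse empty and "." segments, reject ".." and NUL characters outright) are assumptions chosen for illustration, not requirements stated in the problem.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that maps a raw client path to a canonical rooted path. */
final class PathNormalizer {
    private PathNormalizer() {}

    /**
     * Returns a canonical path such as "/path/to/file.txt".
     * Throws IllegalArgumentException for inputs that try to escape the root.
     */
    static String normalize(String rawPath) {
        if (rawPath == null) {
            throw new IllegalArgumentException("path is null");
        }
        List<String> segments = new ArrayList<>();
        for (String segment : rawPath.split("/")) {
            if (segment.isEmpty() || segment.equals(".")) {
                continue; // collapse "//" and "./"
            }
            if (segment.equals("..")) {
                // Assumption: reject parent references instead of resolving them.
                throw new IllegalArgumentException("parent references are not allowed");
            }
            if (segment.indexOf('\0') >= 0) {
                throw new IllegalArgumentException("NUL character in path");
            }
            segments.add(segment);
        }
        return "/" + String.join("/", segments);
    }
}
```

Rejecting ".." rather than resolving it keeps the rule simple to audit: a path that passes `normalize` can only name something at or below the root. A multi-tenant deployment would still need an authorization check on the canonical path, which this sketch does not attempt.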