In-Memory Cloud Storage Service (Take-home)
Design and implement an in-memory cloud storage service that maps files to their metadata. The project is staged in levels; deliverables at each level should build on prior levels.
Assume a single-process, in-memory store (no persistence to disk). Focus on clear APIs, correctness, complexity analysis, and a clean core implementation.
Level 1 — Core CRUD
Provide APIs to:
- Add a file (store at least: name, size, and arbitrary metadata)
- Retrieve a file by identifier
- Delete a file by identifier
Requirements:
- Define request/response schemas (what arguments each API accepts and returns). Return created identifiers where applicable.
- Define error handling: how not-found, invalid-input, and conflict errors are surfaced.
- Specify time and space complexities for each operation.
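One possible shape for the Level 1 API is sketched below. It is only an illustration under stated assumptions: the names FileStore, FileRecord, NotFoundError, and InvalidInputError are hypothetical, identifiers are assumed to be integers issued by a counter, and errors are assumed to be surfaced as exceptions rather than return codes.

```python
import itertools
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


class NotFoundError(KeyError):
    """Hypothetical error: the file identifier is unknown."""


class InvalidInputError(ValueError):
    """Hypothetical error: the arguments failed validation."""


@dataclass
class FileRecord:
    file_id: int
    name: str
    size: int
    metadata: Dict[str, Any] = field(default_factory=dict)


class FileStore:
    """Dict-backed store: O(1) average time per operation, O(n) space."""

    def __init__(self) -> None:
        self._files: Dict[int, FileRecord] = {}
        self._ids = itertools.count(1)  # assumption: ids are sequential ints

    def add_file(self, name: str, size: int,
                 metadata: Optional[Dict[str, Any]] = None) -> int:
        if not name or size < 0:
            raise InvalidInputError("name must be non-empty and size >= 0")
        file_id = next(self._ids)
        # Copy the metadata so later caller-side mutation does not leak in.
        self._files[file_id] = FileRecord(file_id, name, size, dict(metadata or {}))
        return file_id

    def get_file(self, file_id: int) -> FileRecord:
        try:
            return self._files[file_id]
        except KeyError:
            raise NotFoundError(file_id) from None

    def delete_file(self, file_id: int) -> None:
        if self._files.pop(file_id, None) is None:
            raise NotFoundError(file_id)
```

In this sketch every operation is a single hash-map access, so add, retrieve, and delete are O(1) on average; a submission could equally return result objects instead of raising, as long as the choice is documented.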
Level 2 — Top-k Largest Files
Add APIs to query:
- The k largest files globally
- The k largest files per user (if users exist in your model at this point; otherwise introduce them here)
Requirements:
- Define tie-breakers when sizes are equal (e.g., by file name, then by identifier).
- Provide complexity and data structure choices.
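As an illustration of one data structure choice, the sketch below answers a top-k query with a bounded heap via heapq.nsmallest, which runs in O(n log k) time and O(k) extra space. The FileInfo tuple and the top_k_largest function are hypothetical names, and the tie-breaker (name, then identifier) simply follows the example given in the requirement above.

```python
import heapq
from typing import Iterable, List, NamedTuple, Optional


class FileInfo(NamedTuple):
    file_id: int
    name: str
    size: int
    owner: str  # assumption: every file records its owning user


def top_k_largest(files: Iterable[FileInfo], k: int,
                  owner: Optional[str] = None) -> List[FileInfo]:
    """Return the k largest files, optionally restricted to one owner.

    Ordering: larger size first; ties broken by name, then by file_id.
    """
    if k <= 0:
        return []
    candidates = (f for f in files if owner is None or f.owner == owner)
    # Negating size turns "largest first" into a smallest-key query.
    return heapq.nsmallest(k, candidates,
                           key=lambda f: (-f.size, f.name, f.file_id))
```

A full scan per query is the simplest correct option. If queries dominate writes, an index kept sorted on the same key (for example, one sorted structure per user) would trade slower inserts and deletes for faster queries.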
Level 3 — Users and Quotas; Account Merge
Add users and enforce storage capacity limits (quotas):
- Users have a capacity limit (bytes). Adding files must respect the user’s quota.
- Implement merging two users’ accounts.
Requirements:
- Define how to resolve conflicts during merge, including duplicate file identifiers or names.
- Define whether merge is all-or-nothing or best-effort and justify your choice.
- Provide complexities.
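The following sketch shows one way quota enforcement and an all-or-nothing merge could fit together. Every policy in it is an assumption chosen for illustration, not a required answer: files are keyed by name per user, the merged account keeps the target's capacity, a source file whose name collides is renamed with a numeric suffix, and the merge is rejected without side effects if the combined usage exceeds the quota. The names User, QuotaExceededError, and merge_users are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


class QuotaExceededError(Exception):
    """Hypothetical error: the operation would exceed the user's capacity."""


@dataclass
class User:
    user_id: str
    capacity: int                                         # bytes
    files: Dict[str, int] = field(default_factory=dict)   # name -> size
    used: int = 0                                         # invariant: sum of sizes

    def add_file(self, name: str, size: int) -> None:
        if name in self.files:
            raise ValueError(f"duplicate file name: {name}")
        if self.used + size > self.capacity:
            raise QuotaExceededError(f"{self.user_id} would exceed quota")
        self.files[name] = size
        self.used += size


def merge_users(target: User, source: User) -> None:
    """Move all of source's files into target, all-or-nothing.

    The merged state is planned on a copy first; nothing is mutated
    unless every check passes. O(n + m) time and space.
    """
    merged = dict(target.files)
    for name, size in source.files.items():
        new_name, suffix = name, 1
        while new_name in merged:          # assumed policy: rename on collision
            new_name = f"{name} ({suffix})"
            suffix += 1
        merged[new_name] = size
    total = target.used + source.used
    if total > target.capacity:            # assumed policy: target's quota applies
        raise QuotaExceededError("merged files exceed target capacity")
    # Commit step: only reached when validation succeeded.
    target.files, target.used = merged, total
    source.files, source.used = {}, 0
```

Planning on a copy and committing at the end is what makes the merge all-or-nothing. A best-effort variant would instead move files one at a time and report the ones it skipped.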
Level 4 — Backup and Restore
Support backing up and restoring a user’s files.
Requirements:
- Define snapshot semantics: point-in-time vs. rolling/incremental.
- Describe storage overhead trade-offs for your snapshot design (e.g., deep copy vs. copy-on-write).
- Define restore policy: replace vs. merge behavior, conflict handling, and quota enforcement.
- Provide complexities.
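To make the snapshot choices concrete, here is a minimal sketch of the simplest combination: point-in-time snapshots taken by deep copy, with a replace-style restore that is checked against the user's current quota. The BackupManager class, its method names, and the file layout (name mapped to a dict with "size" and "metadata" keys) are all assumptions made for this example.

```python
import copy
from typing import Any, Dict, List

FileMap = Dict[str, Dict[str, Any]]  # name -> {"size": int, "metadata": {...}}


class BackupManager:
    """Point-in-time snapshots by deep copy.

    backup and restore are both O(n) in the size of the user's file map;
    each snapshot costs O(n) extra space.
    """

    def __init__(self) -> None:
        self._snapshots: Dict[str, List[FileMap]] = {}

    def backup(self, user_id: str, files: FileMap) -> int:
        """Store an independent copy of files and return its snapshot id."""
        snaps = self._snapshots.setdefault(user_id, [])
        snaps.append(copy.deepcopy(files))
        return len(snaps) - 1

    def restore(self, user_id: str, snapshot_id: int, capacity: int) -> FileMap:
        """Return a copy of the snapshot to replace the user's current files.

        Assumed policy: replace (not merge), rejected when the snapshot
        no longer fits within the user's current capacity.
        """
        snaps = self._snapshots.get(user_id, [])
        if not 0 <= snapshot_id < len(snaps):
            raise KeyError(f"no snapshot {snapshot_id} for user {user_id}")
        snapshot = snaps[snapshot_id]
        total = sum(f["size"] for f in snapshot.values())
        if total > capacity:
            raise ValueError("snapshot exceeds current quota")
        # Copy again so the stored snapshot stays immutable after restore.
        return copy.deepcopy(snapshot)
```

Deep copies are easy to reason about but cost O(n) space per snapshot. A copy-on-write design would share unchanged file records between snapshots and the live state, reducing storage at the price of requiring immutable records or reference tracking.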
Deliverables
- State and justify your data structures and invariants.
- Provide time/space complexity for each operation.
- Implement core methods covering all levels. Pseudocode or real code is acceptable; clarity and correctness matter more than language choice.
- Persistence to a real filesystem is NOT required.