PracHub
QuestionsCoachesLearningGuidesInterview Prep
|Home/System Design/Meta

Design an in-memory cloud storage system

Last updated: Jun 24, 2026

Quick Overview

This question evaluates system design competencies including API design, in-memory data modeling, efficient data structures for top-k queries, quota and merge semantics, snapshot and restore strategies, and time/space complexity analysis.

  • hard
  • Meta
  • System Design
  • Software Engineer

Design an in-memory cloud storage system

Company: Meta

Role: Software Engineer

Category: System Design

Difficulty: hard

Interview Round: Take-home Project

Design and implement an in-memory cloud storage service that maps files to their metadata. Level 1: Provide APIs to add a file, retrieve a file, and delete a file by identifier; store at least name, size, and arbitrary metadata; specify return values and error handling. Level 2: Support querying the k largest files globally and, if applicable, per user; define tie-breakers for equal sizes. Level 3: Add users with storage capacity limits; enforce quotas on adds; implement merging of two users’ accounts and define how to resolve duplicate file identifiers or names. Level 4: Support backing up and restoring a user’s files; define snapshot semantics (point-in-time vs. rolling), storage overhead, and restore conflict policy. State and justify data structures, provide time/space complexities for each operation, and implement core methods. Persistence to a real filesystem is not required.

Quick Answer: This question evaluates system design competencies including API design, in-memory data modeling, efficient data structures for top-k queries, quota and merge semantics, snapshot and restore strategies, and time/space complexity analysis.

Related Interview Questions

  • Design Top-K, Crawler, and Chess Systems - Meta (hard)
  • Design Search And Web Crawling Systems - Meta (medium)
  • Design an Instagram-Style Social Feed - Meta (medium)
  • Design an Online Game Leaderboard - Meta (hard)
  • Design an On-Demand Delivery Platform - Meta (medium)
|Home/System Design/Meta

Design an in-memory cloud storage system

Meta logo
Meta
Sep 6, 2025, 12:00 AM
hardSoftware EngineerTake-home ProjectSystem Design
14
0

In-Memory Cloud Storage Service (Take-home)

Design and implement an in-memory cloud storage service that maps files (objects) to their metadata. The system maintains files and information about them — at minimum a name, a size, and arbitrary metadata. The project is staged in four levels; the deliverables at each level build on the prior ones.

This is a single-process, in-memory store: there is no real filesystem, no disk persistence, and no network. Focus on clear APIs, correctness, time/space complexity, and clean, runnable core implementations. Each level should be implemented so that all of its behaviors are exercisable in code (pseudocode or a real language is fine).

Constraints & Assumptions

  • In-memory only. No persistence to a real filesystem is required. You never store the file bytes — size is just an integer; only metadata is kept.
  • Single process / single thread. You do not need to implement concurrency control, but you should note where locking would go.
  • Metadata is an arbitrary key→value map supplied by the caller; you do not interpret its contents.
  • The levels are cumulative: by Level 3 your model has users; Levels 2 and 4 must be consistent with that model.
  • Every operation must come with a stated time and space complexity , and every mutation must preserve clearly-stated invariants (e.g. per-user byte usage equals the sum of that user's file sizes).
  • Prefer deterministic behavior everywhere ordering or conflict-resolution is involved (no reliance on hash/iteration order).

Clarifying Questions to Ask

  • How are file identifiers assigned — server-minted (e.g. UUIDs) or caller-supplied? This decides whether "duplicate identifier" is even a possible conflict.
  • Are file names unique globally, unique per user, or not unique at all? This determines what a "name conflict" means during merge and restore.
  • Are files mutable after creation (is there an update API), or is re-upload modeled as delete + add?
  • Should get / delete on a missing identifier raise an error or be idempotent (return null / false)?
  • For top-k , do callers want the result strictly ordered, and what is the required tie-break when sizes are equal?
  • For merge and restore , is the expected default all-or-nothing (atomic) or best-effort (apply what fits) , and how should quota interact with each?

Part 1 — Core CRUD

Provide APIs to:

  1. Add a file — store at least its name, size, and arbitrary metadata; return the created identifier.
  2. Retrieve a file by identifier.
  3. Delete a file by identifier.

Define the request/response schema for each API (arguments in, values out), the error model (how not-found, invalid input, and conflicts such as a duplicate name are surfaced), and the time/space complexity of each operation.

What This Part Should Cover

  • A precise schema for each call and which fields are required vs optional; what add returns.
  • A typed error model that distinguishes invalid input, not-found, and conflict (and a clear choice of whether get/delete are lenient/idempotent or raise).
  • Correct O(1)O(1)O(1) get/delete and the cost of add, with the invariants that the indexes maintain.

Part 2 — Top-k Largest Files

Add APIs to query:

  1. The k largest files globally .
  2. The k largest files per user .

Specify the tie-breaker when two files have equal size (e.g. by name, then by identifier) so the result is a total, deterministic order, and justify your data-structure and complexity choices.

What This Part Should Cover

  • A fully specified, deterministic tie-break order and how the chosen structure realizes it.
  • The data structure for global vs per-user top-k and the per-query complexity in terms of kkk and nnn .
  • How deletes (and, later, merges that re-home a file) are reconciled with the index so a query never returns a stale or duplicated entry.

Part 3 — Users and Quotas; Account Merge

Introduce users, each with a storage capacity limit (quota, in bytes). Adding a file must respect the owner's quota. Then implement merging two users' accounts (move one user's files into another).

Define how merge resolves conflicts — including duplicate file identifiers and duplicate names — and decide whether merge is all-or-nothing or best-effort, justifying the choice. Provide complexities.

What This Part Should Cover

  • Quota enforced on every add and on the merge as a whole, with an explicit usage invariant.
  • A conflict policy that names which conflicts are even possible (identifier vs name) and resolves names deterministically.
  • A justified atomic-vs-best-effort choice, with a pre-mutation validation pass for the atomic path, and what the merge returns (e.g. moved / renamed / skipped counts).

Part 4 — Backup and Restore

Support backing up and restoring a user's files.

Define the snapshot semantics (point-in-time vs. rolling/incremental), the storage-overhead trade-off of your snapshot design (e.g. deep copy vs. copy-on-write), and the restore policy: replace vs. merge behavior, conflict handling, and quota enforcement. Provide complexities.

What This Part Should Cover

  • Clear snapshot semantics and an honest accounting of what is copied vs shared, with the overhead trade-off (deep copy vs COW/incremental).
  • A replace vs merge restore policy with explicit conflict handling and a quota pre-check, applied before any destructive mutation.
  • The correctness argument for how restore interacts with an intervening merge (id re-homing) so no other user's account is corrupted.

What a Strong Answer Covers

Across all four levels, a strong submission ties the levels together with a single coherent model rather than four disconnected features:

  • A small set of data structures (primary store + per-user indexes + ordering index) reused across all levels, each justified by the operation it makes cheap.
  • Explicit invariants (usage = sum of sizes; name uniqueness within a user; exactly one owner per live id) that every mutation in Parts 1–4 provably preserves.
  • Consistent complexity reporting in terms of nnn (total files), uuu (a user's files), and kkk , for every operation.
  • The cross-cutting correctness traps that only appear when levels interact: a merge that re-homes a file must not let top-k double-count it; a restore must not rebind an id a merge moved away; quota checks must be net-of-frees.
  • A clear error model and deterministic conflict resolution shared by add, merge, and restore.

Follow-up Questions

  • How would you make this thread-safe ? Where do the locks go, and how do you avoid deadlock when merge touches two users at once?
  • How would you scale beyond one process ? Sketch sharding by owner, how global top-k is computed across shards, and what merge/restore become.
  • If snapshots are taken frequently and metadata is large, when would you switch from deep-copy snapshots to copy-on-write or incremental snapshots, and what does that cost at restore time?
  • How would you support an update-file API (change size/metadata) while keeping the usage invariant and the top-k index correct?

Submit Your Answer to Earn 20XP

Sign in to leave a comment

Loading comments...

Browse More Questions

More System Design•More Meta•More Software Engineer•Meta Software Engineer•Meta System Design•Software Engineer System Design

Your design canvas — auto-saved

PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • AI Coding Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.