How do I practice coding and algorithm questions?

Use PracHub's coding console to write, test, and debug your solutions in Python or JavaScript. View hints, test against sample inputs, and compare with official solutions.

What difficulty level is this coding question?

This is a medium difficulty Coding & Algorithms question, commonly asked during Take-home Project rounds at Meta.

What role is this question designed for?

This question is commonly asked for Software Engineer candidates at Meta during technical interviews.

Extend cloud file system with copy and compression

Quick Overview

This question evaluates understanding of data structures and algorithms for stateful resource management, covering bookkeeping of per-user quotas, file metadata operations (copy, compress/decompress), string filtering and sorting, and efficient selection and deletion of largest elements.

Extend cloud file system with copy and compression

Company: Meta

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Take-home Project

## In-Memory Cloud File System V2: Copy, Capacity Updates, Compress/Decompress Design an in-memory file system with per-user quotas and additional operations. ### Data model / rules - Each file has: - `name` (unique string) - `size` (positive int) - `owner` (user id) - Each user has: - `capacity` - `used` - A user may only own files such that `used <= capacity`. ### API 1. `add_user(user_id, capacity) -> bool` - Create a user with `used = 0`. 2. `add_file_by(user_id, name, size) -> bool` - Add a new file owned by `user_id`. - Fail if user missing, name exists, or would exceed capacity. 3. `add_file(name, size) -> bool` - Add a file owned by a special `admin` user with effectively unlimited capacity (or explicitly create admin with very large capacity). 4. `copy_file(from_name, to_name) -> bool` - Duplicate an existing file to a new name. - The copy has the same size and owner as the source. - Fail if `from_name` missing, `to_name` already exists, or the owner’s capacity would be exceeded. 5. `get_file_size(name) -> int | null` - Return size or `null` if missing. 6. `get_n_largest(prefix, suffix, n) -> list<string>` - Filter files with `name` starting with `prefix` and ending with `suffix`. - Return up to `n` formatted: `"name(size)"`. - Sort by size desc, then name asc. 7. `update_capacity(user_id, new_capacity) -> bool` - Update the user’s capacity. - If `used > new_capacity`, repeatedly delete the user’s largest files (tie-break by name asc/desc—state your tie-break clearly) until `used <= new_capacity`. - Return `false` if user missing. 8. `compress_file(name) -> bool` - Convert a file into a compressed version: - Only if `name` exists and does **not** already represent a compressed file. - New file name becomes `name + "compress"` (or another fixed suffix; be consistent). - New size becomes `floor(old_size / 2)`. - Ownership stays the same. - Replace the original (remove old name, add new name). - Update the owner’s `used` accordingly. 9. `decompress_file(name) -> bool` - Only valid if `name` ends with the compression suffix. - New name removes the suffix; new size becomes `old_size * 2`. - Fail if the decompressed name already exists or if the owner would exceed capacity. - Replace the compressed file with the decompressed one. ### Notes - Aim for correct bookkeeping of `used` and efficient deletion of largest files (e.g., heap + lazy deletion). - Assume up to ~1e5 operations.

Quick Answer: This question evaluates understanding of data structures and algorithms for stateful resource management, covering bookkeeping of per-user quotas, file metadata operations (copy, compress/decompress), string filtering and sorting, and efficient selection and deletion of largest elements.

You are given a list of queries to execute against an in-memory cloud file system. Each live file has a unique name, a size, and an owner. Each user has a capacity and currently used space. A reserved user named "admin" exists from the start with capacity 10^18. Process every query and return the result of that API call in order. Use the fixed compression suffix ".cmp". A file is considered compressed if and only if its name ends with this suffix. Operations: - add_user(user_id, capacity) -> bool Create a new user with used = 0. Return false if the user already exists. - add_file_by(user_id, name, size) -> bool Add a new file owned by user_id. Return false if the user does not exist, the name already exists, or adding it would exceed the user's capacity. - add_file(name, size) -> bool Same as add_file_by, but the owner is admin. - copy_file(from_name, to_name) -> bool Duplicate an existing file. The copy keeps the same size and owner. Return false if the source does not exist, the destination already exists, or the owner's capacity would be exceeded. - get_file_size(name) -> int | None Return the file's size, or None if it does not exist. - get_n_largest(prefix, suffix, n) -> list[str] Consider only files whose names start with prefix and end with suffix. Return up to n strings formatted as "name(size)", sorted by size descending, then name ascending. - update_capacity(user_id, new_capacity) -> bool Update the user's capacity. If used space becomes too large, repeatedly delete that user's largest file until used <= capacity. If multiple files have the same largest size, delete the lexicographically smaller name first. Return false only if the user does not exist. - compress_file(name) -> bool Only valid if the file exists and its name does not already end with ".cmp". Replace it with a file named name + ".cmp" and size floor(old_size / 2). Ownership stays the same. Return false if the target name already exists. - decompress_file(name) -> bool Only valid if the file exists and its name ends with ".cmp". Replace it with the name obtained by removing the suffix, and size old_size * 2. Return false if the target name already exists or the owner's capacity would be exceeded. All sizes given in add operations are positive integers. Compression may produce size 0 if a size-1 file is compressed.

Constraints

0 <= len(queries) <= 10^5
User capacities and file sizes in add operations are integers in the range [0, 10^9], and file names are unique among live files
Use lexicographically smaller name first when update_capacity must delete among files with equal size

Examples

Input: [('add_user', 'u1', 15), ('add_file_by', 'u1', 'fileA', 4), ('add_file_by', 'u1', 'fileB', 5), ('copy_file', 'fileA', 'fileA_copy'), ('get_file_size', 'fileA_copy'), ('get_n_largest', 'file', '', 3)]

Expected Output: [True, True, True, True, 4, ['fileB(5)', 'fileA(4)', 'fileA_copy(4)']]

Explanation: The copy succeeds because u1 still has enough remaining capacity. The largest matching files are ordered by size descending, then name ascending.

Input: [('add_user', 'u1', 10), ('add_file_by', 'u1', 'alpha', 4), ('add_file_by', 'u1', 'beta', 4), ('add_file_by', 'u1', 'gamma', 2), ('update_capacity', 'u1', 6), ('get_file_size', 'alpha'), ('get_n_largest', '', '', 5), ('add_file_by', 'u1', 'delta', 1)]

Expected Output: [True, True, True, True, True, None, ['beta(4)', 'gamma(2)'], False]

Explanation: After lowering capacity to 6, one size-4 file must be deleted. alpha and beta tie in size, so alpha is removed first because its name is lexicographically smaller.

Input: [('add_user', 'u1', 5), ('add_file_by', 'u1', 'doc', 4), ('compress_file', 'doc'), ('compress_file', 'doc.cmp'), ('get_file_size', 'doc'), ('get_file_size', 'doc.cmp'), ('decompress_file', 'doc.cmp'), ('add_file_by', 'u1', 'extra', 2)]

Expected Output: [True, True, True, False, None, 2, True, False]

Explanation: doc is replaced by doc.cmp of size 2. Compressing an already compressed file fails. Decompression restores doc to size 4, leaving no room for extra.

Input: [('add_user', 'u1', 8), ('add_file_by', 'u1', 'a', 4), ('compress_file', 'a'), ('add_file_by', 'u1', 'a', 5), ('decompress_file', 'a.cmp'), ('get_n_largest', '', '', 5)]

Expected Output: [True, True, True, True, False, ['a(5)', 'a.cmp(2)']]

Explanation: After compressing a into a.cmp, the original name becomes free and can be reused. Decompression then fails because the target name a already exists.

Input: [('add_user', 'u1', 3), ('add_file_by', 'u2', 'ghost', 1), ('add_file', 'sys', 7), ('copy_file', 'sys', 'sys_backup'), ('get_n_largest', 'sys', '', 5), ('update_capacity', 'nobody', 1)]

Expected Output: [True, False, True, True, ['sys(7)', 'sys_backup(7)'], False]

Explanation: Adding a file for a missing user fails. admin can add and copy files. With equal sizes, sys comes before sys_backup because of lexicographic order.

Input: []

Expected Output: []

Explanation: No queries means no results.

Solution

def solution(queries):
    import heapq

    SUFFIX = '.cmp'
    ADMIN_CAP = 10 ** 18

    users = {'admin': {'cap': ADMIN_CAP, 'used': 0}}
    files = {}
    heaps = {'admin': []}

    def add_owned(user_id, name, size):
        if user_id not in users or name in files:
            return False
        if users[user_id]['used'] + size > users[user_id]['cap']:
            return False
        files[name] = (size, user_id)
        users[user_id]['used'] += size
        heaps.setdefault(user_id, [])
        heapq.heappush(heaps[user_id], (-size, name))
        return True

    answers = []

    for query in queries:
        op = query[0]

        if op == 'add_user':
            _, user_id, capacity = query
            if user_id in users:
                answers.append(False)
            else:
                users[user_id] = {'cap': capacity, 'used': 0}
                heaps[user_id] = []
                answers.append(True)

        elif op == 'add_file_by':
            _, user_id, name, size = query
            answers.append(add_owned(user_id, name, size))

        elif op == 'add_file':
            _, name, size = query
            answers.append(add_owned('admin', name, size))

        elif op == 'copy_file':
            _, from_name, to_name = query
            if from_name not in files or to_name in files:
                answers.append(False)
            else:
                size, owner = files[from_name]
                answers.append(add_owned(owner, to_name, size))

        elif op == 'get_file_size':
            _, name = query
            answers.append(files[name][0] if name in files else None)

        elif op == 'get_n_largest':
            _, prefix, suffix, n = query
            if n <= 0:
                answers.append([])
                continue
            matched = []
            for name, (size, _) in files.items():
                if name.startswith(prefix) and name.endswith(suffix):
                    matched.append((name, size))
            matched.sort(key=lambda item: (-item[1], item[0]))
            answers.append([f"{name}({size})" for name, size in matched[:n]])

        elif op == 'update_capacity':
            _, user_id, new_capacity = query
            if user_id not in users:
                answers.append(False)
            else:
                users[user_id]['cap'] = new_capacity
                heap = heaps.setdefault(user_id, [])
                while users[user_id]['used'] > new_capacity:
                    while heap:
                        neg_size, name = heapq.heappop(heap)
                        size = -neg_size
                        info = files.get(name)
                        if info is not None and info[0] == size and info[1] == user_id:
                            del files[name]
                            users[user_id]['used'] -= size
                            break
                    else:
                        break
                answers.append(True)

        elif op == 'compress_file':
            _, name = query
            if name not in files or name.endswith(SUFFIX):
                answers.append(False)
            else:
                new_name = name + SUFFIX
                if new_name in files:
                    answers.append(False)
                else:
                    size, owner = files[name]
                    new_size = size // 2
                    del files[name]
                    users[owner]['used'] -= size
                    files[new_name] = (new_size, owner)
                    users[owner]['used'] += new_size
                    heaps.setdefault(owner, [])
                    heapq.heappush(heaps[owner], (-new_size, new_name))
                    answers.append(True)

        elif op == 'decompress_file':
            _, name = query
            if name not in files or not name.endswith(SUFFIX):
                answers.append(False)
            else:
                size, owner = files[name]
                new_name = name[:-len(SUFFIX)]
                new_size = size * 2
                if new_name in files or users[owner]['used'] - size + new_size > users[owner]['cap']:
                    answers.append(False)
                else:
                    del files[name]
                    users[owner]['used'] -= size
                    files[new_name] = (new_size, owner)
                    users[owner]['used'] += new_size
                    heaps.setdefault(owner, [])
                    heapq.heappush(heaps[owner], (-new_size, new_name))
                    answers.append(True)

        else:
            raise ValueError(f'Unknown operation: {op}')

    return answers

Time complexity: Amortized O(log F) for add/copy/compress/decompress operations, O(k log F) for update_capacity when k files are deleted, and O(F log F) for each get_n_largest query, where F is the number of live files. Space complexity: O(F + U), where F is the number of live files and U is the number of users.

Hints

Use hash maps for users and files so that existence checks, ownership lookups, and size lookups are O(1).
For update_capacity, keep a max-heap per user keyed by (-size, name). Because files can be deleted or renamed, use lazy deletion when popping from the heap.

Quick Overview

Extend cloud file system with copy and compression

Company: Meta

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Take-home Project

Constraints

0 <= len(queries) <= 10^5
User capacities and file sizes in add operations are integers in the range [0, 10^9], and file names are unique among live files
Use lexicographically smaller name first when update_capacity must delete among files with equal size