In-Memory File System: Create, Track Duplicates, and List
Company: Harvey
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Onsite
Quick Answer: This question evaluates a software engineer's ability to design and implement an in-memory file system, testing data structure selection, path parsing, and collision-resolution logic. It assesses practical object-oriented design skills and edge-case handling, making it a common coding interview problem for roles requiring systems-level thinking.
Constraints
- Total number of operations across a session: up to 10^5.
- Path length: up to 300 characters.
- Names consist of letters, digits, spaces, '.', '(', and ')'.
- Paths are absolute and '/'-separated; root '/' always exists.
- An extension is the last '.' and everything after it; a name with no '.' has an empty extension.
- De-duplication applies only to create_file, never to create_folder.
Examples
Input: ([('create_folder', '/docs'), ('create_file', '/docs/a.txt'), ('create_file', '/docs/a.txt'), ('create_file', '/docs/a.txt'), ('create_folder', '/docs/imgs'), ('list_folder', '/docs'), ('list_folder', '/')],)
Expected Output: ['/docs', '/docs/a.txt', '/docs/a (1).txt', '/docs/a (2).txt', '/docs/imgs', ['a (1).txt', 'a (2).txt', 'a.txt', 'imgs'], ['docs']]
Explanation: The canonical session: repeated create_file on a.txt yields a (1).txt then a (2).txt; list_folder('/docs') returns the four children sorted lexicographically; list_folder('/') returns just 'docs'.
Input: ([('create_folder', '/x'), ('create_file', '/x/notes'), ('create_file', '/x/notes'), ('list_folder', '/x')],)
Expected Output: ['/x', '/x/notes', '/x/notes (1)', ['notes', 'notes (1)']]
Explanation: A name with no extension: 'notes' de-dups to 'notes (1)' with the suffix appended to the end (empty extension).
Input: ([('create_file', '/missing/a.txt'), ('create_folder', '/a/b'), ('list_folder', '/nope')],)
Expected Output: ['ERROR', 'ERROR', 'ERROR']
Explanation: Each operation targets a parent folder that does not exist, so each yields ERROR.
Input: ([('create_file', '/f.txt'), ('create_folder', '/f.txt/sub'), ('create_file', '/f.txt/g.txt'), ('list_folder', '/f.txt')],)
Expected Output: ['/f.txt', 'ERROR', 'ERROR', 'ERROR']
Explanation: After creating the file /f.txt, treating it as a parent folder (create under it, or list it) is invalid because it is a file, not a folder.
Input: ([('create_folder', '/d'), ('create_folder', '/d'), ('create_file', '/d'), ('list_folder', '/d')],)
Expected Output: ['/d', 'ERROR', '/d (1)', []]
Explanation: create_folder on an existing name errors; create_file with the same base name 'd' does NOT error — it de-dups to '/d (1)' (a sibling file), leaving folder /d empty.
Input: ([('create_file', '/a (1).txt'), ('create_file', '/a.txt'), ('create_file', '/a.txt'), ('list_folder', '/')],)
Expected Output: ['/a (1).txt', '/a.txt', '/a (2).txt', ['a (1).txt', 'a (2).txt', 'a.txt']]
Explanation: 'a (1).txt' is created verbatim (no collision). Later, the second 'a.txt' must skip the already-taken 'a (1).txt' and become 'a (2).txt'.
Input: ([('list_folder', '/')],)
Expected Output: [[]]
Explanation: Edge case: listing the empty root returns an empty list.
Input: ([('create_folder', '/a'), ('create_folder', '/a/b'), ('create_file', '/a/b/c.tar.gz'), ('create_file', '/a/b/c.tar.gz'), ('list_folder', '/a/b')],)
Expected Output: ['/a', '/a/b', '/a/b/c.tar.gz', '/a/b/c.tar (1).gz', ['c.tar (1).gz', 'c.tar.gz']]
Explanation: Multi-dot name: the extension is the LAST '.' onward ('.gz'), so the base is 'c.tar' and the de-dup suffix lands before '.gz' → 'c.tar (1).gz'.
Hints
- Model the file system as a tree (trie) of nodes keyed by child name; each node knows whether it is a directory and holds a map of children. The root '/' is pre-created.
- Split each path into its parent components and the final name. Resolve the parent by walking from root; if any step is missing or hits a file, the operation is an error.
- For de-duplication, split the name into base (before the LAST '.') and extension (the '.' onward). Try 'base (1)ext', 'base (2)ext', ... until you find a name not already present in the parent's children.
- create_folder errors if the name already exists as ANY child (file or folder). create_file never errors on a name clash — it renames instead. Sort children lexicographically for list_folder.