Design a constrained in-memory file system
Company: Harvey AI
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Technical Screen
Quick Answer: This question evaluates design and implementation skills for hierarchical in-memory data structures, string parsing and file-name collision handling under constraints, and the ability to reason about time and space complexity.
Constraints
- 1 <= len(operations) <= 10^4
- 1 <= len(path) <= 200
- Each operation is either ('addFile', path) or ('listEntries', path)
- Each directory may contain at most 5 immediate children
- Paths are case-sensitive and must be absolute
- Repeated slashes, trailing slashes (except '/'), and '.' / '..' segments are invalid
Examples
Input: [('addFile', '/path/to/somewhere/file.txt'), ('listEntries', '/path/to/somewhere'), ('listEntries', '/path/to'), ('listEntries', '/')]
Expected Output: [['file.txt'], ['somewhere'], ['path']]
Explanation: Intermediate directories are created automatically. The three list queries inspect the deepest directory, its parent, and the root.
Input: [('addFile', '/d/file.txt'), ('addFile', '/d/file.txt'), ('addFile', '/d/file(1).txt'), ('listEntries', '/d')]
Expected Output: [['file(1)(1).txt', 'file(1).txt', 'file.txt']]
Explanation: The second file becomes 'file(1).txt'. Adding 'file(1).txt' again becomes 'file(1)(1).txt'. The listing is sorted lexicographically.
Input: [('addFile', '/a/x'), ('addFile', '/b/x'), ('addFile', '/c/x'), ('addFile', '/d/x'), ('addFile', '/e/x'), ('listEntries', '/'), ('addFile', '/z/x'), ('listEntries', '/'), ('listEntries', '/z')]
Expected Output: [['a', 'b', 'c', 'd', 'e'], ['a', 'b', 'c', 'd', 'e'], []]
Explanation: After creating five top-level directories, the root is full. Adding '/z/x' is rejected atomically, so '/z' does not appear.
Input: [('listEntries', '/'), ('addFile', '/'), ('addFile', 'relative.txt'), ('addFile', '/bad//path/file'), ('addFile', '/trail/'), ('addFile', '/ok/file'), ('addFile', '/ok/file/nested'), ('listEntries', '/ok'), ('listEntries', '/ok/file'), ('listEntries', '/ok/')]
Expected Output: [[], ['file'], [], []]
Explanation: Invalid addFile paths do nothing. '/ok/file/nested' is rejected because 'file' is already a file, not a directory. Listing a file path or an invalid path returns [].
Input: [('addFile', '/cfg/.env'), ('addFile', '/cfg/.env'), ('addFile', '/cfg/archive.tar.gz'), ('addFile', '/cfg/archive.tar.gz'), ('listEntries', '/cfg')]
Expected Output: [['.env', '.env(1)', 'archive.tar(1).gz', 'archive.tar.gz']]
Explanation: '.env' is treated as having no extension, so it becomes '.env(1)'. For 'archive.tar.gz', only the last dot starts the extension, so the duplicate becomes 'archive.tar(1).gz'.
Solution
def solution(operations):
class Node:
__slots__ = ('is_file', 'children')
def __init__(self, is_file=False):
self.is_file = is_file
self.children = {} if not is_file else None
root = Node(False)
def parse_path(path, allow_root):
if not isinstance(path, str) or not path.startswith('/'):
return None
if path == '/':
return [] if allow_root else None
if path.endswith('/'):
return None
parts = path.split('/')[1:]
for part in parts:
if part == '' or part == '.' or part == '..':
return None
return parts
def split_name(name):
dot = name.rfind('.')
if dot > 0:
return name[:dot], name[dot:]
return name, ''
def make_unique_name(parent, name):
if name not in parent.children:
return name
base, ext = split_name(name)
k = 1
while True:
candidate = f'{base}({k}){ext}'
if candidate not in parent.children:
return candidate
k += 1
def add_file(path):
parts = parse_path(path, allow_root=False)
if parts is None:
return
node = root
i = 0
while i < len(parts) - 1:
seg = parts[i]
child = node.children.get(seg)
if child is None:
break
if child.is_file:
return
node = child
i += 1
if i < len(parts) - 1:
child_count = len(node.children)
for _ in parts[i:-1]:
if child_count >= 5:
return
child_count = 0
final_name = parts[-1]
else:
if len(node.children) >= 5:
return
final_name = make_unique_name(node, parts[-1])
node = root
for seg in parts[:-1]:
child = node.children.get(seg)
if child is None:
child = Node(False)
node.children[seg] = child
node = child
node.children[final_name] = Node(True)
def list_entries(path):
parts = parse_path(path, allow_root=True)
if parts is None:
return []
node = root
for seg in parts:
if node.is_file:
return []
child = node.children.get(seg)
if child is None:
return []
node = child
if node.is_file:
return []
return sorted(node.children.keys())
answers = []
for op in operations:
if not isinstance(op, (list, tuple)) or len(op) != 2:
continue
action, path = op
if action == 'addFile':
add_file(path)
elif action == 'listEntries':
answers.append(list_entries(path))
return answersTime complexity: Per operation: addFile = O(s + r), listEntries = O(s + c log c), where s is the number of path segments, c <= 5 is the number of children in a directory, and r <= 5 is the number of rename attempts. Therefore each operation is effectively O(s), and the whole function is O(total path length across all operations).. Space complexity: O(N), where N is the total number of created directories and files..
Hints
- A tree (trie-like) structure works well: each directory node maps child names to child nodes, and each node stores whether it is a file or a directory.
- To keep addFile atomic, first validate the path and check all capacity constraints before mutating the structure. For renaming, split on the last dot only if that dot is not the first character.