Design an in-memory file system with limits
Company: Harvey AI
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Technical Screen
Quick Answer: This question evaluates the ability to design and implement hierarchical in-memory data structures, manage capacity constraints and OS-style duplicate naming, and reason about edge cases and time/space complexity.
Constraints
- 1 <= len(operations) <= 10^4
- 1 <= len(path) <= 200 for every operation path
- Each directory can store at most 5 direct entries
- Path segments are non-empty, case-sensitive strings that cannot contain `/`, and segments `.` and `..` are invalid
Examples
Input: [('get', '/')]
Expected Output: [[]]
Explanation: The filesystem starts empty, so the root directory has no children.
Input: [('addFile', '/path/to/somewhere/file.txt'), ('get', '/'), ('get', '/path'), ('get', '/path/to'), ('get', '/path/to/somewhere')]
Expected Output: ['/path/to/somewhere/file.txt', ['path'], ['to'], ['somewhere'], ['file.txt']]
Explanation: Intermediate directories are created automatically. Each `get` returns the immediate children of that directory.
Input: [('addFile', '/docs/file.txt'), ('addFile', '/docs/file.txt'), ('addFile', '/docs/file(1).txt'), ('get', '/docs')]
Expected Output: ['/docs/file.txt', '/docs/file(1).txt', '/docs/file(1)(1).txt', ['file(1)(1).txt', 'file(1).txt', 'file.txt']]
Explanation: The second `file.txt` is renamed to `file(1).txt`. Duplicating `file(1).txt` renames that exact base to `file(1)(1).txt`.
Input: [('addFile', '/bucket/a.txt'), ('addFile', '/bucket/b.txt'), ('addFile', '/bucket/c.txt'), ('addFile', '/bucket/d.txt'), ('addFile', '/bucket/e.txt'), ('addFile', '/bucket/a.txt'), ('get', '/bucket')]
Expected Output: ['/bucket/a.txt', '/bucket/b.txt', '/bucket/c.txt', '/bucket/d.txt', '/bucket/e.txt', None, ['a.txt', 'b.txt', 'c.txt', 'd.txt', 'e.txt']]
Explanation: Directory `/bucket` already has 5 entries, so adding another file there is rejected. Its contents remain unchanged.
Input: [('addFile', '/d1/f.txt'), ('addFile', '/d2/f.txt'), ('addFile', '/d3/f.txt'), ('addFile', '/d4/f.txt'), ('addFile', '/d5/f.txt'), ('addFile', '/d6/sub/f.txt'), ('get', '/')]
Expected Output: ['/d1/f.txt', '/d2/f.txt', '/d3/f.txt', '/d4/f.txt', '/d5/f.txt', None, ['d1', 'd2', 'd3', 'd4', 'd5']]
Explanation: The root directory reaches its 5-entry limit after `d1` through `d5`. The attempt to create `/d6/sub/f.txt` is rejected atomically, so `/d6` does not appear.
Input: [('addFile', '/x.txt'), ('addFile', '/x.txt/y.txt'), ('addFile', '/case/File.txt'), ('addFile', '/case/file.txt'), ('get', '/case'), ('get', '/x.txt'), ('addFile', 'relative/path.txt'), ('addFile', '/bad//name.txt')]
Expected Output: ['/x.txt', None, '/case/File.txt', '/case/file.txt', ['File.txt', 'file.txt'], None, None, None]
Explanation: A file cannot be used as an intermediate directory. Names are case-sensitive, so `File.txt` and `file.txt` are different. Invalid paths are rejected.
Solution
def solution(operations):
class Node:
__slots__ = ('is_file', 'children', 'next_suffix')
def __init__(self, is_file=False):
self.is_file = is_file
if is_file:
self.children = None
self.next_suffix = None
else:
self.children = {}
self.next_suffix = {}
root = Node(False)
def parse_path(path, allow_root):
if not isinstance(path, str) or not path.startswith('/'):
return None
if path == '/':
return [] if allow_root else None
if path.endswith('/'):
return None
parts = path.split('/')[1:]
if not parts:
return None
for part in parts:
if part == '' or part == '.' or part == '..':
return None
return parts
def split_name(name):
idx = name.rfind('.')
if idx > 0:
return name[:idx], name[idx:]
return name, ''
def rollback(created):
for parent, name in reversed(created):
if name in parent.children:
del parent.children[name]
def make_unique(dir_node, name):
base, ext = split_name(name)
key = (base, ext)
suffix = dir_node.next_suffix.get(key, 1)
while True:
candidate = f'{base}({suffix}){ext}'
if candidate not in dir_node.children:
dir_node.next_suffix[key] = suffix + 1
return candidate
suffix += 1
def add_file(path):
parts = parse_path(path, False)
if parts is None:
return None
node = root
created = []
for segment in parts[:-1]:
child = node.children.get(segment)
if child is None:
if len(node.children) >= 5:
rollback(created)
return None
child = Node(False)
node.children[segment] = child
created.append((node, segment))
elif child.is_file:
rollback(created)
return None
node = child
leaf = parts[-1]
existing = node.children.get(leaf)
if existing is None:
if len(node.children) >= 5:
rollback(created)
return None
final_name = leaf
else:
if not existing.is_file:
rollback(created)
return None
if len(node.children) >= 5:
rollback(created)
return None
final_name = make_unique(node, leaf)
node.children[final_name] = Node(True)
return '/' + '/'.join(parts[:-1] + [final_name])
def get_children(path):
parts = parse_path(path, True)
if parts is None:
return None
node = root
for segment in parts:
child = node.children.get(segment)
if child is None or child.is_file:
return None
node = child
return sorted(node.children.keys())
result = []
for op in operations:
if not isinstance(op, (list, tuple)) or len(op) != 2:
result.append(None)
continue
name, path = op
if name == 'addFile':
result.append(add_file(path))
elif name == 'get':
result.append(get_children(path))
else:
result.append(None)
return result
Time complexity: O(T), where T is the total number of path components processed across all operations. Because each directory holds at most 5 children, duplicate-name probing and `get` sorting are O(1) per directory.. Space complexity: O(N), where N is the total number of directories and files successfully created..
Hints
- A tree/trie is a natural fit: each directory node can store a hash map of `name -> child node`.
- Handle duplicate file names only in the parent directory of the leaf. Split the leaf into `base + extension`, then try suffixes `(1)`, `(2)`, ... until you find a free name.