Return all files under a path
Company: Dropbox
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Technical Screen
Quick Answer: This question evaluates filesystem traversal and recursive/iterative algorithm skills, including handling different path types, edge cases such as missing or inaccessible paths, and robustness concerns like symlink cycles when enumerating files.
Constraints
- fs maps each directory path to a pair (subdirs, files) of immediate children.
- Paths are unique strings; the same path is not both a file and a directory.
- The filesystem may be malformed with cycles (a directory listing itself or an ancestor); the traversal must terminate.
- rootPath may be a directory, a file, or absent from the filesystem.
- Return the file paths sorted (any order is acceptable conceptually; sorting makes the result deterministic).
Examples
Input: ({'/': (['/a', '/b'], ['/root.txt']), '/a': ([], ['/a/x.txt', '/a/y.txt']), '/b': (['/b/c'], ['/b/z.txt']), '/b/c': ([], ['/b/c/deep.txt'])}, '/')
Expected Output: ['/a/x.txt', '/a/y.txt', '/b/c/deep.txt', '/b/z.txt', '/root.txt']
Explanation: Full recursive crawl from root: collects /root.txt, the two files under /a, /b/z.txt, and the deeply nested /b/c/deep.txt — 5 files total, returned sorted.
Input: ({'/': (['/a'], ['/r.txt']), '/a': ([], ['/a/1.txt'])}, '/a')
Expected Output: ['/a/1.txt']
Explanation: rootPath is a subdirectory, not the top. Only files under /a are returned; /r.txt (a sibling, above /a) is excluded.
Input: ({'/': (['/a'], ['/file.txt']), '/a': ([], [])}, '/file.txt')
Expected Output: ['/file.txt']
Explanation: rootPath is a file (it appears in /'s files list but is not itself a directory key), so the result is just [rootPath].
Input: ({'/empty': ([], [])}, '/empty')
Expected Output: []
Explanation: rootPath is a directory with no subdirs and no files, so no file paths are collected.
Input: ({'/': (['/a'], []), '/a': ([], ['/a/only.txt'])}, '/missing')
Expected Output: []
Explanation: rootPath '/missing' does not exist anywhere in the filesystem (not a directory key, not referenced), so the function returns [] (not accessible).
Input: ({'/': (['/loop'], ['/top.txt']), '/loop': (['/loop'], ['/loop/inner.txt'])}, '/')
Expected Output: ['/loop/inner.txt', '/top.txt']
Explanation: The directory /loop lists itself as a child (a symlink-style cycle). The visited set prevents re-expanding it, so traversal terminates and each file is collected exactly once.
Hints
- Treat the filesystem as a graph: do an iterative DFS/BFS starting at rootPath, expanding each directory's subdirs and collecting its files.
- Distinguish the three root cases first: not in the filesystem at all -> []; present but not a directory key -> [rootPath] (it's a file); otherwise traverse.
- Keep a `visited` set of directories you have already expanded so a symlink cycle (a directory that lists itself or an ancestor) cannot loop forever.