This question evaluates filesystem traversal, efficient I/O and memory use, hashing strategies for content comparison, duplicate-detection algorithms, and time/space complexity analysis.

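Detect duplicate files by content in a filesystem. Given access to a directory tree, return groups of file paths whose byte content is identical. Minimize disk I/O by first grouping files by size, then hashing only the files that share a size (e.g., a fast hash first, then a cryptographic hash for files whose fast hashes collide), and finally verifying the remaining candidates byte by byte. Stream very large files in fixed-size chunks rather than loading them into memory, tolerate permission errors without aborting the scan, and handle symbolic links deliberately (for example, by not following them, which avoids cycles). Provide working code and analyze time and space complexity.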
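
One possible shape for an answer is sketched below in Python, using only the standard library. It is a minimal sketch under stated assumptions, not a reference solution. The names `find_duplicates`, `_digest`, `_split`, `_verify`, `CHUNK`, and `PREFIX` are made up for illustration. The cheap first pass is a SHA-256 of only the first 4 KiB of each file, which stands in for the "fast hash" named in the prompt and keeps the sketch free of third-party packages. Symbolic links are skipped rather than followed, and unreadable files are silently dropped.

```python
import filecmp
import hashlib
import os
import stat
from collections import defaultdict

CHUNK = 1 << 20   # streaming read size, 1 MiB (arbitrary choice)
PREFIX = 4096     # bytes covered by the cheap first-pass hash (arbitrary choice)


def _digest(path, limit=None):
    """SHA-256 of a file, streamed in chunks; only the first `limit` bytes if given."""
    h = hashlib.sha256()
    remaining = limit
    with open(path, "rb") as f:
        while remaining is None or remaining > 0:
            block = f.read(CHUNK if remaining is None else min(CHUNK, remaining))
            if not block:
                break
            h.update(block)
            if remaining is not None:
                remaining -= len(block)
    return h.digest()


def _split(paths, key):
    """Partition paths by key(path); drop unreadable files and singleton buckets."""
    buckets = defaultdict(list)
    for p in paths:
        try:
            buckets[key(p)].append(p)
        except OSError:
            continue  # permission denied, file vanished mid-scan, etc.
    return [b for b in buckets.values() if len(b) > 1]


def _verify(paths):
    """Byte-by-byte confirmation of a bucket whose hashes already match."""
    groups = []
    for p in paths:
        for g in groups:
            try:
                if filecmp.cmp(g[0], p, shallow=False):
                    g.append(p)
                    break
            except OSError:
                break  # unreadable now: leave p out of every group
        else:
            groups.append([p])
    return [g for g in groups if len(g) > 1]


def find_duplicates(root):
    """Return lists of paths under `root` whose contents are byte-identical."""
    by_size = defaultdict(list)
    # os.walk ignores unreadable directories by default and, with
    # followlinks=False, never descends into symlinked directories.
    for dirpath, _dirs, files in os.walk(root, followlinks=False):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not resolve symlinks
            except OSError:
                continue
            if stat.S_ISREG(st.st_mode):  # skips symlinks, FIFOs, devices
                by_size[st.st_size].append(path)

    result = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # a unique size cannot have a duplicate: no I/O spent
        for bucket in _split(same_size, lambda p: _digest(p, PREFIX)):
            for candidates in _split(bucket, _digest):
                result.extend(_verify(candidates))
    return result


if __name__ == "__main__":
    for group in find_duplicates("."):
        print(group)
```

For n regular files, the walk costs O(n) `lstat` calls and O(n) memory for the path lists. File contents are read only for files that share a size with at least one other file. In the worst case (many same-size files with identical prefixes), each such file is read in full once for hashing and again during verification, so I/O is linear in the total bytes of those candidates. Read buffers stay bounded by `CHUNK` regardless of file size. This sketch treats hard links to the same inode as duplicates of each other. Deduplicating on `(st_dev, st_ino)` before hashing would be one way to avoid that if it is not wanted.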