Implement a file deduplication tool.
You are given a root directory containing many files. Return groups of duplicate files. Two files are duplicates if they have identical content.
Your solution should use file size and content hashing to avoid unnecessary full-file comparisons.
Requirements:
- Recursively scan the directory.
- Group candidate files by file size.
- For files with the same size, compute content hashes to identify duplicates.
- Return only groups containing at least two duplicate files.
- Handle large files without loading entire files into memory at once.
- Handle errors such as permission failures or files changing during the scan (a minimal sketch covering these points follows this list).
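One possible shape for a solution, sketched in Python (names such as `hash_file` and `find_duplicates` are illustrative, not part of the problem statement): walk the tree, bucket paths by size, and hash only the size collisions with chunked SHA-256 so memory use stays bounded; unreadable or vanished files are skipped with a warning rather than aborting the scan.

```python
import hashlib
import os
import sys
from collections import defaultdict

CHUNK_SIZE = 1024 * 1024  # read files in 1 MiB chunks to bound memory use


def hash_file(path, chunk_size=CHUNK_SIZE):
    """Stream a file through SHA-256 so large files never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return lists of paths with identical content, grouping by size first."""
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError as exc:  # permission denied, broken symlink, file removed
                print(f"skipping {path}: {exc}", file=sys.stderr)
                continue
            by_size[size].append(path)

    duplicates = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have duplicates; skip hashing entirely
        by_hash = defaultdict(list)
        for path in paths:
            try:
                by_hash[hash_file(path)].append(path)
            except OSError as exc:  # file changed or became unreadable mid-scan
                print(f"skipping {path}: {exc}", file=sys.stderr)
        duplicates.extend(group for group in by_hash.values() if len(group) > 1)
    return duplicates


if __name__ == "__main__":
    for group in find_duplicates(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(group)
```

Grouping by size first means most files are never read at all; only files whose size collides with another file's pay the cost of a full hash.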
Follow-up discussion:
- Is this workload I/O-bound or CPU-bound? How would you determine that?
- How would you process very large files?
- How would you scale to millions or billions of files?
- How would you parallelize the scan and hashing stages? (See the thread-pool sketch after this list.)
- How would you support near-realtime duplicate detection as files are created or modified? (See the watcher sketch after this list.)
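For the parallelization question, one hedged sketch, reusing `hash_file` from the sketch above: because the work is usually dominated by disk reads and CPython's `hashlib` releases the GIL while digesting large buffers, a thread pool is a reasonable default; swap in a process pool if profiling shows hashing is CPU-bound on the target hardware.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def _safe_hash(path):
    """Hash one file, returning None instead of raising if it is unreadable."""
    try:
        return hash_file(path)  # hash_file is defined in the sketch above
    except OSError:
        return None


def hash_group_parallel(paths, max_workers=8):
    """Hash one size-group of candidate files concurrently and bucket by digest."""
    by_hash = defaultdict(list)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for path, digest in zip(paths, pool.map(_safe_hash, paths)):
            if digest is not None:
                by_hash[digest].append(path)
    return [group for group in by_hash.values() if len(group) > 1]
```

Measuring CPU utilization while the tool runs (or comparing wall time against raw sequential-read throughput of the disk) is one way to decide which pool type actually helps.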
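For near-realtime detection, one possible approach, assuming the third-party `watchdog` package (`pip install watchdog`) and again reusing `hash_file`: subscribe to filesystem events, re-hash created or modified files, and check the digest against an in-memory index. A production system would persist the index and debounce rapid successive writes; this sketch only shows the event-driven shape.

```python
import time

from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer


class DedupHandler(FileSystemEventHandler):
    """Re-hash files as they appear or change and look up the digest in an index."""

    def __init__(self, index):
        self.index = index  # digest -> set of paths seen so far

    def on_created(self, event):
        self._update(event)

    def on_modified(self, event):
        self._update(event)

    def _update(self, event):
        if event.is_directory:
            return
        try:
            digest = hash_file(event.src_path)  # from the sketch above
        except OSError:
            return  # file vanished or is unreadable; ignore this event
        peers = self.index.setdefault(digest, set())
        if peers:
            print(f"duplicate detected: {event.src_path} matches {sorted(peers)}")
        peers.add(event.src_path)


def watch(root):
    observer = Observer()
    observer.schedule(DedupHandler({}), root, recursive=True)
    observer.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
    observer.join()
```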