This question evaluates a candidate's understanding of concurrent programming, thread safety, graph traversal, and network I/O within the Coding & Algorithms domain.
Implement a web crawler that, given a starting URL and an interface get_links(url) -> Iterable[str], discovers all pages reachable under the same hostname as the starting URL. Requirements: visit each URL at most once, avoid cycles, and support a fixed-size worker pool for concurrent fetching. Return the set of discovered URLs. Discuss the data structures you would use, how you ensure thread safety, and how you would test it. A sketch of one acceptable answer follows.
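A minimal sketch of one acceptable answer, in Python, assuming a thread-based worker pool: a `queue.Queue` as the shared frontier, a lock-protected `visited` set for the at-most-once guarantee, and `urllib.parse` for the hostname check. The `num_workers` parameter, the exception handling, and the treatment of relative links are all choices the prompt leaves open.

```python
import threading
from queue import Queue
from typing import Callable, Iterable, Set
from urllib.parse import urlparse


def crawl(start_url: str,
          get_links: Callable[[str], Iterable[str]],
          num_workers: int = 8) -> Set[str]:
    """Return every URL reachable from start_url on the same hostname."""
    hostname = urlparse(start_url).hostname

    visited: Set[str] = {start_url}  # guarded by `lock`
    lock = threading.Lock()
    frontier = Queue()               # work queue shared by all workers
    frontier.put(start_url)

    def worker() -> None:
        while True:
            url = frontier.get()
            try:
                for link in get_links(url):
                    # Relative-URL resolution is out of scope here; links
                    # on another hostname are simply skipped.
                    if urlparse(link).hostname != hostname:
                        continue
                    with lock:
                        # Atomic check-and-insert: each URL is enqueued,
                        # and therefore fetched, at most once, which also
                        # breaks cycles in the link graph.
                        if link in visited:
                            continue
                        visited.add(link)
                    frontier.put(link)
            except Exception:
                pass  # a real crawler would log and perhaps retry here
            finally:
                frontier.task_done()

    for _ in range(num_workers):
        threading.Thread(target=worker, daemon=True).start()

    frontier.join()  # blocks until every enqueued URL has been processed
    return visited
```

For testing, a deterministic in-memory link graph substitutes for the network; the graph below and its URLs are made up for illustration:

```python
graph = {
    "http://example.com/": ["http://example.com/a", "http://other.com/x"],
    "http://example.com/a": ["http://example.com/"],  # cycle back to the root
}
assert crawl("http://example.com/", lambda u: graph.get(u, [])) == {
    "http://example.com/",
    "http://example.com/a",
}
```

Design points worth probing in discussion: the lock covers only the check-and-insert, keeping the critical section small; `Queue.task_done()`/`Queue.join()` gives clean termination without sentinel values; and an equivalent solution built on `concurrent.futures.ThreadPoolExecutor` is equally valid.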