PracHub

Implement crawler, dedup, and persistent LRU

Last updated: Mar 29, 2026

Quick Overview

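This interview question combines three implementation tasks: a same-hostname web crawler (LeetCode 1236), content-based duplicate-file detection (LeetCode 609), and an LRU cache decorator that handles variable-length positional and keyword arguments and supports persistence (an extension of LeetCode 146). It evaluates algorithm design, data structures, correctness, complexity, and edge-case handling. A strong answer states its assumptions, handles edge cases, explains trade-offs, and shows how to validate the result.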


Company: Anthropic

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: Medium

Interview Round: Onsite

##### Question

LeetCode 1236. Web Crawler: Crawl web pages starting from a given URL within the same hostname.

LeetCode 609. Find Duplicate File in System: Identify duplicate files in a filesystem based on content.

LeetCode 146. LRU Cache (extended): Implement an LRU cache decorator that correctly handles variable-length positional and keyword arguments, and add persistence (serialization/deserialization) support.

https://leetcode.com/problems/web-crawler/description/
https://leetcode.com/problems/find-duplicate-file-in-system/description/
https://leetcode.com/problems/lru-cache/description/


Solution

# Solution Alignment

The prompt asks for an implementation-level answer. The safest way to present it is to define the state, maintain clear invariants, then walk through complexity and tests.

## Problem Restatement

  • LeetCode 1236. Web Crawler: crawl web pages starting from a given URL, staying within the same hostname.
  • LeetCode 609. Find Duplicate File in System: identify duplicate files in a filesystem based on content.
  • LeetCode 146. LRU Cache (extended): implement an LRU cache decorator that correctly handles variable-length positional and keyword arguments, and add persistence (serialization/deserialization) support.

## Recommended Approach

This approach covers the LRU cache component. Use a hash map from key to doubly linked-list node, plus a doubly linked list ordered by recency. `get` moves the node to the front. `put` updates and moves an existing node, or inserts a new node at the front and evicts the tail when capacity is exceeded.

## Correctness

After every `get` and `put`, the hash map and the linked list hold exactly the same keys, the list is ordered from most to least recently used, and the number of entries never exceeds the capacity. Because eviction always removes the tail, the evicted entry is always the least recently used one.

## Complexity

`get` and `put` are O(1) average time. Space is O(capacity).

## Edge Cases and Tests

Capacity 0 or 1, updating an existing key, eviction order after `get`, repeated `put` calls, and `get` on a missing key.
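The hash map plus doubly linked list design can be sketched as follows. This is a minimal illustration, not a reference implementation: the names `LRUCache`, `_Node`, `_unlink`, and `_push_front` are chosen here for illustration, the two sentinel nodes are a common convenience rather than something the prompt requires, and returning `-1` for a missing key follows the LeetCode 146 convention.

```python
class _Node:
    """Doubly linked-list node holding one cache entry."""
    __slots__ = ("key", "value", "prev", "next")

    def __init__(self, key=None, value=None):
        self.key = key
        self.value = value
        self.prev = None
        self.next = None


class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.nodes = {}            # key -> node
        self.head = _Node()        # sentinel on the most-recent side
        self.tail = _Node()        # sentinel on the least-recent side
        self.head.next = self.tail
        self.tail.prev = self.head

    def _unlink(self, node):
        node.prev.next = node.next
        node.next.prev = node.prev

    def _push_front(self, node):
        node.prev = self.head
        node.next = self.head.next
        self.head.next.prev = node
        self.head.next = node

    def get(self, key):
        node = self.nodes.get(key)
        if node is None:
            return -1              # LeetCode 146 convention for a miss
        self._unlink(node)
        self._push_front(node)     # mark as most recently used
        return node.value

    def put(self, key, value):
        if self.capacity <= 0:
            return                 # a zero-capacity cache stores nothing
        node = self.nodes.get(key)
        if node is not None:
            node.value = value
            self._unlink(node)
            self._push_front(node)
            return
        node = _Node(key, value)
        self.nodes[key] = node
        self._push_front(node)
        if len(self.nodes) > self.capacity:
            lru = self.tail.prev   # least recently used entry
            self._unlink(lru)
            del self.nodes[lru.key]
```

The sentinels remove the empty-list and single-node special cases from `_unlink` and `_push_front`, which is where off-by-one pointer bugs usually appear.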

Related Interview Questions

  • Implement a Banking System - Anthropic (medium)
  • Implement Persistent Memoization LRU Cache - Anthropic (hard)
  • Fix a Corrupted Bootloader Instruction - Anthropic (medium)
  • Implement a Time-Aware Task Manager - Anthropic (hard)
  • Implement Task Management and Duplicate Detection - Anthropic (medium)
Aug 4, 2025, 10:55 AM

LeetCode 1236. Web Crawler: Crawl web pages starting from a given URL within the same hostname.
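One way to approach the crawler is a breadth-first traversal with a visited set. The sketch below makes some assumptions: LeetCode 1236 supplies an `HtmlParser` object with a `getUrls(url)` method, and here a plain callable named `get_urls` (a hypothetical stand-in) plays that role so the example runs on its own. Hostnames are compared using `urllib.parse.urlsplit`.

```python
from collections import deque
from urllib.parse import urlsplit


def crawl(start_url, get_urls):
    """Return every URL reachable from start_url on the same hostname.

    get_urls is a hypothetical callable: url -> iterable of linked URLs.
    """
    host = urlsplit(start_url).hostname
    seen = {start_url}             # mark on enqueue so each URL is queued once
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        for nxt in get_urls(url):
            if nxt not in seen and urlsplit(nxt).hostname == host:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(seen)


# Usage with an in-memory link graph in place of real HTTP fetches.
graph = {
    "http://a.com": ["http://a.com/x", "http://b.com"],
    "http://a.com/x": ["http://a.com", "http://a.com/y"],
}
pages = crawl("http://a.com", lambda u: graph.get(u, []))
```

Adding a URL to `seen` when it is enqueued, rather than when it is dequeued, keeps cycles such as `a.com -> a.com/x -> a.com` from queueing the same page twice.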

LeetCode 609. Find Duplicate File in System: Identify duplicate files in a filesystem based on content.
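A minimal sketch for the duplicate-file task, assuming the LeetCode 609 input format, where each entry looks like `"root/a 1.txt(abcd) 2.txt(efgh)"` (a directory followed by `name(content)` items). The function name `find_duplicates` is chosen here for illustration. The idea is to group full paths by content and keep only groups with more than one path.

```python
from collections import defaultdict


def find_duplicates(paths):
    """Group file paths by content and return groups with 2+ files."""
    by_content = defaultdict(list)
    for entry in paths:
        directory, *files = entry.split(" ")
        for item in files:
            # item looks like "1.txt(abcd)": split at the first "(".
            name, _, rest = item.partition("(")
            content = rest[:-1]    # drop the trailing ")"
            by_content[content].append(f"{directory}/{name}")
    return [group for group in by_content.values() if len(group) > 1]


groups = find_duplicates([
    "root/a 1.txt(abcd) 2.txt(efgh)",
    "root/c 3.txt(abcd)",
    "root/c/d 4.txt(efgh)",
    "root 4.txt(efgh)",
])
```

Time and space are both linear in the total length of the input strings, since every character is read once and every path is stored once.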

LeetCode 146. LRU Cache (extended): Implement an LRU cache decorator that correctly handles variable-length positional and keyword arguments, and add persistence (serialization/deserialization) support.
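The decorator variant can be sketched with `collections.OrderedDict` in place of a hand-written linked list. Everything named here is an assumption for illustration: `persistent_lru_cache`, the `save`/`load` helpers attached to the wrapper, and the choice of `pickle` as the serialization format. The sketch also assumes all arguments are hashable and all cached values are picklable.

```python
import functools
import pickle
from collections import OrderedDict


def persistent_lru_cache(maxsize=128):
    """Hypothetical LRU decorator with save/load helpers (illustrative)."""
    def decorator(func):
        cache = OrderedDict()      # oldest entry first, newest last

        def make_key(args, kwargs):
            # Sort kwargs so f(a=1, b=2) and f(b=2, a=1) share one entry.
            return (args, tuple(sorted(kwargs.items())))

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            key = make_key(args, kwargs)
            if key in cache:
                cache.move_to_end(key)       # mark as most recently used
                return cache[key]
            result = func(*args, **kwargs)
            cache[key] = result
            if len(cache) > maxsize:
                cache.popitem(last=False)    # evict least recently used
            return result

        def save(path):
            # Persist entries in recency order so load() can restore it.
            with open(path, "wb") as fh:
                pickle.dump(list(cache.items()), fh)

        def load(path):
            with open(path, "rb") as fh:
                items = pickle.load(fh)
            cache.clear()
            # Keep only the newest maxsize entries if the file holds more.
            for key, value in items[max(0, len(items) - maxsize):]:
                cache[key] = value

        wrapper.save = save
        wrapper.load = load
        wrapper.cache = cache
        return wrapper
    return decorator
```

Two design points are worth stating in an interview. First, this key treats `f(1, 2)` and `f(1, b=2)` as different calls; unifying them would require binding arguments to the function signature, for example with `inspect.signature`. Second, `pickle` should only be used to load files from a trusted source, so a format such as JSON may be preferable when keys and values allow it.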

https://leetcode.com/problems/web-crawler/description/
https://leetcode.com/problems/find-duplicate-file-in-system/description/
https://leetcode.com/problems/lru-cache/description/

Constraints & Assumptions

  • Preserve the scope, facts, inputs, and requested outputs from the prompt above.
  • If the prompt leaves a detail unspecified, state a reasonable assumption before relying on it.
  • Keep the answer interview-ready: concise enough to present, but concrete enough to implement or evaluate.

Clarifying Questions to Ask

  • Clarify input sizes, value ranges, mutability, return format, and tie-breaking.
  • State the target time and space complexity before coding.
  • Call out edge cases such as empty inputs, duplicates, invalid values, overflow, and boundary sizes.

What a Strong Answer Covers

  • A clear algorithm with the right data structures and enough pseudocode or code-level detail to implement it.
  • A correctness argument that explains why the algorithm covers all required cases.
  • Time and space complexity, plus at least one alternative approach when relevant.
  • Focused tests for normal cases, edge cases, and failure modes.

Follow-up Questions

  • How would the approach change if the input were streaming or too large for memory?
  • What invariants would you assert in production code?
  • Which tests would catch off-by-one, duplicate, or tie-breaking bugs?
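For the first follow-up, applied to the duplicate-file task, one common technique is to avoid holding file contents in memory: group files by size first, then hash only the files whose sizes collide, reading each in fixed-size chunks. The sketch below assumes a real directory tree rather than the LeetCode string format; `file_digest`, `find_duplicate_files`, and the 1 MiB chunk size are illustrative choices.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so memory use stays bounded."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicate_files(root):
    """Return groups of paths under root whose contents hash identically."""
    by_size = defaultdict(list)
    for dirpath, _dirs, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            by_size[os.path.getsize(path)].append(path)

    by_digest = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue               # a unique size cannot have a duplicate
        for path in paths:
            by_digest[(size, file_digest(path))].append(path)
    return [sorted(group) for group in by_digest.values() if len(group) > 1]
```

Matching digests are treated as matching content here. If hash collisions are a concern, a final byte-by-byte comparison within each group removes that assumption.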
