PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep

Quick Overview

This question evaluates implementation and system-design skills across two problems: an in-memory task management service (CRUD operations, prioritized listing, user quotas and assignment semantics, and recording task change history) and a file deduplication routine involving directory traversal and content-based file comparison.

  • medium
  • Anthropic
  • Coding & Algorithms
  • Software Engineer

Implement Task Management and Duplicate Detection

Company: Anthropic

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Technical Screen

You are given two independent coding tasks from a software engineering interview. ### Task 1: Build an in-memory task management system Implement an in-memory task management service. The interviewer will reveal requirements in levels; design your code so later levels can be added cleanly. **Level 1: Basic CRUD** - A task has at least: `task_id`, `description`, `priority`, `created_at`, and optionally an assigned user. - Implement operations to create, read, update, and delete tasks. - Task IDs are unique. Creating a duplicate task ID should fail or return an error. **Level 2: Sorting** - Implement an operation that lists all active tasks sorted by: 1. Higher priority first. 2. If priorities are equal, earlier creation order first. **Level 3: User quotas and assignment** - Add users. Each user has a `user_id` and a maximum quota of active assigned tasks. - Implement assigning a task to a user. - Assignment should fail if the user does not exist, the task does not exist, or the user would exceed their quota. - Define and handle reassignment and unassignment behavior clearly. **Level 4: History queries** - Record a history entry for every meaningful change, such as task creation, update, deletion, assignment, and unassignment. - Support querying the history of a task over time, returning events in chronological order. - You may assume all data is stored in memory unless the interviewer asks otherwise. ### Task 2: Find duplicate files Given a directory tree, implement a function that returns groups of files with identical contents. Each returned group should contain at least two file paths. You may assume access to helper APIs such as: - `list_directory(path) -> entries` - `is_file(path) -> bool` - `get_file_size(path) -> int` - `read_file(path) -> bytes` or a streaming equivalent Discuss edge cases such as empty files, very large files, symbolic links, unreadable files, and avoiding unnecessary reads when many files are unique.

Quick Answer: This question evaluates implementation and system-design skills across two problems: an in-memory task management service (CRUD operations, prioritized listing, user quotas and assignment semantics, and recording task change history) and a file deduplication routine involving directory traversal and content-based file comparison.

Part 1: Task Management - Level 1 Basic CRUD

Implement a simple in-memory task store. Each task has 'task_id', 'description', 'priority', 'created_at', and 'assigned_user' which may be None. Process a list of operations and return the result of each operation in order. Supported operations are: ('create', task_id, description, priority, created_at, assigned_user), ('read', task_id), ('update', task_id, description, priority, assigned_user), and ('delete', task_id). A task ID is globally unique and cannot be reused, even after deletion. 'update' must keep the original 'created_at' value unchanged.

Constraints

  • 0 <= len(operations) <= 100000
  • task_id values are strings and are globally unique across the whole input
  • priority and created_at are integers

Examples

Input: [('create', 't1', 'alpha', 3, 10, None), ('read', 't1'), ('update', 't1', 'beta', 5, 'u1'), ('read', 't1'), ('delete', 't1'), ('read', 't1')]

Expected Output: [True, ('t1', 'alpha', 3, 10, None), True, ('t1', 'beta', 5, 10, 'u1'), True, None]

Explanation: Basic create, read, update, delete flow. The update keeps created_at = 10.

Input: [('create', 't1', 'a', 1, 1, None), ('create', 't1', 'b', 2, 2, None), ('update', 't2', 'x', 3, None), ('delete', 't2')]

Expected Output: [True, False, False, False]

Explanation: Duplicate create fails, and operations on a missing task also fail.

Input: [('create', 'x', 'a', 1, 100, None), ('delete', 'x'), ('create', 'x', 'b', 2, 200, None)]

Expected Output: [True, True, False]

Explanation: Task IDs cannot be reused after deletion.

Input: []

Expected Output: []

Explanation: Edge case: no operations.

Hints

  1. A dictionary keyed by task_id gives O(1) average-time CRUD operations.
  2. Store created_at once on creation and reuse it during updates instead of replacing it.

Part 2: Task Management - Level 2 Sorted Active Task Listing

You are given operations on tasks. Each task has a unique ID and a priority. A task becomes active when it is successfully created and stops being active when deleted. Process the operations and return the result of every 'list' command. Active tasks must be sorted by higher priority first. If priorities are equal, earlier successful creation order comes first. Supported operations are: ('create', task_id, priority), ('update_priority', task_id, new_priority), ('delete', task_id), and ('list',). A task ID cannot be created more than once, even if it was deleted later.

Constraints

  • 0 <= len(operations) <= 100000
  • priority values are integers
  • Only successful 'create' operations assign creation order

Examples

Input: [('create', 't1', 2), ('create', 't2', 5), ('create', 't3', 5), ('list',), ('update_priority', 't1', 6), ('list',)]

Expected Output: [['t2', 't3', 't1'], ['t1', 't2', 't3']]

Explanation: Initially t2 and t3 tie on priority, so earlier creation t2 comes first. After updating t1 to 6, it becomes first.

Input: [('create', 'x', 1), ('delete', 'x'), ('create', 'x', 9), ('list',)]

Expected Output: [[]]

Explanation: The second create is ignored because IDs cannot be reused, so there are no active tasks.

Input: [('list',)]

Expected Output: [[]]

Explanation: Edge case: listing when there are no tasks.

Input: [('create', 'a', 2), ('update_priority', 'missing', 7), ('create', 'b', 2), ('list',)]

Expected Output: [['a', 'b']]

Explanation: Invalid updates are ignored. Tasks with equal priority are ordered by creation time.

Hints

  1. Store a monotonic creation counter when a task is first created successfully.
  2. For each listing, sort by a key like (-priority, creation_order).

Part 3: Task Management - Level 3 User Quotas and Assignment

Implement task assignment with user quotas. Process operations on users and tasks and return the result of each operation in order. Supported operations are: ('add_user', user_id, quota), ('add_task', task_id), ('assign', task_id, user_id), ('unassign', task_id), and ('get_user_tasks', user_id). Rules: adding a duplicate user or task fails; assigning fails if the user or task does not exist; assigning also fails if the target user would exceed their quota; assigning a task already owned by another user is treated as reassignment and should move the task if the target user has room; assigning to the same user succeeds and changes nothing; unassign succeeds only if the task exists and is currently assigned.

Constraints

  • 0 <= len(operations) <= 100000
  • 0 <= quota <= 100000
  • All task_id and user_id values are strings

Examples

Input: [('add_user', 'u1', 1), ('add_user', 'u2', 2), ('add_task', 't1'), ('add_task', 't2'), ('assign', 't1', 'u1'), ('assign', 't2', 'u1'), ('assign', 't2', 'u2'), ('get_user_tasks', 'u1'), ('get_user_tasks', 'u2')]

Expected Output: [True, True, True, True, True, False, True, ['t1'], ['t2']]

Explanation: u1 can hold only one task, so assigning t2 to u1 fails. Assigning t2 to u2 succeeds.

Input: [('add_user', 'u1', 2), ('add_user', 'u2', 1), ('add_task', 't1'), ('assign', 't1', 'u1'), ('assign', 't1', 'u2'), ('get_user_tasks', 'u1'), ('get_user_tasks', 'u2'), ('unassign', 't1'), ('unassign', 't1')]

Expected Output: [True, True, True, True, True, [], ['t1'], True, False]

Explanation: The second assignment reassigns t1 from u1 to u2. Unassigning again fails because the task is already unassigned.

Input: [('add_user', 'u0', 0), ('add_task', 't1'), ('assign', 't1', 'u0'), ('get_user_tasks', 'missing'), ('unassign', 't1')]

Expected Output: [True, True, False, None, False]

Explanation: A user with quota 0 cannot receive any tasks. Querying a missing user returns None.

Input: [('add_user', 'u1', 1), ('add_user', 'u1', 2), ('add_task', 't1'), ('add_task', 't1')]

Expected Output: [True, False, True, False]

Explanation: Duplicate users and duplicate tasks are rejected.

Hints

  1. Track both directions: task -> current user and user -> set of assigned tasks.
  2. Handle the three assignment cases separately: unassigned task, already assigned to same user, and reassignment to a different user.

Part 4: Task Management - Level 4 Task History Queries

Build task history tracking. Process operations and return the result of every 'history' query. Supported operations are: ('create', task_id, description, priority), ('update', task_id, description, priority), ('assign', task_id, user_id), ('unassign', task_id), ('delete', task_id), and ('history', task_id). Record a history entry for every successful meaningful change: create, update, assign, unassign, and delete. If a task is reassigned from one user to another, record ('unassigned', old_user) followed by ('assigned', new_user). Assigning a task to the same user is a no-op and records nothing. Invalid operations are ignored. Task IDs are globally unique and cannot be reused after deletion. History must still be available after a task is deleted.

Constraints

  • 0 <= len(operations) <= 100000
  • task_id and user_id are strings
  • Only successful operations generate history entries

Examples

Input: [('create', 't1', 'alpha', 1), ('update', 't1', 'beta', 3), ('assign', 't1', 'u1'), ('unassign', 't1'), ('delete', 't1'), ('history', 't1')]

Expected Output: [[('created', 'alpha', 1), ('updated', 'beta', 3), ('assigned', 'u1'), ('unassigned', 'u1'), ('deleted',)]]

Explanation: All successful state changes appear in order.

Input: [('create', 't2', 'x', 2), ('assign', 't2', 'alice'), ('assign', 't2', 'alice'), ('assign', 't2', 'bob'), ('history', 't2')]

Expected Output: [[('created', 'x', 2), ('assigned', 'alice'), ('unassigned', 'alice'), ('assigned', 'bob')]]

Explanation: Assigning to the same user is ignored. Reassignment creates unassign and assign events.

Input: [('update', 'missing', 'x', 1), ('history', 'missing'), ('create', 'k', 'a', 1), ('delete', 'k'), ('update', 'k', 'b', 2), ('history', 'k')]

Expected Output: [[], [('created', 'a', 1), ('deleted',)]]

Explanation: Invalid operations do not create history. History on a never-created task is empty.

Input: [('create', 'x', 'a', 1), ('delete', 'x'), ('create', 'x', 'b', 2), ('history', 'x')]

Expected Output: [[('created', 'a', 1), ('deleted',)]]

Explanation: The second create is ignored because task IDs cannot be reused.

Hints

  1. An append-only list per task_id is enough to preserve chronological order.
  2. Treat reassignment as two history events: unassign the old user, then assign the new one.

Part 5: Find Duplicate Files

You are given file records from a directory tree. Each record is a tuple (path, content, readable, is_symlink). Ignore unreadable files and symbolic links. Two files are duplicates if their byte contents are identical. Return all duplicate groups, where each group contains at least two file paths. Inside each group, sort paths lexicographically. Sort the list of groups by the first path in each group. For efficiency, first group files by size before comparing exact content.

Constraints

  • 0 <= len(files) <= 100000
  • The sum of all content lengths is at most 10^6 bytes in the test data
  • Each path is unique

Examples

Input: [('/a.txt', b'abc', True, False), ('/b.txt', b'abc', True, False), ('/c.txt', b'', True, False), ('/d.txt', b'', True, False), ('/e.txt', b'xyz', True, False)]

Expected Output: [['/a.txt', '/b.txt'], ['/c.txt', '/d.txt']]

Explanation: There are two duplicate-content groups: one for b'abc' and one for empty files.

Input: [('/a', b'x', True, False), ('/b', b'x', False, False), ('/c', b'x', True, False), ('/d', b'x', True, True)]

Expected Output: [['/a', '/c']]

Explanation: Unreadable files and symlinks are ignored, so only /a and /c count.

Input: []

Expected Output: []

Explanation: Edge case: no files.

Input: [('/a', b'ab', True, False), ('/b', b'cd', True, False), ('/c', b'ab', True, False)]

Expected Output: [['/a', '/c']]

Explanation: Files /a and /c have identical content. /b has the same size but different bytes.

Hints

  1. Files with different lengths can never be duplicates, so size is a useful first filter.
  2. Choose a deterministic ordering for both paths within a group and the groups themselves.
Last updated: May 30, 2026

Loading coding console...

PracHub

Master your tech interviews with 8,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Implement Persistent Memoization LRU Cache - Anthropic (hard)
  • Fix a Corrupted Bootloader Instruction - Anthropic (medium)
  • Implement a Time-Aware Task Manager - Anthropic (hard)
  • Implement a Simplified DNS Resolver - Anthropic (hard)
  • Implement a hierarchical file store - Anthropic (hard)