Implement a Parallel Image Processor
Company: Anthropic
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Technical Screen
Quick Answer: This question tests process-based parallelism, batch image-processing pipelines, correct application of transforms, and per-job fault isolation.
Part 1: Sequential Virtual Image Processor
Constraints
- `0 <= len(jobs) <= 1000`
- `0 <= len(operations) <= 20` per job
- All input images use only `'RGB'` or `'GRAY'` modes
- For `('scale', f)`, use `int(width * f)` and `int(height * f)` exactly
- A failed job must not write `output_path`
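
The constraints above can be sketched as a single helper that applies one job's operations to a virtual image. This is a minimal sketch, not a reference solution: the name `apply_operations`, the choice to return an error-code string on failure, and the treatment of malformed arguments as `invalid_operation` are assumptions inferred from the examples.

```python
from typing import Sequence, Tuple, Union

Image = Tuple[int, int, str]   # (width, height, mode)
Outcome = Union[Image, str]    # transformed image, or an error-code string


def apply_operations(image: Image, operations: Sequence[tuple]) -> Outcome:
    """Apply operations in order; stop at the first failure."""
    width, height, mode = image
    for op in operations:
        name, args = (op[0], op[1:]) if op else (None, ())
        if name == "grayscale" and not args:
            mode = "GRAY"
        elif name == "scale" and len(args) == 1:
            # The constraint asks for int(width * f) and int(height * f) exactly.
            width, height = int(width * args[0]), int(height * args[0])
        elif name == "resize" and len(args) == 2:
            width, height = args
        else:
            # Assumption: unknown names and wrong argument counts share one code.
            return "invalid_operation"
        if width <= 0 or height <= 0:
            return "invalid_dimensions"
    return (width, height, mode)
```

For instance, `apply_operations((7, 5, 'RGB'), [('scale', 1.5)])` returns `(10, 7, 'RGB')`, because `int()` truncates `10.5` and `7.5`.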
Examples
Input: ({'a': (10, 20, 'RGB')}, [{'input_path': 'a', 'output_path': 'b', 'operations': [('grayscale',), ('scale', 0.5)]}, {'input_path': 'b', 'output_path': 'c', 'operations': [('resize', 4, 6)]}])
Expected Output: ([('ok', 'b', (5, 10, 'GRAY')), ('ok', 'c', (4, 6, 'GRAY'))], {'a': (10, 20, 'RGB'), 'b': (5, 10, 'GRAY'), 'c': (4, 6, 'GRAY')})
Explanation: The second job reads the file produced by the first job.
Input: ({}, [{'input_path': 'missing', 'output_path': 'out', 'operations': [('grayscale',)]}])
Expected Output: ([('error', 'missing_input')], {})
Explanation: The input file does not exist.
Input: ({'x': (3, 3, 'RGB')}, [{'input_path': 'x', 'output_path': 'y', 'operations': [('scale', 2), ('rotate', 90)]}])
Expected Output: ([('error', 'invalid_operation')], {'x': (3, 3, 'RGB')})
Explanation: `rotate` is not a known operation, so the whole job fails and `y` is not written.
Input: ({'p': (1, 1, 'RGB')}, [{'input_path': 'p', 'output_path': 'q', 'operations': [('scale', 0.4)]}, {'input_path': 'p', 'output_path': 'r', 'operations': [('resize', 2, 2)]}])
Expected Output: ([('error', 'invalid_dimensions'), ('ok', 'r', (2, 2, 'RGB'))], {'p': (1, 1, 'RGB'), 'r': (2, 2, 'RGB')})
Explanation: Scaling by `0.4` would make the image `0 x 0`, so the first job fails. The second job still succeeds.
Input: ({'solo': (7, 8, 'GRAY')}, [])
Expected Output: ([], {'solo': (7, 8, 'GRAY')})
Explanation: No jobs means the file system is unchanged.
Hints
- Keep a mutable copy of the file system and update it only after a job fully succeeds.
- Since jobs are sequential, an output from one successful job can become the input to a later job.
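
One way to put these two hints together is sketched below. The function names `transform` and `process_sequential` and the job-dict keys read from the examples are assumptions; the error precedence (missing input is checked before any operation) is also inferred from the examples rather than stated.

```python
from typing import Dict, List, Tuple

Image = Tuple[int, int, str]  # (width, height, mode)


def transform(image, operations):
    """Return the transformed image, or an error-code string on failure."""
    width, height, mode = image
    for op in operations:
        name, args = (op[0], op[1:]) if op else (None, ())
        if name == "grayscale" and not args:
            mode = "GRAY"
        elif name == "scale" and len(args) == 1:
            width, height = int(width * args[0]), int(height * args[0])
        elif name == "resize" and len(args) == 2:
            width, height = args
        else:
            return "invalid_operation"
        if width <= 0 or height <= 0:
            return "invalid_dimensions"
    return (width, height, mode)


def process_sequential(files: Dict[str, Image], jobs: List[dict]):
    current = dict(files)  # mutable copy; the caller's map is left untouched
    results = []
    for job in jobs:
        source = current.get(job["input_path"])
        if source is None:
            results.append(("error", "missing_input"))
            continue
        outcome = transform(source, job["operations"])
        if isinstance(outcome, str):
            results.append(("error", outcome))  # failed job: nothing is written
            continue
        # Commit only after every operation succeeded, so later jobs can read it.
        current[job["output_path"]] = outcome
        results.append(("ok", job["output_path"], outcome))
    return results, current
```

Because the write happens only after `transform` returns an image, a job that fails on its last operation leaves the file map exactly as it found it.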
Part 2: Parallel Virtual Image Processor with Failure Isolation
Constraints
- `0 <= len(jobs) <= 100000`
- `0 <= len(operations) <= 20` per job
- Jobs are independent and must read only from the original `files` map
- If `max_workers` is `None`, a typical choice is `min(cpu_count, number_of_jobs)`
- If two successful jobs write the same `output_path`, the later job in input order wins in `final_files`
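
The last two constraints can be illustrated in isolation. The helper names `resolve_worker_count` and `merge_outputs` are hypothetical, and clamping the default to at least 1 is an assumption added here because `min(cpu_count, number_of_jobs)` is 0 when there are no jobs, which Python's executors reject.

```python
import os


def resolve_worker_count(max_workers, job_count):
    """Pick a pool size; an explicit max_workers is used as given."""
    if max_workers is not None:
        return max_workers
    # os.cpu_count() may return None, and a pool needs at least one worker.
    return max(1, min(os.cpu_count() or 1, job_count))


def merge_outputs(files, results):
    """Build final_files from results that are already in input order."""
    final_files = dict(files)  # start from the original map, never mutate it
    for result in results:
        if result[0] == "ok":
            _, output_path, image = result
            final_files[output_path] = image  # a later job overwrites an earlier one
    return final_files
```

Merging in the parent, in input order, makes the "later job wins" rule independent of which worker happened to finish first.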
Examples
Input: ({'a': (10, 20, 'RGB')}, [{'input_path': 'a', 'output_path': 'b', 'operations': [('grayscale',)]}, {'input_path': 'b', 'output_path': 'c', 'operations': [('resize', 4, 4)]}], 2)
Expected Output: ([('ok', 'b', (10, 20, 'GRAY')), ('error', 'missing_input')], {'a': (10, 20, 'RGB'), 'b': (10, 20, 'GRAY')})
Explanation: Unlike Part 1, jobs read only from the original `files` map, so the second job cannot see `b` and fails with `missing_input`.
Input: ({'x': (4, 4, 'RGB'), 'y': (8, 2, 'RGB')}, [{'input_path': 'x', 'output_path': 'z', 'operations': [('scale', 2)]}, {'input_path': 'y', 'output_path': 'z', 'operations': [('grayscale',)]}], 2)
Expected Output: ([('ok', 'z', (8, 8, 'RGB')), ('ok', 'z', (8, 2, 'GRAY'))], {'x': (4, 4, 'RGB'), 'y': (8, 2, 'RGB'), 'z': (8, 2, 'GRAY')})
Input: ({'p': (1, 1, 'RGB'), 'q': (2, 3, 'RGB')}, [{'input_path': 'p', 'output_path': 'bad', 'operations': [('scale', 0.4)]}, {'input_path': 'q', 'output_path': 'good', 'operations': [('resize', 5, 1), ('grayscale',)]}], 3)
Expected Output: ([('error', 'invalid_dimensions'), ('ok', 'good', (5, 1, 'GRAY'))], {'p': (1, 1, 'RGB'), 'q': (2, 3, 'RGB'), 'good': (5, 1, 'GRAY')})
Explanation: The first job fails because `int(1 * 0.4)` is `0`; the failure does not affect the second job, which commits `good`.
Input: ({'img': (2, 2, 'GRAY')}, [], 4)
Expected Output: ([], {'img': (2, 2, 'GRAY')})
Input: ({}, [], None)
Expected Output: ([], {})
Explanation: With no files and no jobs, both the results list and the final file map are empty.
Input: ({'p': (8, 8, 'RGB')}, [{'input_path': 'missing', 'output_path': 'a', 'operations': []}, {'input_path': 'p', 'output_path': 'b', 'operations': [('rotate', 90)]}, {'input_path': 'p', 'output_path': 'c', 'operations': [('resize', 0, 5)]}, {'input_path': 'p', 'output_path': 'e', 'operations': [('grayscale',), ('scale', 2)]}], 1)
Expected Output: ([('error', 'missing_input'), ('error', 'invalid_operation'), ('error', 'invalid_dimensions'), ('ok', 'e', (16, 16, 'GRAY'))], {'p': (8, 8, 'RGB'), 'e': (16, 16, 'GRAY')})
Explanation: The `missing_input`, `invalid_operation`, and `invalid_dimensions` failures are isolated to their own jobs; the one valid job still commits.
Input: ({'x': (10, 10, 'RGB'), 'y': (4, 6, 'GRAY')}, [{'input_path': 'x', 'output_path': 'out', 'operations': [('scale', 2)]}, {'input_path': 'y', 'output_path': 'out', 'operations': [('resize', 3, 3)]}], None)
Expected Output: ([('ok', 'out', (20, 20, 'RGB')), ('ok', 'out', (3, 3, 'GRAY'))], {'x': (10, 10, 'RGB'), 'y': (4, 6, 'GRAY'), 'out': (3, 3, 'GRAY')})
Explanation: Both jobs succeed and write the same `output_path`; the later job in input order wins in `final_files`.
Input: ({'base': (7, 5, 'RGB')}, [{'input_path': 'base', 'output_path': 'base', 'operations': [('scale', 1.5)]}], 2)
Expected Output: ([('ok', 'base', (10, 7, 'RGB'))], {'base': (10, 7, 'RGB')})
Explanation: `scale` truncates each dimension with `int()` (`7 * 1.5 = 10.5` becomes `10`, `5 * 1.5 = 7.5` becomes `7`), and a job may overwrite its own `input_path` in `final_files`.
Hints
- Carry each job's original index so you can rebuild results in input order even if workers finish out of order.
- Avoid shared mutable state across processes; let workers compute results independently, then merge successful writes in the parent process.
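
The two hints above can be combined as in the following sketch, which uses `concurrent.futures.ProcessPoolExecutor` from the standard library. The names `transform`, `run_job`, and `process_parallel` are assumptions, as is the `executor_cls` parameter, which is included only so that another executor can be substituted. Each worker receives its job's index and input image, returns a result, and never touches a shared map; the parent rebuilds input order and merges.

```python
import os
from concurrent.futures import ProcessPoolExecutor, as_completed


def transform(image, operations):
    """Return the transformed (width, height, mode), or an error-code string."""
    width, height, mode = image
    for op in operations:
        name, args = (op[0], op[1:]) if op else (None, ())
        if name == "grayscale" and not args:
            mode = "GRAY"
        elif name == "scale" and len(args) == 1:
            width, height = int(width * args[0]), int(height * args[0])
        elif name == "resize" and len(args) == 2:
            width, height = args
        else:
            return "invalid_operation"
        if width <= 0 or height <= 0:
            return "invalid_dimensions"
    return (width, height, mode)


def run_job(index, source, job):
    """Worker entry point: a pure function of its arguments."""
    if source is None:
        return index, ("error", "missing_input")
    outcome = transform(source, job["operations"])
    if isinstance(outcome, str):
        return index, ("error", outcome)
    return index, ("ok", job["output_path"], outcome)


def process_parallel(files, jobs, max_workers=None, executor_cls=ProcessPoolExecutor):
    if not jobs:
        return [], dict(files)  # nothing to do; avoid starting a pool
    if max_workers is None:
        max_workers = min(os.cpu_count() or 1, len(jobs))
    results = [None] * len(jobs)
    with executor_cls(max_workers=max_workers) as pool:
        # Inputs come from the ORIGINAL map, so jobs never see each other's output.
        futures = [pool.submit(run_job, i, files.get(job["input_path"]), job)
                   for i, job in enumerate(jobs)]
        for future in as_completed(futures):  # completion order is arbitrary
            index, result = future.result()
            results[index] = result  # the carried index restores input order
    final_files = dict(files)
    for result in results:  # input order, so the later successful writer wins
        if result[0] == "ok":
            final_files[result[1]] = result[2]
    return results, final_files
```

On platforms that start workers by spawning a fresh interpreter, call `process_parallel` from under an `if __name__ == "__main__":` guard. For job lists near the upper bound, `Executor.map` with a `chunksize` is an alternative that returns results in input order and reduces per-task overhead between processes.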