Implement a Parallel Image Processor
Company: Anthropic
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Technical Screen
Quick Answer: This question evaluates understanding of parallel and process-based concurrency, batch image-processing pipelines, transform correctness, and per-job fault isolation within the Coding & Algorithms domain.
Part 1: Sequential Virtual Image Processor
Constraints
- `0 <= len(jobs) <= 1000`
- `0 <= len(operations) <= 20` per job
- All input images use only `'RGB'` or `'GRAY'` modes
- For `('scale', f)`, compute the new size exactly as `int(width * f)` and `int(height * f)`; `int` truncates toward zero, so do not round
- A failed job must not write `output_path`
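The constraints above can be sketched as one transform helper. This is a minimal sketch, not a reference solution: the names `apply_operations` and `JobError` are hypothetical, the operation shapes and error codes are inferred from the examples, and treating any non-positive width or height as `invalid_dimensions` is an assumption.

```python
from typing import Sequence, Tuple

Image = Tuple[int, int, str]  # (width, height, mode)


class JobError(Exception):
    """Hypothetical exception whose message is one of the example error codes."""


def apply_operations(image: Image, operations: Sequence[tuple]) -> Image:
    """Apply operations in order; raise JobError at the first one that fails."""
    width, height, mode = image
    for op in operations:
        name = op[0] if op else None
        if name == "grayscale" and len(op) == 1:
            mode = "GRAY"
        elif name == "scale" and len(op) == 2:
            # int() truncates, matching the constraint on ('scale', f).
            width, height = int(width * op[1]), int(height * op[1])
        elif name == "resize" and len(op) == 3:
            width, height = op[1], op[2]
        else:
            raise JobError("invalid_operation")
        # Assumption: any non-positive dimension fails the job.
        if width <= 0 or height <= 0:
            raise JobError("invalid_dimensions")
    return (width, height, mode)
```

Because the helper only returns a new tuple and never touches the file map, a caller can skip the write to `output_path` whenever `JobError` is raised.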
Examples
Input: ({'a': (10, 20, 'RGB')}, [{'input_path': 'a', 'output_path': 'b', 'operations': [('grayscale',), ('scale', 0.5)]}, {'input_path': 'b', 'output_path': 'c', 'operations': [('resize', 4, 6)]}])
Expected Output: ([('ok', 'b', (5, 10, 'GRAY')), ('ok', 'c', (4, 6, 'GRAY'))], {'a': (10, 20, 'RGB'), 'b': (5, 10, 'GRAY'), 'c': (4, 6, 'GRAY')})
Explanation: The second job reads the file produced by the first job.
Input: ({}, [{'input_path': 'missing', 'output_path': 'out', 'operations': [('grayscale',)]}])
Expected Output: ([('error', 'missing_input')], {})
Explanation: The input file does not exist.
Input: ({'x': (3, 3, 'RGB')}, [{'input_path': 'x', 'output_path': 'y', 'operations': [('scale', 2), ('rotate', 90)]}])
Expected Output: ([('error', 'invalid_operation')], {'x': (3, 3, 'RGB')})
Explanation: `rotate` is not a known operation, so the whole job fails even though the preceding `scale` was valid, and `y` is not written.
Input: ({'p': (1, 1, 'RGB')}, [{'input_path': 'p', 'output_path': 'q', 'operations': [('scale', 0.4)]}, {'input_path': 'p', 'output_path': 'r', 'operations': [('resize', 2, 2)]}])
Expected Output: ([('error', 'invalid_dimensions'), ('ok', 'r', (2, 2, 'RGB'))], {'p': (1, 1, 'RGB'), 'r': (2, 2, 'RGB')})
Explanation: Scaling by `0.4` would make the image `0 x 0`, so the first job fails. The second job still succeeds.
Input: ({'solo': (7, 8, 'GRAY')}, [])
Expected Output: ([], {'solo': (7, 8, 'GRAY')})
Explanation: No jobs means the file system is unchanged.
Hints
- Keep a mutable copy of the file system and update it only after a job fully succeeds.
- Since jobs are sequential, an output from one successful job can become the input to a later job.
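The two hints above can be sketched as follows. This is one possible shape under stated assumptions: the function names `run_job` and `process_sequential` are hypothetical, the result tuples and error codes mirror the examples, and non-positive dimensions are assumed to mean `invalid_dimensions`.

```python
def run_job(files, job):
    """Return ('ok', output_path, image) or ('error', code); never mutates files."""
    if job["input_path"] not in files:
        return ("error", "missing_input")
    width, height, mode = files[job["input_path"]]
    for op in job["operations"]:
        if op == ("grayscale",):
            mode = "GRAY"
        elif len(op) == 2 and op[0] == "scale":
            width, height = int(width * op[1]), int(height * op[1])
        elif len(op) == 3 and op[0] == "resize":
            width, height = op[1], op[2]
        else:
            return ("error", "invalid_operation")
        if width <= 0 or height <= 0:  # assumption: non-positive sizes are invalid
            return ("error", "invalid_dimensions")
    return ("ok", job["output_path"], (width, height, mode))


def process_sequential(files, jobs):
    current = dict(files)  # mutable copy; the caller's map stays untouched
    results = []
    for job in jobs:
        result = run_job(current, job)
        if result[0] == "ok":
            # Commit only after every operation succeeded, so later jobs
            # can read this output but never see a half-finished one.
            current[result[1]] = result[2]
        results.append(result)
    return results, current
```

Working on local variables inside `run_job` and writing to `current` in exactly one place keeps the "failed job writes nothing" rule easy to verify.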
Part 2: Parallel Virtual Image Processor with Failure Isolation
Constraints
- `0 <= len(jobs) <= 100000`
- `0 <= len(operations) <= 20` per job
- Jobs are independent and must read only from the original `files` map
- If `max_workers` is `None`, pick a default; a typical choice is `min(cpu_count, number_of_jobs)`, kept at 1 or more so that an empty batch does not request zero workers
- If two successful jobs write the same `output_path`, the later job in input order wins in `final_files`
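The `max_workers` default mentioned in the constraints can be sketched like this. The helper name `resolve_worker_count` is hypothetical, and clamping the result to at least 1 is an assumption made here because `ProcessPoolExecutor` rejects a worker count of zero.

```python
import os


def resolve_worker_count(max_workers, number_of_jobs):
    """Pick a pool size; os.cpu_count() may return None, so fall back to 1."""
    if max_workers is None:
        max_workers = min(os.cpu_count() or 1, number_of_jobs)
    # Assumption: never ask for fewer than one worker, even for an empty batch.
    return max(1, max_workers)
```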
Examples
Input: ({'a': (10, 20, 'RGB')}, [{'input_path': 'a', 'output_path': 'b', 'operations': [('grayscale',)]}, {'input_path': 'b', 'output_path': 'c', 'operations': [('resize', 4, 4)]}], 2)
Expected Output: ([('ok', 'b', (10, 20, 'GRAY')), ('error', 'missing_input')], {'a': (10, 20, 'RGB'), 'b': (10, 20, 'GRAY')})
Explanation: Jobs are independent, so the second job cannot read `b` from the first job's output.
Input: ({'x': (4, 4, 'RGB'), 'y': (8, 2, 'RGB')}, [{'input_path': 'x', 'output_path': 'z', 'operations': [('scale', 2)]}, {'input_path': 'y', 'output_path': 'z', 'operations': [('grayscale',)]}], 2)
Expected Output: ([('ok', 'z', (8, 8, 'RGB')), ('ok', 'z', (8, 2, 'GRAY'))], {'x': (4, 4, 'RGB'), 'y': (8, 2, 'RGB'), 'z': (8, 2, 'GRAY')})
Explanation: Both jobs succeed, but the later one wins for `z` when outputs are committed in input order.
Input: ({'p': (1, 1, 'RGB'), 'q': (2, 3, 'RGB')}, [{'input_path': 'p', 'output_path': 'bad', 'operations': [('scale', 0.4)]}, {'input_path': 'q', 'output_path': 'good', 'operations': [('resize', 5, 1), ('grayscale',)]}], 3)
Expected Output: ([('error', 'invalid_dimensions'), ('ok', 'good', (5, 1, 'GRAY'))], {'p': (1, 1, 'RGB'), 'q': (2, 3, 'RGB'), 'good': (5, 1, 'GRAY')})
Explanation: Scaling `p` by `0.4` would produce a `0 x 0` image, so the first job fails. The failure is isolated, and the second job still succeeds.
Input: ({'img': (2, 2, 'GRAY')}, [], 4)
Expected Output: ([], {'img': (2, 2, 'GRAY')})
Explanation: An empty batch produces no results and leaves the file system unchanged.
Hints
- Carry each job's original index so you can rebuild results in input order even if workers finish out of order.
- Avoid shared mutable state across processes; let workers compute results independently, then merge successful writes in the parent process.
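Both hints can be combined in a sketch built on `concurrent.futures`. Everything named here is an assumption rather than the expected answer: `run_job`, `process_parallel`, the `executor_cls` parameter (added so the same logic can run on threads in a test), and the catch-all code `internal_error`, which the examples do not define.

```python
import os
from concurrent.futures import ProcessPoolExecutor, as_completed


def run_job(index, image, output_path, operations):
    """Worker: depends only on its arguments, so jobs share no mutable state."""
    try:
        if image is None:
            return index, ("error", "missing_input")
        width, height, mode = image
        for op in operations:
            if op == ("grayscale",):
                mode = "GRAY"
            elif len(op) == 2 and op[0] == "scale":
                width, height = int(width * op[1]), int(height * op[1])
            elif len(op) == 3 and op[0] == "resize":
                width, height = op[1], op[2]
            else:
                return index, ("error", "invalid_operation")
            if width <= 0 or height <= 0:  # assumption: non-positive is invalid
                return index, ("error", "invalid_dimensions")
        return index, ("ok", output_path, (width, height, mode))
    except Exception:
        # Hypothetical catch-all so one malformed job cannot fail the batch.
        return index, ("error", "internal_error")


def process_parallel(files, jobs, max_workers=None, executor_cls=ProcessPoolExecutor):
    if not jobs:
        return [], dict(files)
    if max_workers is None:
        max_workers = min(os.cpu_count() or 1, len(jobs))
    results = [None] * len(jobs)
    with executor_cls(max_workers=max_workers) as pool:
        # Inputs are looked up in the ORIGINAL map in the parent, so a job
        # can never observe another job's output.
        futures = [
            pool.submit(run_job, i, files.get(job["input_path"]),
                        job["output_path"], job["operations"])
            for i, job in enumerate(jobs)
        ]
        for future in as_completed(futures):  # completion order is arbitrary
            index, result = future.result()
            results[index] = result  # the carried index restores input order
    final_files = dict(files)
    for result in results:  # merged in input order: a later job overwrites
        if result[0] == "ok":
            final_files[result[1]] = result[2]
    return results, final_files
```

With the default `ProcessPoolExecutor`, call `process_parallel` from under an `if __name__ == "__main__":` guard on platforms that start workers by spawning a fresh interpreter. The sketch does not handle a worker process that dies outright; in that case `future.result()` raises `BrokenProcessPool`.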