PracHub

Quick Overview

This question evaluates understanding of parallel and process-based concurrency, batch image-processing pipelines, transform correctness, and per-job fault isolation within the Coding & Algorithms domain.

  • medium
  • Anthropic
  • Coding & Algorithms
  • Software Engineer

Implement a Parallel Image Processor

Company: Anthropic

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Technical Screen

You are building a batch image-processing utility. Implement a function that receives a list of jobs. Each job contains:

  • `input_path`: the source image file
  • `output_path`: where to write the processed image
  • `operations`: an ordered list of image transformations

Support these operations:

  1. `grayscale`: convert the image to grayscale.
  2. `scale(f)`: multiply both width and height by a positive scaling factor `f`.
  3. `resize(w, h)`: resize the image to exactly `w x h` pixels.

For each job, load the image, apply the operations in order, and save the final result to `output_path`.

Follow-up:

  1. First, implement a correct solution for a small number of images.
  2. Then improve the implementation for a very large batch. Assume image transformations are CPU-intensive and that jobs are independent. Use process-based parallelism to increase throughput, while still returning one result record per input job and handling per-file failures without stopping the entire batch.

Explain any important design choices, such as worker-count selection, memory considerations, and how you would preserve correctness while improving performance.

Part 1: Sequential Virtual Image Processor

You are given a virtual file system of images. Each image is stored only as metadata `(width, height, mode)`, where `mode` is either `'RGB'` or `'GRAY'`. Implement a correct sequential batch processor for a small number of jobs. Each job contains:

  • `input_path`: the path to read
  • `output_path`: the path to write
  • `operations`: an ordered list of transformations

Supported operations:

  • `('grayscale',)` -> change the mode to `'GRAY'`
  • `('scale', f)` -> set `width = int(width * f)` and `height = int(height * f)`
  • `('resize', w, h)` -> set the size to exactly `(w, h)`

Jobs run strictly in order. A successful job writes its output into the file system, so later jobs may use earlier outputs as inputs. If a job fails, it must not modify the file system, and processing continues with the next job.

A job fails with:

  • `'missing_input'` if `input_path` does not exist at the time the job starts
  • `'invalid_operation'` if an operation is unknown or malformed
  • `'invalid_dimensions'` if scaling/resizing would produce a width or height less than 1, or if the scale factor is not positive

As the examples below show, the function returns a pair: the list of result records (one per job, in order) and the final file system. A successful job's record is `('ok', output_path, (width, height, mode))`; a failed job's record is `('error', reason)`.
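The operation rules above can be sketched as one pure function that takes image metadata and returns new metadata. This is a minimal sketch, not a reference solution: the names `JobError` and `apply_operations` are hypothetical, and several choices are assumptions the problem does not spell out. Operations are checked in order so the first failure decides the error code, a boolean or non-finite scale factor is treated as malformed, and `resize` accepts only integers.

```python
import math


class JobError(ValueError):
    """Hypothetical exception whose message is one of the failure codes."""


def _is_real_number(value):
    # Assumption: bool (a subclass of int) and NaN/inf count as malformed.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    return math.isfinite(value)


def _is_plain_int(value):
    return isinstance(value, int) and not isinstance(value, bool)


def apply_operations(image, operations):
    """Return new (width, height, mode) metadata or raise JobError."""
    width, height, mode = image
    for op in operations:
        if not isinstance(op, (tuple, list)) or not op:
            raise JobError("invalid_operation")
        name, args = op[0], op[1:]
        if name == "grayscale" and not args:
            mode = "GRAY"
        elif name == "scale" and len(args) == 1 and _is_real_number(args[0]):
            factor = args[0]
            if factor <= 0:
                raise JobError("invalid_dimensions")
            # The statement requires int() truncation, not rounding.
            width, height = int(width * factor), int(height * factor)
        elif name == "resize" and len(args) == 2 and all(map(_is_plain_int, args)):
            width, height = args
        else:
            raise JobError("invalid_operation")
        # Check after every operation so a later op cannot hide a bad size.
        if width < 1 or height < 1:
            raise JobError("invalid_dimensions")
    return (width, height, mode)
```

Because the function never touches the file system, a caller can decide separately whether and when to write the result, which is what makes the "failed jobs must not write" rule easy to enforce.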

Constraints

  • `0 <= len(jobs) <= 1000`
  • `0 <= len(operations) <= 20` per job
  • All input images use only `'RGB'` or `'GRAY'` modes
  • For `('scale', f)`, use `int(width * f)` and `int(height * f)` exactly
  • A failed job must not write `output_path`

Examples

Input: ({'a': (10, 20, 'RGB')}, [{'input_path': 'a', 'output_path': 'b', 'operations': [('grayscale',), ('scale', 0.5)]}, {'input_path': 'b', 'output_path': 'c', 'operations': [('resize', 4, 6)]}])

Expected Output: ([('ok', 'b', (5, 10, 'GRAY')), ('ok', 'c', (4, 6, 'GRAY'))], {'a': (10, 20, 'RGB'), 'b': (5, 10, 'GRAY'), 'c': (4, 6, 'GRAY')})

Explanation: The second job reads the file produced by the first job.

Input: ({}, [{'input_path': 'missing', 'output_path': 'out', 'operations': [('grayscale',)]}])

Expected Output: ([('error', 'missing_input')], {})

Explanation: The input file does not exist.

Input: ({'x': (3, 3, 'RGB')}, [{'input_path': 'x', 'output_path': 'y', 'operations': [('scale', 2), ('rotate', 90)]}])

Expected Output: ([('error', 'invalid_operation')], {'x': (3, 3, 'RGB')})

Explanation: An unknown operation causes the whole job to fail, and `y` is not written.

Input: ({'p': (1, 1, 'RGB')}, [{'input_path': 'p', 'output_path': 'q', 'operations': [('scale', 0.4)]}, {'input_path': 'p', 'output_path': 'r', 'operations': [('resize', 2, 2)]}])

Expected Output: ([('error', 'invalid_dimensions'), ('ok', 'r', (2, 2, 'RGB'))], {'p': (1, 1, 'RGB'), 'r': (2, 2, 'RGB')})

Explanation: Scaling by `0.4` would make the image `0 x 0`, so the first job fails. The second job still succeeds.

Input: ({'solo': (7, 8, 'GRAY')}, [])

Expected Output: ([], {'solo': (7, 8, 'GRAY')})

Explanation: No jobs means the file system is unchanged.

Hints

  1. Keep a mutable copy of the file system and update it only after a job fully succeeds.
  2. Since jobs are sequential, an output from one successful job can become the input to a later job.
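The two hints can be sketched as a loop that works on a copy of the file system and commits a write only after the whole job has succeeded. This is an illustrative sketch under assumptions: `process_sequential` is a hypothetical name, and the transformation logic is passed in as a `transform(image, operations)` callable that returns new metadata or raises `ValueError` with the failure code as its message. `demo_transform` is a deliberately tiny stand-in that understands only `('grayscale',)`.

```python
def process_sequential(files, jobs, transform):
    """Run jobs in order; return (results, final_files)."""
    current = dict(files)  # mutable copy, so the caller's map is untouched
    results = []
    for job in jobs:
        # Look up the input at the moment the job starts, so outputs of
        # earlier successful jobs are visible to later ones.
        if job["input_path"] not in current:
            results.append(("error", "missing_input"))
            continue
        try:
            output = transform(current[job["input_path"]], job["operations"])
        except ValueError as exc:
            # The job failed part-way: nothing has been written yet.
            results.append(("error", str(exc)))
            continue
        # Commit only after every operation succeeded.
        current[job["output_path"]] = output
        results.append(("ok", job["output_path"], output))
    return results, current


def demo_transform(image, operations):
    """Hypothetical stand-in: supports only ('grayscale',)."""
    width, height, mode = image
    for op in operations:
        if op != ("grayscale",):
            raise ValueError("invalid_operation")
        mode = "GRAY"
    return (width, height, mode)
```

Keeping the transform separate from the loop means the same loop can be reused with a full implementation of `grayscale`, `scale`, and `resize` without changing the commit logic.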

Part 2: Parallel Virtual Image Processor with Failure Isolation

You now need to handle a very large batch of independent image-processing jobs. Each image is still represented only by metadata `(width, height, mode)`, where `mode` is `'RGB'` or `'GRAY'`. Unlike Part 1, every job must read from the original `files` snapshot only. No job may depend on the output of another job.

Implement a process-based batch processor that:

  • uses process parallelism for throughput
  • returns one result record per input job
  • preserves the original job order in the returned results
  • isolates failures so one bad job does not stop the rest
  • returns a deterministic final file map

Transformations are the same:

  • `('grayscale',)`
  • `('scale', f)` -> `width = int(width * f)`, `height = int(height * f)`
  • `('resize', w, h)`

Failure rules are also the same:

  • `'missing_input'`
  • `'invalid_operation'`
  • `'invalid_dimensions'`

To keep the final file map deterministic, successful outputs are committed in input order after processing completes. If multiple successful jobs write the same `output_path`, the later job in the input list wins.

Constraints

  • `0 <= len(jobs) <= 100000`
  • `0 <= len(operations) <= 20` per job
  • Jobs are independent and must read only from the original `files` map
  • If `max_workers` is `None`, a typical choice is `min(cpu_count, number_of_jobs)`
  • If two successful jobs write the same `output_path`, the later job in input order wins in `final_files`
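The worker-count constraint can be sketched as a small helper. This is one possible policy, not a prescribed one: `choose_worker_count` is a hypothetical name, and capping an explicit `max_workers` at the number of jobs and rejecting values below 1 are assumptions that go beyond the constraint as stated.

```python
import os


def choose_worker_count(num_jobs, max_workers=None):
    """Pick a pool size (hypothetical helper, not part of the problem's API)."""
    if max_workers is not None and max_workers < 1:
        raise ValueError("max_workers must be at least 1")
    # os.cpu_count() may return None when the count cannot be determined.
    limit = max_workers if max_workers is not None else (os.cpu_count() or 1)
    # Never start more workers than there are jobs, but keep at least one
    # so the result is always a valid pool size, even for an empty batch.
    return max(1, min(limit, num_jobs))
```

For CPU-bound work, going beyond the number of cores generally adds scheduling and memory overhead without adding throughput, which is why the CPU count is a common default upper bound.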

Examples

Input: ({'a': (10, 20, 'RGB')}, [{'input_path': 'a', 'output_path': 'b', 'operations': [('grayscale',)]}, {'input_path': 'b', 'output_path': 'c', 'operations': [('resize', 4, 4)]}], 2)

Expected Output: ([('ok', 'b', (10, 20, 'GRAY')), ('error', 'missing_input')], {'a': (10, 20, 'RGB'), 'b': (10, 20, 'GRAY')})

Explanation: Jobs are independent, so the second job cannot read `b` from the first job's output.

Input: ({'x': (4, 4, 'RGB'), 'y': (8, 2, 'RGB')}, [{'input_path': 'x', 'output_path': 'z', 'operations': [('scale', 2)]}, {'input_path': 'y', 'output_path': 'z', 'operations': [('grayscale',)]}], 2)

Expected Output: ([('ok', 'z', (8, 8, 'RGB')), ('ok', 'z', (8, 2, 'GRAY'))], {'x': (4, 4, 'RGB'), 'y': (8, 2, 'RGB'), 'z': (8, 2, 'GRAY')})

Explanation: Both jobs succeed, but the later one wins for `z` when outputs are committed in input order.

Input: ({'p': (1, 1, 'RGB'), 'q': (2, 3, 'RGB')}, [{'input_path': 'p', 'output_path': 'bad', 'operations': [('scale', 0.4)]}, {'input_path': 'q', 'output_path': 'good', 'operations': [('resize', 5, 1), ('grayscale',)]}], 3)

Expected Output: ([('error', 'invalid_dimensions'), ('ok', 'good', (5, 1, 'GRAY'))], {'p': (1, 1, 'RGB'), 'q': (2, 3, 'RGB'), 'good': (5, 1, 'GRAY')})

Explanation: One job fails, but the other still succeeds.

Input: ({'img': (2, 2, 'GRAY')}, [], 4)

Expected Output: ([], {'img': (2, 2, 'GRAY')})

Explanation: Empty batch edge case.

Hints

  1. Carry each job's original index so you can rebuild results in input order even if workers finish out of order.
  2. Avoid shared mutable state across processes; let workers compute results independently, then merge successful writes in the parent process.
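Both hints can be sketched with `concurrent.futures.ProcessPoolExecutor`. This is a sketch under stated assumptions rather than a definitive solution: `process_batch`, `_run_job`, and `demo_transform` are hypothetical names, the transform is injected as a callable that raises `ValueError` with the failure code, and the `executor_cls` parameter exists only so the pool type can be swapped (for example for a thread pool in tests). The chunk size formula is a rough heuristic, not a tuned value.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def demo_transform(image, operations):
    """Hypothetical stand-in: supports only grayscale and resize."""
    width, height, mode = image
    for op in operations:
        if op == ("grayscale",):
            mode = "GRAY"
        elif len(op) == 3 and op[0] == "resize":
            width, height = op[1], op[2]
        else:
            raise ValueError("invalid_operation")
    return (width, height, mode)


def _run_job(payload):
    """Worker: must be a top-level function so it can be pickled."""
    index, image, job, transform = payload
    if image is None:
        return index, ("error", "missing_input")
    try:
        output = transform(image, job["operations"])
    except ValueError as exc:
        # Failures become ordinary records, so one bad job cannot stop the pool.
        return index, ("error", str(exc))
    return index, ("ok", job["output_path"], output)


def process_batch(files, jobs, transform, max_workers=None,
                  executor_cls=ProcessPoolExecutor):
    """Return (results, final_files); jobs read only the original snapshot."""
    if not jobs:
        return [], dict(files)
    cpu = os.cpu_count() or 1
    workers = max(1, min(max_workers or cpu, len(jobs)))
    # Send each worker only the metadata it needs, not the whole file map.
    payloads = [(i, files.get(job["input_path"]), job, transform)
                for i, job in enumerate(jobs)]
    # Batch small jobs to reduce inter-process overhead (heuristic).
    chunksize = max(1, len(jobs) // (workers * 4))
    results = [None] * len(jobs)
    with executor_cls(max_workers=workers) as pool:
        for index, record in pool.map(_run_job, payloads, chunksize=chunksize):
            results[index] = record  # place by original index
    # Commit in input order in the parent, so the later job wins on conflicts.
    final_files = dict(files)
    for record in results:
        if record[0] == "ok":
            final_files[record[1]] = record[2]
    return results, final_files
```

The transform passed in must itself be picklable, which in practice means a module-level function rather than a lambda. On platforms that start workers by spawning a fresh interpreter, the call to `process_batch` should sit under an `if __name__ == "__main__":` guard. With real image files, the same shape applies, but workers would receive paths and load pixels themselves so that large buffers are never pickled between processes.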
Last updated: Apr 19, 2026

