PracHub

Quick Overview

This question evaluates a candidate's competency in image processing operations and parallel/concurrent programming, covering basic transformations (grayscale, scale, resize) and considerations for CPU-bound batch processing.

Implement a Batch Image Processor

Company: Anthropic

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Technical Screen

Implement a program that applies basic image transformations to a batch of images. You are given a list of input image paths and, for each image, a sequence of operations to apply in order.

Support the following operations:

1. **grayscale**: convert the image to grayscale.
2. **scale(factor)**: multiply both width and height by a positive floating-point scale factor.
3. **resize(width, height)**: resize the image to the exact target dimensions.

### Part 1

Write a function that processes a small batch of images correctly and saves the transformed results.

Example function shape:

`process_images(tasks: list[ImageTask], output_dir: str) -> list[str]`

Where each `ImageTask` contains:

- `input_path`: path to the source image
- `operations`: ordered list of operations such as `grayscale`, `scale(0.5)`, or `resize(200, 300)`

The function should return the output paths in the same order as the input tasks.

### Part 2

Now assume the images are very large and the batch size is also large. The interviewer asks you to improve performance on a multi-core machine. Update your approach so that:

- multiple images can be processed in parallel,
- output ordering is preserved,
- a failure in one task does not corrupt other results,
- the design is appropriate for CPU-heavy image processing.

You may use a standard Python imaging library such as Pillow.
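
One possible shape for the Part 2 design is sketched below; it is an illustration under stated assumptions, not a reference solution. It assumes each task is an `(input_path, operations)` pair with operations written as tuples such as `('scale', 0.5)`, and that outputs are named by inserting `_out` before the extension. The names `transform_one` and `run_batch` and the `executor_factory` parameter are hypothetical. Futures are collected in submission order so results line up with the input, and each task's exception is captured separately so one failure does not abort the batch.

```python
import math
import os
from concurrent.futures import ProcessPoolExecutor


def transform_one(input_path, operations, output_dir):
    """Apply the operations to one image with Pillow and return the output path."""
    from PIL import Image  # Pillow (third-party), imported only where it is used

    stem, ext = os.path.splitext(os.path.basename(input_path))
    output_path = os.path.join(output_dir, f"{stem}_out{ext}")  # naming is an assumption
    with Image.open(input_path) as source:
        img = source
        for op in operations:
            if op[0] == "grayscale":
                img = img.convert("L")
            elif op[0] == "scale" and op[1] > 0:
                width, height = img.size
                img = img.resize((max(1, math.floor(width * op[1])),
                                  max(1, math.floor(height * op[1]))))
            elif op[0] == "resize" and op[1] > 0 and op[2] > 0:
                img = img.resize((op[1], op[2]))
            else:
                raise ValueError(f"invalid operation: {op!r}")
        img.save(output_path)
    return output_path


def run_batch(tasks, output_dir, worker=transform_one,
              executor_factory=ProcessPoolExecutor):
    """Return one (output_path, error) pair per task, in input order."""
    results = []
    with executor_factory() as pool:
        futures = [pool.submit(worker, path, ops, output_dir) for path, ops in tasks]
        for future in futures:  # submission order, not completion order
            try:
                results.append((future.result(), None))
            except Exception as exc:  # one failed task must not abort the batch
                results.append((None, str(exc)))
    return results
```

A process pool is used here because separate processes can run CPU-bound Python work on several cores at once. On platforms that start workers by spawning a fresh interpreter, `run_batch` should be called from under an `if __name__ == "__main__":` guard. Passing a different `executor_factory` (for example a thread pool) makes the ordering and error handling easy to exercise without real images.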

Part 1: Process a Small Batch of Image Tasks

You are given a small batch of image-processing tasks. Because an online judge cannot read or write real image files, each image is represented by metadata: its path, width, height, color mode, and an ordered list of operations. Apply the operations in order and return the metadata that would be saved for each output image.

Supported operations:

  • ('grayscale',): change the mode to 'L'.
  • ('scale', factor): multiply width and height by factor, take the floor of each product, then clamp each dimension to at least 1.
  • ('resize', new_width, new_height): set the exact dimensions.

The output filename is built from the input filename by inserting '_out' before the extension; if there is no extension, '_out' is appended. As the examples show, the output path is output_dir joined to that filename with '/', so the input's own directory is dropped, and each result is a tuple (output_path, width, height, mode). Return results in the same order as the input tasks.
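
The filename rule can be sketched as a small helper. The name `build_output_path` is hypothetical, and `posixpath` is used so the result always contains '/' separators as in the expected outputs. Treating only the last dot-suffix as the extension is an assumption; the problem statement does not say how names with several dots are handled.

```python
import posixpath


def build_output_path(input_path: str, output_dir: str) -> str:
    """Insert '_out' before the extension and place the file in output_dir."""
    filename = posixpath.basename(input_path)   # drop the input directory
    stem, ext = posixpath.splitext(filename)    # ext is '' when there is no extension
    return posixpath.join(output_dir, f"{stem}_out{ext}")
```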

Constraints

  • 0 <= len(tasks) <= 10^4
  • 1 <= width, height <= 10^6
  • Each task has 0 to 50 operations
  • In this part, all operations are valid: scale factors are positive and resize dimensions are positive integers

Examples

Input: ([{'input_path': 'imgs/cat.png', 'width': 100, 'height': 80, 'mode': 'RGB', 'operations': [('grayscale',), ('scale', 0.5)]}, {'input_path': 'dog.jpg', 'width': 20, 'height': 30, 'mode': 'RGB', 'operations': [('resize', 15, 15)]}], 'out')

Expected Output: [('out/cat_out.png', 50, 40, 'L'), ('out/dog_out.jpg', 15, 15, 'RGB')]

Explanation: The first image becomes grayscale and is scaled down. The second is resized exactly to 15 by 15.

Input: ([{'input_path': 'bird', 'width': 7, 'height': 9, 'mode': 'L', 'operations': []}], 'results')

Expected Output: [('results/bird_out', 7, 9, 'L')]

Explanation: With no operations, the metadata is unchanged. Since there is no extension, '_out' is appended directly.

Input: ([], 'out')

Expected Output: []

Explanation: Edge case: an empty batch produces an empty result list.

Input: ([{'input_path': 'raw/a.bmp', 'width': 3, 'height': 3, 'mode': 'RGB', 'operations': [('scale', 0.2), ('grayscale',), ('resize', 2, 5)]}], 'done')

Expected Output: [('done/a_out.bmp', 2, 5, 'L')]

Explanation: Scaling 3 by 0.2 gives 0.6, floor makes it 0, and clamping makes it 1. After grayscale, resize sets the final size to 2 by 5.

Hints

  1. Process each image independently and simulate the operations one by one.
  2. For scale, apply floor after multiplication, then clamp each dimension with max(1, value).
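
The two hints can be sketched as a single function that walks the operation list and updates the metadata. The name `apply_operations` is hypothetical, and the sketch relies on the Part 1 guarantee that every operation is valid, so it performs no error checking.

```python
import math


def apply_operations(width, height, mode, operations):
    """Simulate the operations in order and return the final (width, height, mode)."""
    for op in operations:
        name = op[0]
        if name == "grayscale":
            mode = "L"
        elif name == "scale":
            factor = op[1]
            # Floor after multiplying, then clamp so a dimension never drops below 1.
            width = max(1, math.floor(width * factor))
            height = max(1, math.floor(height * factor))
        elif name == "resize":
            width, height = op[1], op[2]
    return width, height, mode
```

Calling it once per task, in input order, keeps each image independent of the others.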

Part 2: Simulate an Order-Preserving Parallel Image Processor

Now model the core logic of a high-performance batch image processor for a multi-core machine. Real file I/O and actual parallel execution are not required; instead, simulate what an order-preserving process pool would produce.

Each image task is independent, and a fixed number of workers process tasks in parallel. Tasks are submitted in input order, and each new task starts on the worker that becomes available first; if several workers are free at the same time, choose the smallest worker id. As the examples show, all workers are free at time 0. The returned list must preserve input order even though tasks finish at different times, and a task failure must not affect other tasks.

Use the same image metadata and operations as Part 1, but now invalid operations are possible. Each operation has a processing cost:

  • grayscale costs current_width * current_height.
  • scale costs current_width * current_height, measured before the scale is applied.
  • resize costs new_width * new_height.

If scale has factor <= 0, resize has a non-positive target dimension, or the operation name is unknown, the task fails immediately before that operation and no output file is produced. The cost of operations completed before the failure still counts toward the task's duration. Successful tasks generate output paths using the same '_out' filename rule as Part 1.

As the examples show, each result is a 7-tuple: ('ok', finish_time, output_path, width, height, mode, None) for a success, or ('error', finish_time, None, None, None, None, message) for a failure, where finish_time is the time at which the task's worker finishes or abandons it.
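
The cost and failure rules for a single task can be sketched as follows. The name `simulate_task` and its return shape are hypothetical. The messages 'invalid scale' and 'invalid resize' are taken from the examples; the text used for an unknown operation name is an assumption, because no example shows it.

```python
import math


def simulate_task(width, height, mode, operations):
    """Return (duration, (width, height, mode), None) on success,
    or (duration, None, message) if an operation is invalid."""
    duration = 0
    for op in operations:
        name = op[0]
        if name == "grayscale":
            duration += width * height
            mode = "L"
        elif name == "scale":
            factor = op[1]
            if factor <= 0:
                return duration, None, "invalid scale"
            duration += width * height  # cost uses the size before scaling
            width = max(1, math.floor(width * factor))
            height = max(1, math.floor(height * factor))
        elif name == "resize":
            new_width, new_height = op[1], op[2]
            if new_width <= 0 or new_height <= 0:
                return duration, None, "invalid resize"
            duration += new_width * new_height
            width, height = new_width, new_height
        else:
            # Assumed message: the examples do not show an unknown operation.
            return duration, None, "unknown operation"
    return duration, (width, height, mode), None
```

The duration returned for a failed task includes the operations that ran before the failure, which is what lets a later task on the same worker start at the right time.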

Constraints

  • 0 <= len(tasks) <= 2 * 10^5
  • 1 <= workers <= 10^5
  • 1 <= width, height <= 10^6
  • Each task has 0 to 50 operations
  • In this part, operations may be invalid and must be reported per task instead of stopping the whole batch

Examples

Input: ([{'input_path': 'a.png', 'width': 10, 'height': 10, 'mode': 'RGB', 'operations': [('grayscale',), ('scale', 0.5)]}, {'input_path': 'b.jpg', 'width': 4, 'height': 5, 'mode': 'RGB', 'operations': [('resize', 2, 3)]}, {'input_path': 'c.bmp', 'width': 6, 'height': 6, 'mode': 'L', 'operations': []}], 'out', 2)

Expected Output: [('ok', 200, 'out/a_out.png', 5, 5, 'L', None), ('ok', 6, 'out/b_out.jpg', 2, 3, 'RGB', None), ('ok', 6, 'out/c_out.bmp', 6, 6, 'L', None)]

Explanation: Task 1 takes 100 + 100 = 200 time units on worker 0. Task 2 takes 2 * 3 = 6 on worker 1. Task 3 has no operations, so it starts when worker 1 becomes free at time 6 and finishes immediately at time 6.

Input: ([{'input_path': 'x.png', 'width': 8, 'height': 8, 'mode': 'RGB', 'operations': [('scale', 0.25)]}, {'input_path': 'y.png', 'width': 5, 'height': 5, 'mode': 'RGB', 'operations': [('resize', 0, 4)]}, {'input_path': 'z.png', 'width': 3, 'height': 7, 'mode': 'RGB', 'operations': [('grayscale',)]}], 'res', 2)

Expected Output: [('ok', 64, 'res/x_out.png', 2, 2, 'RGB', None), ('error', 0, None, None, None, None, 'invalid resize'), ('ok', 21, 'res/z_out.png', 3, 7, 'L', None)]

Explanation: The second task fails immediately, so its worker is free again at time 0 and can start the third task without affecting the first task.

Input: ([{'input_path': 'm.png', 'width': 2, 'height': 2, 'mode': 'RGB', 'operations': [('resize', 4, 4), ('scale', 0.5)]}, {'input_path': 'n', 'width': 1, 'height': 9, 'mode': 'RGB', 'operations': [('grayscale',), ('scale', 0.1)]}], 'out', 1)

Expected Output: [('ok', 32, 'out/m_out.png', 2, 2, 'RGB', None), ('ok', 50, 'out/n_out', 1, 1, 'L', None)]

Explanation: With one worker, tasks run sequentially. The first costs 16 + 16 = 32. The second costs 9 + 9 = 18, so it starts at time 32 and finishes at time 50.

Input: ([{'input_path': 'p.png', 'width': 5, 'height': 5, 'mode': 'RGB', 'operations': [('grayscale',), ('scale', -1.0)]}, {'input_path': 'q.png', 'width': 10, 'height': 1, 'mode': 'RGB', 'operations': [('resize', 1, 1)]}], 'tmp', 1)

Expected Output: [('error', 25, None, None, None, None, 'invalid scale'), ('ok', 26, 'tmp/q_out.png', 1, 1, 'RGB', None)]

Explanation: The first task spends 25 time units on grayscale, then fails before the invalid scale. The next task starts only after time 25 because there is just one worker.

Input: ([], 'out', 4)

Expected Output: []

Explanation: Edge case: no tasks means no results.

Hints

  1. First write a helper that simulates one task: compute its duration, final metadata, or failure.
  2. Then use a min-heap of (available_time, worker_id) to assign each next task to the earliest available worker in O(log workers).
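
The heap-based assignment in hint 2 can be sketched on its own, given the per-task durations already computed by a helper such as the one in hint 1. The name `assign_finish_times` is hypothetical, and worker ids are assumed to start at 0, which matches the first example's explanation.

```python
import heapq


def assign_finish_times(durations, workers):
    """Return each task's finish time when tasks start in input order
    on the earliest-available worker (ties go to the smaller worker id)."""
    # At most len(durations) workers can ever be needed, so cap the heap size.
    pool = [(0, worker_id) for worker_id in range(min(workers, len(durations)))]
    heapq.heapify(pool)
    finish_times = []
    for duration in durations:
        # Tuples compare by time first, then by worker id, giving the tie-break rule.
        available_at, worker_id = heapq.heappop(pool)
        finish = available_at + duration
        finish_times.append(finish)
        heapq.heappush(pool, (finish, worker_id))
    return finish_times
```

Because finish times are appended in input order, output ordering is preserved without any sorting at the end.
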
Last updated: Apr 19, 2026

