Design a system to process m input images with n pipelines, producing m×n outputs.
- Pipelines are sequences of image operations (resize/rotate/filter/etc.); a minimal sketch of this representation follows the list.
- Users can submit jobs (a set of images plus one or more pipelines).
- The system must run at large scale (many images, many jobs) with reasonable cost and reliability.
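Before the questions, here is one minimal way to pin down what a pipeline is: an ordered list of image-to-image operations. The `Op`/`Pipeline`/`run_pipeline` names and the use of Pillow are illustrative assumptions, not part of the problem statement.

```python
# Sketch only: one possible pipeline representation, assuming Pillow for the ops.
from dataclasses import dataclass
from typing import Callable
from PIL import Image, ImageFilter

# An operation maps an image to an image, so pipelines compose by ordering.
Op = Callable[[Image.Image], Image.Image]

@dataclass
class Pipeline:
    name: str
    ops: list[Op]  # applied left to right

def run_pipeline(img: Image.Image, pipeline: Pipeline) -> Image.Image:
    for op in pipeline.ops:
        img = op(img)
    return img

# Two example pipelines; running both over one decoded image yields 2 of the
# m*n outputs while reading the source image only once.
thumbnail = Pipeline("thumbnail", [lambda im: im.resize((128, 128))])
rotate_blur = Pipeline("rotate_blur", [lambda im: im.rotate(90),
                                       lambda im: im.filter(ImageFilter.BLUR)])
```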
Answer the following:
- What components would you build (APIs, storage, queues, workers, metadata DB)?
- How would you parallelize work and avoid waste (e.g., avoid re-reading the same image repeatedly)? See the fan-out sketch after this list.
- How would you ensure fault tolerance, retries, idempotency, and observability? See the idempotency sketch after this list.
- What are the key bottlenecks and optimizations (CPU vs I/O, caching, batching, intermediate results)?
- How would you justify your scaling approach (threads vs processes vs distributed workers; serverless vs containers)?
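For the parallelization question, a minimal single-machine sketch: make the unit of work one image rather than one (image, pipeline) pair, so each image is read once and shared across all n pipelines. The stand-in byte-level "operations", file paths, and pool size are assumptions for illustration.

```python
# Minimal fan-out sketch, assuming images fit in worker memory. The work unit
# is one image, so each image is read once and shared across all n pipelines
# (m reads total instead of m*n). All names here are illustrative stand-ins.
from concurrent.futures import ThreadPoolExecutor

PIPELINES = {"thumb": lambda data: data + b"|thumb",  # stand-in operations; real
             "blur":  lambda data: data + b"|blur"}   # ones would transform pixels

def read_image(path: str) -> bytes:
    with open(path, "rb") as f:
        return f.read()           # one read per image, not one per task

def process_image(path: str) -> dict[str, bytes]:
    data = read_image(path)       # read once, fanned out to all n pipelines
    return {name: run(data) for name, run in PIPELINES.items()}

def fan_out(paths: list[str]) -> list[dict[str, bytes]]:
    # Threads cover the I/O-bound reads; for CPU-bound operations, a
    # ProcessPoolExecutor sidesteps the GIL at the cost of pickling overhead.
    with ThreadPoolExecutor(max_workers=8) as pool:
        return list(pool.map(process_image, paths))
```

The same grouping carries over to a distributed design: enqueue "image X with pipelines [a, b]" rather than m×n single-pair messages, or cache decoded images so co-scheduled tasks can share them.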
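For the idempotency question, a minimal sketch assuming an at-least-once queue and an object store with a read-before-write check; `ObjectStore`, `TransientError`, and `process_task` are in-memory stand-ins, not a specific vendor API.

```python
# Sketch only: deterministic output keys make retries and duplicate
# deliveries safe. The store and error types below are illustrative stand-ins.
import hashlib

class TransientError(Exception):
    """Retryable failure, e.g. a storage timeout."""

class ObjectStore:
    """In-memory stand-in for an object store with exists/put."""
    def __init__(self):
        self._objects: dict[str, bytes] = {}
    def exists(self, key: str) -> bool:
        return key in self._objects
    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

def task_key(image_id: str, pipeline_id: str) -> str:
    # Same (image, pipeline) pair -> same key, so a retry overwrites
    # (or skips) its own output instead of creating a duplicate.
    return hashlib.sha256(f"{image_id}:{pipeline_id}".encode()).hexdigest()

def process_task(task: dict) -> bytes:
    # Placeholder for "decode the image and run the pipeline".
    return f"output:{task['image_id']}:{task['pipeline_id']}".encode()

def handle_task(task: dict, store: ObjectStore, max_attempts: int = 3) -> bool:
    """Returns True when the output exists; False means dead-letter the task."""
    key = task_key(task["image_id"], task["pipeline_id"])
    if store.exists(key):
        return True  # duplicate delivery: output already written, skip the work
    for attempt in range(max_attempts):
        try:
            store.put(key, process_task(task))
            return True
        except TransientError:
            continue  # transient failure: retry up to max_attempts
    return False
```

Because the output key is a pure function of (image_id, pipeline_id), a worker that crashes after writing but before acknowledging the message simply finds the object on redelivery and acknowledges without redoing the work.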