This question evaluates system design and machine-learning engineering skills for building a multi-tenant image object-detection service. It falls in the ML System Design category and covers distributed systems, MLOps, model serving, data and model versioning, and computer vision, with an emphasis on practical application backed by high-level architectural reasoning. It is commonly asked to assess how well a candidate balances trade-offs among scalability, latency, accuracy, cost, observability, privacy, and operational resilience while demonstrating an understanding of API flows, performance engineering, monitoring, deployment strategies, and model evaluation.
You are designing a multi-tenant cloud service that ingests user images, runs object detection, and serves results via APIs to web/mobile clients and internal services. The system must support both synchronous (request-response) and asynchronous (batch) inference, and be safe, observable, and cost-efficient.
Assume typical production constraints (public cloud, containerized services, GPU-backed model serving) and that you must propose reasonable SLAs/SLOs and scale assumptions. State any additional assumptions you make.
Design the system and cover the following:
Deliver a concise but complete design, with diagrams-as-text or clear component lists, and include small numeric examples where helpful.
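As an illustration of the kind of small numeric example the prompt asks for, here is a minimal back-of-envelope sketch for sizing the GPU fleet behind the synchronous path. Everything in it is an assumption made for illustration and not part of the question: the function name `gpus_needed`, the peak load of 500 requests per second, the per-GPU throughput of 100 images per second, and the 60% target utilization (headroom for bursts and failover).

```python
import math


def gpus_needed(peak_qps: float, per_gpu_qps: float,
                target_utilization: float = 0.6) -> int:
    """Estimate how many GPUs are needed to serve peak_qps.

    per_gpu_qps is the measured throughput of one GPU at the chosen
    batch size; target_utilization leaves headroom for traffic bursts.
    All inputs are hypothetical planning numbers, not measurements.
    """
    if peak_qps < 0 or per_gpu_qps <= 0 or not 0 < target_utilization <= 1:
        raise ValueError("invalid capacity inputs")
    # Usable throughput per GPU is its raw throughput scaled by utilization.
    usable_qps = per_gpu_qps * target_utilization
    return math.ceil(peak_qps / usable_qps)


if __name__ == "__main__":
    # Assumed scale: 500 req/s at peak, 100 images/s per GPU, 60% utilization.
    print(gpus_needed(peak_qps=500.0, per_gpu_qps=100.0))
```

With these assumed inputs, each GPU contributes 60 usable requests per second, so 500 / 60 ≈ 8.3 rounds up to 9 GPUs. A real answer would replace the inputs with benchmarked throughput for the chosen model and batch size, and would size the asynchronous batch path separately.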