Image Object Detection System — Requirements and End-to-End Architecture
Context
Design a production system that accepts user-uploaded images and returns detected objects with bounding boxes and confidence scores. The system must support both real-time (online) inference and high-throughput batch processing.
Tasks
- Clarify and/or state assumptions for:
  - Latency targets for online inference (e.g., p50/p95/p99 budgets) vs. throughput and SLA for batch processing.
  - Expected QPS (average/peak), concurrency, and traffic patterns (diurnal/seasonal).
  - Payload sizes (typical and max image size, formats) and response sizes.
  - Accuracy metrics and acceptance criteria (e.g., mAP, precision/recall, calibration).
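
To show how the latency and QPS assumptions above feed into sizing, here is a minimal sketch that converts a peak request rate into a GPU replica count. The function name, the 60% utilization headroom, and the example figures (500 QPS peak, batches of 8 images completing in 40 ms) are illustrative assumptions, not measured values.

```python
import math


def gpu_replicas_needed(
    peak_qps: float,
    batch_size: int,
    batch_latency_s: float,
    target_utilization: float = 0.6,  # assumed headroom for bursts and failover
) -> int:
    """Estimate how many GPU replicas are needed to absorb peak traffic.

    Assumes each replica runs one batch at a time, so its raw throughput
    is batch_size / batch_latency_s images per second.
    """
    if peak_qps < 0 or batch_size <= 0 or batch_latency_s <= 0:
        raise ValueError("peak_qps must be >= 0; batch size and latency must be > 0")
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")

    per_replica_qps = batch_size / batch_latency_s
    usable_qps = per_replica_qps * target_utilization
    return math.ceil(peak_qps / usable_qps)
```

With the example figures, one replica sustains about 200 images/s, of which 120 are budgeted at 60% utilization, so `gpu_replicas_needed(500, 8, 0.04)` estimates 5 replicas. A real estimate would also need to account for preprocessing time and queueing delay against the p99 budget.
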
- Propose an end-to-end architecture that covers:
  - Ingestion: API/gateway, authN/authZ, request validation, WAF, rate limiting, idempotency.
  - Storage: object store for images, metadata database for jobs/results, schema.
  - Preprocessing: decode/resize/normalize, where it runs (CPU/GPU), and optimization.
  - Model serving: GPU-backed inference, autoscaling, batching/concurrency, model format.
  - Caching: CDN and application-level result cache (e.g., by image hash), cache invalidation.
  - Asynchronous workflows: queue/stream for batch or slow paths, job orchestration, webhooks/polling.
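
The result cache keyed by image hash can be sketched as follows. This is one possible approach under stated assumptions: the key format, the `ResultCache` class, and the `detect` callable are hypothetical names, and a plain dict stands in for a shared cache service. Including the model version in the key is one simple way to handle invalidation, since results from an older model are never returned after a rollout.

```python
import hashlib
from typing import Callable, Dict, List

Detections = List[dict]  # e.g. [{"label": "cat", "box": [x, y, w, h], "score": 0.9}]


def result_cache_key(image_bytes: bytes, model_version: str) -> str:
    """Content-addressed key: identical bytes + same model -> same key."""
    digest = hashlib.sha256(image_bytes).hexdigest()
    return f"detections:{model_version}:{digest}"


class ResultCache:
    """In-memory stand-in for an application-level result cache."""

    def __init__(self) -> None:
        self._store: Dict[str, Detections] = {}

    def get_or_compute(
        self,
        image_bytes: bytes,
        model_version: str,
        detect: Callable[[bytes], Detections],  # hypothetical inference call
    ) -> Detections:
        key = result_cache_key(image_bytes, model_version)
        if key not in self._store:
            # Cache miss: run inference once and remember the result.
            self._store[key] = detect(image_bytes)
        return self._store[key]
```

Note that hashing raw bytes only deduplicates byte-identical uploads; the same picture re-encoded or resized produces a different key.
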
- Explain:
  - Model versioning and rollout: registry, A/B testing or shadow traffic, guardrails, rollback.
  - Monitoring/observability: latency/throughput SLOs, GPU utilization, queue lag, errors, drift and data quality.
  - Failure handling and resiliency: timeouts, retries, backpressure, DLQ, degraded modes.
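
As an illustration of the retry and DLQ items above, the sketch below retries a failing job with exponential backoff and full jitter, then parks it in a dead-letter list once attempts are exhausted. The function name, the default limits, and the list used as a DLQ are assumptions for illustration; a real system would typically use the queue service's own redelivery and dead-letter features.

```python
import random
import time
from typing import Any, Callable, List


def process_with_retries(
    job: Any,
    handler: Callable[[Any], None],
    dead_letters: List[dict],
    max_attempts: int = 3,          # assumed retry budget
    base_delay_s: float = 0.1,      # assumed initial backoff
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Run handler(job); return True on success, False if sent to the DLQ."""
    for attempt in range(1, max_attempts + 1):
        try:
            handler(job)
            return True
        except Exception as exc:
            if attempt == max_attempts:
                # Out of attempts: record the failure for later inspection.
                dead_letters.append(
                    {"job": job, "error": repr(exc), "attempts": attempt}
                )
                return False
            # Exponential backoff with full jitter to avoid retry storms.
            max_delay = base_delay_s * 2 ** (attempt - 1)
            sleep(random.uniform(0, max_delay))
    return False
```

Passing `sleep` as a parameter keeps the function testable without real delays. Retrying is only safe if the handler is idempotent, which is why the ingestion layer lists idempotency as a requirement.
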
- Describe the offline ML pipeline:
  - Data labeling and management, augmentation, experiment tracking, evaluation.
  - Deployment strategy: blue/green or canary, progressive delivery.
  - Cost/performance trade-offs: GPU selection, batching, quantization, spot vs. on-demand, caching.
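
For the canary and progressive delivery item, one common way to split traffic is deterministic hash bucketing, sketched below. The function name and the choice of a tenant or request ID as the routing key are assumptions; the sketch does not cover guardrail checks or automatic rollback.

```python
import hashlib


def routes_to_canary(routing_key: str, canary_percent: float) -> bool:
    """Return True if this key should be served by the canary model version.

    The key is hashed into one of 10,000 buckets, so the same key always
    gets the same answer, and raising canary_percent only adds keys to the
    canary group (it never moves a key back to the stable version).
    """
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    digest = hashlib.sha256(routing_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return bucket < canary_percent * 100
```

Stepping `canary_percent` through values such as 1, 10, 50, and 100 gives a progressive rollout, and setting it back to 0 acts as a rollback.
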
- Address privacy, security, and compliance considerations:
  - Data retention/deletion, encryption, tenant isolation, least-privilege access, logging minimization, and relevant compliance controls.