System Design: Image Object Detection Service
Design an image detection system that processes user-uploaded images and returns detected objects with bounding boxes and confidence scores.
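To make the expected output concrete, here is a minimal sketch of one possible response shape. The class and field names (`BoundingBox`, `Detection`, `DetectionResponse`, `image_id`, `model_version`) and the pixel-coordinate corner convention are illustrative assumptions, not part of the prompt.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class BoundingBox:
    # Assumed convention: pixel coordinates of the top-left and bottom-right corners.
    x_min: float
    y_min: float
    x_max: float
    y_max: float


@dataclass
class Detection:
    label: str          # predicted class name
    confidence: float   # model score in [0, 1]
    box: BoundingBox


@dataclass
class DetectionResponse:
    image_id: str        # hypothetical identifier assigned at upload
    model_version: str   # hypothetical tag of the model that produced the result
    detections: List[Detection] = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() recursively converts nested dataclasses and lists.
        return json.dumps(asdict(self))
```

A candidate could equally propose normalized coordinates or a center/width/height box; the point is to fix one convention early so storage, caching, and evaluation agree on it.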
Clarify and Propose Requirements
- Online inference SLOs: propose latency targets (p50/p90/p95), availability, and error budgets.
- Batch throughput: target items/hour and max end-to-end time for large jobs.
- Expected traffic: peak and average QPS, daily active users, regional traffic patterns.
- Payloads: supported formats (JPEG/PNG/WebP/HEIC/GIF), max image size (MB, dimensions), typical sizes, and content-type constraints.
- Accuracy metrics: mAP@0.5 and mAP@[0.5:0.95], per-class precision/recall, calibration (ECE), and confidence thresholds.
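The mAP thresholds in the accuracy requirement refer to intersection over union (IoU) between a predicted box and a ground-truth box: at mAP@0.5, a prediction can count as a match only if its IoU with a ground-truth box of the same class is at least 0.5. A minimal sketch follows, assuming boxes are given as `(x_min, y_min, x_max, y_max)` tuples; the function names `iou` and `is_match` are hypothetical.

```python
from typing import Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Width and height of the overlap rectangle, clamped at zero when disjoint.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    # Degenerate (zero-area) inputs produce a zero union; report no overlap.
    return inter / union if union > 0 else 0.0


def is_match(pred: Box, truth: Box, threshold: float = 0.5) -> bool:
    """True if the predicted box overlaps the ground truth enough at this threshold."""
    return iou(pred, truth) >= threshold
```

mAP@[0.5:0.95] averages the same evaluation over IoU thresholds from 0.5 to 0.95 in steps of 0.05, so it rewards tighter localization than mAP@0.5 alone.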
Architecture Scope
Propose an end-to-end architecture covering:
- Ingestion: API/gateway, authentication/authorization, rate limiting, request validation.
- Storage: object store for images, metadata database for detections and jobs.
- Preprocessing: format normalization, resizing, EXIF handling, validation, and augmentation (offline training only).
- Model serving: GPU-backed inference, dynamic batching, autoscaling.
- Caching: result caching keyed by content-hash and model version.
- Asynchronous workflows: job API, queues, workers for batch.
- Model lifecycle: versioning, A/B testing, shadow traffic.
- Monitoring/observability: latency, throughput, GPU utilization, data drift, accuracy.
- Failure handling and retries.
- Offline training pipeline: data labeling, augmentation, experiments, evaluation.
- Deployment strategy: blue/green or canary.
- Cost/performance trade-offs.
- Privacy, security, and compliance.
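As one illustration of the caching item above, the sketch below keys results by a SHA-256 hash of the image bytes combined with the model version, so that re-uploads of identical bytes skip inference while a model upgrade naturally misses the old entries. The in-memory dictionary stands in for whatever cache a real design would use, and `cache_key`, `ResultCache`, and `get_or_compute` are hypothetical names.

```python
import hashlib
from typing import Callable, Dict, List


def cache_key(image_bytes: bytes, model_version: str) -> str:
    """Build a key from the model version and a hash of the raw image bytes."""
    digest = hashlib.sha256(image_bytes).hexdigest()
    return f"{model_version}:{digest}"


class ResultCache:
    """In-memory stand-in for a shared result cache (assumption for illustration)."""

    def __init__(self) -> None:
        self._store: Dict[str, List[dict]] = {}
        self.hits = 0
        self.misses = 0

    def get_or_compute(
        self,
        image_bytes: bytes,
        model_version: str,
        infer: Callable[[bytes], List[dict]],
    ) -> List[dict]:
        key = cache_key(image_bytes, model_version)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        # Cache miss: run inference once and remember the result.
        self.misses += 1
        result = infer(image_bytes)
        self._store[key] = result
        return result
```

Hashing the raw bytes only catches byte-identical uploads; the same picture re-encoded at a different quality would miss. Whether to hash before or after format normalization is a trade-off worth raising under cost/performance.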