Design a REST API for Image Inference with Grad-CAM
You are designing a public REST API for an image-inference service that accepts large images and returns both class probabilities and Grad-CAM heatmaps. Assume this is a multi-tenant service with both synchronous and asynchronous workflows and that clients may submit images via URL or upload.
Specify and justify the following:
Functional Scope
- Accept large images; return top-K class probabilities and Grad-CAM heatmaps.
- Support single-image sync requests and batch/async processing.
API Surface
- Endpoint paths and HTTP verbs for:
  - Synchronous inference
  - Asynchronous jobs (submit, poll status, cancel)
  - Batch processing and listing/paginating job results
  - Webhook registration/usage (or per-request callback)
  - Ancillary endpoints (e.g., models listing, health)
- Request/response schemas (include fields, types, and examples)
- Idempotency strategy for create/submit endpoints
- Batching semantics and pagination scheme
- Async processing with job IDs and webhooks
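To make the idempotency item concrete, here is a minimal sketch of one common approach: the client sends an idempotency key with each submit, and the server replays the stored response when the same tenant repeats the same key. Everything named here is an assumption for illustration and not part of the prompt: the `IdempotencyStore` class, the `job_id` and `status` fields, the `idempotency_key_reused` error code, and the choice of 409 for a key reused with a different payload. A real service would back this with a shared store and a TTL instead of a dictionary.

```python
import hashlib
import json
import uuid
from typing import Dict, Tuple


class IdempotencyStore:
    """In-memory stand-in for a shared store (hypothetical, for illustration)."""

    def __init__(self) -> None:
        # (tenant_id, idempotency_key) -> (request fingerprint, stored response)
        self._records: Dict[Tuple[str, str], Tuple[str, dict]] = {}

    def submit(self, tenant_id: str, idempotency_key: str, body: dict) -> Tuple[int, dict]:
        # Fingerprint the payload so a reused key with a different body is detected.
        fingerprint = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode("utf-8")
        ).hexdigest()
        record_key = (tenant_id, idempotency_key)  # keys are scoped per tenant

        existing = self._records.get(record_key)
        if existing is not None:
            stored_fingerprint, stored_response = existing
            if stored_fingerprint != fingerprint:
                # Assumed policy: same key, different payload is a client error.
                return 409, {"error": {"code": "idempotency_key_reused"}}
            # Replay: return the original response without creating a new job.
            return 202, stored_response

        response = {"job_id": "job_" + uuid.uuid4().hex, "status": "queued"}
        self._records[record_key] = (fingerprint, response)
        return 202, response
```

Scoping the key by tenant keeps one tenant's keys from colliding with another's, and fingerprinting the body guards against a client accidentally reusing a key for a different request.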
Cross-Cutting Concerns
- Rate limiting and headers
- Authentication/authorization (OAuth2/JWT, scopes)
- API versioning strategy
- Retries, timeouts, circuit breaking (client and server guidance)
- Error taxonomy (structured errors, error codes, HTTP statuses)
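As an illustration of what a structured error taxonomy might look like, the sketch below maps stable, machine-readable error codes to HTTP statuses and wraps them in a single envelope. The code names, the `ApiError` class, the `request_id` and `retryable` fields, and the set of retryable codes are all assumptions for this example; the prompt does not prescribe them. `Retry-After` is the standard HTTP header.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Hypothetical error codes mapped to standard HTTP status codes.
ERROR_STATUS: Dict[str, int] = {
    "invalid_request": 400,
    "unauthenticated": 401,
    "forbidden": 403,
    "not_found": 404,
    "payload_too_large": 413,
    "unsupported_media_type": 415,
    "rate_limited": 429,
    "internal": 500,
    "model_unavailable": 503,
}

# Assumed policy: only these codes are advertised as safe to retry.
RETRYABLE = {"rate_limited", "model_unavailable"}


@dataclass
class ApiError:
    code: str
    message: str
    request_id: str
    retry_after_seconds: Optional[int] = None

    def to_response(self) -> Tuple[int, Dict[str, str], dict]:
        """Return (status, headers, body) for this error."""
        status = ERROR_STATUS.get(self.code, 500)  # unknown codes fall back to 500
        headers = {"Content-Type": "application/json"}
        if self.retry_after_seconds is not None:
            headers["Retry-After"] = str(self.retry_after_seconds)
        body = {
            "error": {
                "code": self.code,
                "message": self.message,
                "request_id": self.request_id,
                "retryable": self.code in RETRYABLE,
            }
        }
        return status, headers, body
```

Keeping the code string separate from the HTTP status lets clients branch on a stable identifier while the status remains meaningful to generic HTTP tooling.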
Data Handling & Safety
- Input validation rules (size, dimensions, formats)
- Content-type checks and malware/zip-bomb defenses
- Secure storage (encryption at rest, TTL, signed URLs)
- Ensuring backward compatibility during model upgrades/rollbacks (pinning versions, deprecation)
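The validation and content-type items can be sketched as a pre-decode check: sniff the format from magic bytes instead of trusting the declared `Content-Type`, and read the PNG header dimensions before decoding so a decompression bomb is rejected cheaply. The limits (`MAX_BYTES`, `MAX_PIXELS`), the error strings, and the function names are assumptions chosen for illustration. The sketch covers only PNG dimensions; JPEG dimensions require scanning for a start-of-frame marker and are omitted here.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SIGNATURE = b"\xff\xd8\xff"

MAX_BYTES = 20 * 1024 * 1024   # assumed upload size limit
MAX_PIXELS = 50_000_000        # assumed width * height limit


class ValidationError(ValueError):
    """Raised with a machine-readable error code as its message."""


def sniff_format(data: bytes) -> str:
    """Identify the format from leading magic bytes, ignoring client claims."""
    if data.startswith(PNG_SIGNATURE):
        return "image/png"
    if data.startswith(JPEG_SIGNATURE):
        return "image/jpeg"
    raise ValidationError("unsupported_media_type")


def png_dimensions(data: bytes) -> tuple:
    """Read width and height from the IHDR chunk without decoding pixels."""
    # Layout: 8-byte signature, 4-byte chunk length, b"IHDR", then
    # big-endian 4-byte width and 4-byte height.
    if len(data) < 24 or data[12:16] != b"IHDR":
        raise ValidationError("invalid_image")
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def validate_upload(data: bytes, declared_content_type: str) -> str:
    if len(data) > MAX_BYTES:
        raise ValidationError("payload_too_large")
    actual = sniff_format(data)
    if actual != declared_content_type:
        raise ValidationError("content_type_mismatch")
    if actual == "image/png":
        width, height = png_dimensions(data)
        if width == 0 or height == 0 or width * height > MAX_PIXELS:
            raise ValidationError("image_dimensions_out_of_range")
    return actual
```

Checking the header before decoding matters because a small compressed file can declare very large dimensions, and the memory cost is paid only once pixels are decoded.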
Keep the design clear, minimal, and production-ready.