Design an HTTP API for Image-Based Model Predictions
Context: Design an HTTP REST API that serves predictions for image inputs (e.g., classification, detection). Assume the service may need both synchronous and asynchronous inference, and will be consumed by first- and third-party clients.
Requirements
- Endpoints, Methods, Idempotency, and Versioning
  - Define core endpoints (e.g., POST /v1/predict for sync, POST /v1/jobs for async, GET /v1/jobs/{id}/status).
  - Specify HTTP methods and how idempotency is achieved (e.g., Idempotency-Key header).
  - Define versioning strategy.
- Request/Response Schemas, Content Types, Errors, Retries
  - Provide JSON and multipart request/response schemas and content types.
  - Define standard error codes and error schema.
  - Define retry semantics, exponential backoff, and use of idempotency keys.
- AuthN/AuthZ, Rate Limiting/Quotas, Audit Logging
  - Use OAuth2/OIDC with scopes.
  - Describe rate limiting and quotas.
  - Describe audit logging requirements.
- Backward Compatibility and Deprecation Policy
  - State which changes are backward compatible and how deprecations are communicated.
- Security and Observability
  - TLS, input validation, JWT verification.
  - PII handling.
  - Structured logs, metrics, tracing, request IDs.
- Provide a concise OpenAPI 3.0 snippet for one endpoint showing parameters, schema, and error responses.
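
To make the retry requirement concrete, here is a minimal client-side sketch of how exponential backoff and an Idempotency-Key header might fit together. It is an illustration under stated assumptions, not a prescribed design: the set of retryable status codes, the base delay and cap, the full-jitter strategy, and the names `backoff_delay`, `send_with_retries`, and the `send` callable are all hypothetical. The transport is injected so the sketch makes no claims about any particular HTTP library.

```python
import random
import time
import uuid
from typing import Any, Callable, Dict, Optional, Tuple

# Assumption: which statuses are safe to retry is a design decision for the API;
# this set is only an example.
RETRYABLE_STATUS = {429, 500, 502, 503, 504}

# Hypothetical transport signature: (payload, headers) -> (status_code, body)
SendFn = Callable[[bytes, Dict[str, str]], Tuple[int, Any]]


def backoff_delay(
    attempt: int,
    base: float = 0.5,
    cap: float = 30.0,
    rng: Callable[[], float] = random.random,
) -> float:
    """Full-jitter exponential backoff: a random delay in [0, min(cap, base * 2**attempt)]."""
    return rng() * min(cap, base * (2 ** attempt))


def send_with_retries(
    send: SendFn,
    payload: bytes,
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
    idempotency_key: Optional[str] = None,
) -> Tuple[int, Any]:
    """Send a request, retrying retryable failures with the SAME idempotency key.

    Reusing one key across attempts is what would let a server deduplicate
    a job-creation request that was retried after a timeout or 5xx response.
    """
    key = idempotency_key or str(uuid.uuid4())
    status, body = 0, None
    for attempt in range(max_attempts):
        # A fresh dict per attempt so the transport cannot mutate shared headers.
        status, body = send(payload, {"Idempotency-Key": key})
        if status not in RETRYABLE_STATUS:
            return status, body
        if attempt < max_attempts - 1:
            sleep(backoff_delay(attempt))
    # Attempts exhausted: surface the last response to the caller.
    return status, body
```

A caller would pass a thin wrapper around its HTTP client as `send`, for example one that issues POST /v1/jobs. A fuller design would probably also honor a Retry-After header on 429 and 503 responses when the server sends one, which this sketch omits.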