Design and explain robust web APIs for ML inference
Company: NVIDIA
Role: Data Scientist
Category: Coding & Algorithms
Difficulty: hard
Interview Round: HR Screen
Design an HTTP API to serve image-based model predictions. Include:
1) Endpoints (e.g., POST /v1/predict, GET /v1/jobs/{id}/status), methods, idempotency, and versioning.
2) Request/response schemas (JSON + multipart), content types, standard error codes, and retry semantics with exponential backoff and idempotency keys.
3) Authentication/authorization (OAuth2/OIDC with scopes), rate limiting/quotas, and audit logging.
4) Backward compatibility and a deprecation policy.
5) Security (TLS, input validation, JWT verification), PII handling, and observability (structured logs, metrics, tracing, request IDs).
6) A concise OpenAPI 3.0 snippet for one endpoint that captures parameters, schema, and error responses.
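As an illustration of the retry semantics named in item 2, here is a minimal client-side sketch in Python. It is not a reference answer: the retryable status set, the `Idempotency-Key` header name, the full-jitter strategy, and the helper names `backoff_delays` and `call_with_retries` are all assumptions made for this example. The `send` callable stands in for whatever HTTP client the real service would use.

```python
import random
import time
import uuid

# Assumption: statuses treated as transient. A real API would document its own set.
RETRYABLE_STATUS = {429, 500, 502, 503, 504}


def backoff_delays(max_retries, base=0.5, cap=30.0, rng=random.random):
    """Return one delay per retry, using exponential backoff with full jitter.

    Delay i is drawn uniformly from [0, min(cap, base * 2**i)].
    """
    return [rng() * min(cap, base * 2 ** i) for i in range(max_retries)]


def call_with_retries(send, payload, max_retries=4, sleep=time.sleep):
    """Call send(payload, headers) -> (status, body), retrying transient failures.

    The same idempotency key is reused on every attempt so that the server
    can recognise a retry of the same logical request.
    """
    headers = {"Idempotency-Key": str(uuid.uuid4())}  # hypothetical header name
    delays = backoff_delays(max_retries)
    for attempt in range(max_retries + 1):
        status, body = send(payload, headers)
        # Stop on success, on a non-retryable error, or when retries are exhausted.
        if status not in RETRYABLE_STATUS or attempt == max_retries:
            return status, body
        sleep(delays[attempt])
```

The key design point is that the idempotency key is generated once per logical request, not once per attempt; otherwise a retried POST /v1/predict could create a second job.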
Quick Answer: This question evaluates a candidate's proficiency in designing HTTP REST APIs for ML inference, covering endpoints, versioning, idempotency, request/response schemas, authentication/authorization, rate limiting, observability, and backward-compatibility policies. It falls under the Coding & Algorithms domain for a data scientist role.
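To make the idempotency requirement concrete, the following is a rough server-side sketch in Python of how a handler for POST /v1/predict might deduplicate requests by idempotency key. Everything here is an assumption for illustration: the class name `IdempotencyStore`, the in-memory dictionary (a real service would need a shared store with expiry), the 202 status for an accepted job, and the choice of 422 when a key is reused with a different payload.

```python
import hashlib


class IdempotencyStore:
    """In-memory map from idempotency key to the first response returned for it."""

    def __init__(self):
        self._seen = {}  # key -> (payload fingerprint, saved response)

    def handle(self, key, body, compute):
        """Run compute(body) at most once per key and replay the saved response.

        body is the raw request bytes; compute is the hypothetical function
        that enqueues or runs the inference job and returns a JSON-like dict.
        """
        fingerprint = hashlib.sha256(body).hexdigest()
        if key in self._seen:
            saved_fingerprint, saved_response = self._seen[key]
            if saved_fingerprint != fingerprint:
                # Same key, different payload: reject rather than guess intent.
                # Assumption: 422 is used here; other APIs choose 400 or 409.
                return 422, {"error": "idempotency_key_reused"}
            return saved_response
        response = (202, compute(body))
        self._seen[key] = (fingerprint, response)
        return response
```

With this shape, a client that retries after a timeout receives the original job reference instead of triggering a second inference run. The sketch ignores concurrency: two simultaneous requests with the same key would need a lock or an atomic insert in the backing store.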