How do I approach Coding & Algorithms interview questions?

Coding & Algorithms questions require a solid understanding of core concepts and regular practice. PracHub provides solutions with explanations to help you prepare for Coding & Algorithms interviews.

What difficulty level is this interview question?

This is a hard-difficulty Coding & Algorithms question, commonly asked during HR Screen rounds at NVIDIA.

What role is this question designed for?

This question is commonly asked for Data Scientist candidates at NVIDIA during technical interviews.

Design and explain robust web APIs for ML inference

Quick Overview

This question evaluates a candidate's proficiency in designing HTTP REST APIs for ML inference. It covers API endpoints, versioning, idempotency, request/response schemas, authentication/authorization, rate limiting, observability, and backward-compatibility policies. It falls under the Coding & Algorithms domain for a Data Scientist role.

Design an HTTP API for Image-Based Model Predictions

Context: Design an HTTP REST API that serves predictions for image inputs (e.g., classification, detection). Assume the service may need both synchronous and asynchronous inference, and will be consumed by first- and third-party clients.

Requirements

Endpoints, Methods, Idempotency, and Versioning

Define core endpoints (e.g., POST /v1/predict for sync, POST /v1/jobs for async, GET /v1/jobs/{id}/status).
Specify HTTP methods and how idempotency is achieved (e.g., Idempotency-Key header).
Define versioning strategy.
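
As an illustration of the Idempotency-Key idea above, here is a minimal sketch of server-side deduplication. It assumes an in-memory dictionary in place of a shared store, and the names IdempotencyStore, run_once, and create_job are hypothetical, not part of any real framework. A repeated key with the same body returns the stored response; a repeated key with a different body is rejected, which a service might map to a 4xx error such as 409 or 422.

```python
import hashlib
import json
from typing import Callable, Dict, Tuple


class IdempotencyConflict(Exception):
    """Raised when a key is reused with a different request body."""


class IdempotencyStore:
    """In-memory stand-in (assumption) for a shared store such as a database."""

    def __init__(self) -> None:
        # key -> (fingerprint of the request body, stored response)
        self._seen: Dict[str, Tuple[str, dict]] = {}

    def run_once(self, key: str, body: dict,
                 handler: Callable[[dict], dict]) -> dict:
        # Fingerprint the body so key reuse with a different payload is caught.
        fingerprint = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode("utf-8")
        ).hexdigest()
        if key in self._seen:
            stored_fingerprint, stored_response = self._seen[key]
            if stored_fingerprint != fingerprint:
                raise IdempotencyConflict(key)
            return stored_response  # replay: the handler is not run again
        response = handler(body)
        self._seen[key] = (fingerprint, response)
        return response


calls = []


def create_job(body: dict) -> dict:
    """Hypothetical handler behind POST /v1/jobs."""
    calls.append(body)
    return {"job_id": f"job-{len(calls)}", "status": "queued"}


store = IdempotencyStore()
request_body = {"image_url": "https://example.com/cat.png"}
first = store.run_once("key-123", request_body, create_job)
retry = store.run_once("key-123", request_body, create_job)
```

In this sketch the retry returns the same job as the first call and create_job runs only once. A production version would also need key expiry and handling for concurrent requests that share a key, which this example leaves out.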

Request/Response Schemas, Content Types, Errors, Retries

Provide JSON and multipart request/response schemas and content types.
Define standard error codes and error schema.
Define retry semantics, exponential backoff, and use of idempotency keys.
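
The retry items above can be sketched on the client side as follows. This is one possible policy, not a prescribed one: the set of retryable status codes, the base delay, and the cap are assumptions, and should_retry and backoff_delays are hypothetical helper names. The delays use exponential backoff with full jitter, meaning each wait is drawn uniformly between zero and an exponentially growing ceiling.

```python
import random
from typing import Iterator, Optional

# Assumption: statuses treated as transient and therefore worth retrying.
RETRYABLE_STATUS = {429, 500, 502, 503, 504}

# HTTP methods that are idempotent by definition.
IDEMPOTENT_METHODS = {"GET", "HEAD", "PUT", "DELETE"}


def should_retry(status: int, method: str, has_idempotency_key: bool) -> bool:
    """Retry only transient failures, and only when a repeat is safe."""
    if status not in RETRYABLE_STATUS:
        return False
    if method.upper() in IDEMPOTENT_METHODS:
        return True
    # A POST is only safe to repeat when the server can deduplicate it.
    return has_idempotency_key


def backoff_delays(max_attempts: int, base: float = 0.5, cap: float = 30.0,
                   rng: Optional[random.Random] = None) -> Iterator[float]:
    """Yield one delay in seconds per retry attempt, using full jitter."""
    rng = rng or random.Random()
    for attempt in range(max_attempts):
        ceiling = min(cap, base * (2 ** attempt))
        yield rng.uniform(0.0, ceiling)


delays = list(backoff_delays(5, rng=random.Random(0)))
```

If the server returns a Retry-After header with a 429 or 503 response, a client would normally honor that value instead of its own computed delay.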

AuthN/AuthZ, Rate Limiting/Quotas, Audit Logging

Use OAuth2/OIDC with scopes.
Describe rate limiting and quotas.
Describe audit logging requirements.
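
For the rate limiting item above, one common approach is a token bucket kept per client or per API key. The sketch below is a single-process illustration under assumed numbers (capacity of 2, refill of 1 token per second); the class and method names are hypothetical, and the caller passes in the current time so the behavior is deterministic.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    capacity: float           # maximum burst size
    refill_per_second: float  # sustained request rate
    tokens: float             # tokens currently available
    updated_at: float         # timestamp of the last refill, in seconds

    @classmethod
    def full(cls, capacity: float, refill_per_second: float,
             now: float) -> "TokenBucket":
        """Create a bucket that starts with a full burst allowance."""
        return cls(capacity, refill_per_second, capacity, now)

    def allow(self, now: float, cost: float = 1.0) -> bool:
        """Refill for the elapsed time, then try to spend `cost` tokens."""
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.refill_per_second)
        self.updated_at = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

    def retry_after(self, cost: float = 1.0) -> float:
        """Seconds until `cost` tokens are available again."""
        missing = max(0.0, cost - self.tokens)
        return missing / self.refill_per_second


bucket = TokenBucket.full(capacity=2, refill_per_second=1.0, now=0.0)
results = [bucket.allow(now=0.0) for _ in range(3)]
wait_seconds = bucket.retry_after()
```

Here the first two requests are allowed and the third is rejected. A service would typically answer the rejected request with 429 Too Many Requests and could use the retry_after value to populate a Retry-After header. Longer-horizon quotas (for example per day) would need a separate counter.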

Backward Compatibility and Deprecation Policy

State which changes are backward compatible and how deprecations are communicated.

Security and Observability

Require TLS, validate inputs, and verify JWTs.
Describe PII handling.
Use structured logs, metrics, tracing, and request IDs.
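
To make the structured logging and request ID items concrete, here is a small sketch using only Python's standard logging module. The X-Request-ID header name, the logger name, and the log fields (route, latency_ms) are assumptions chosen for illustration. Only metadata is logged, never image bytes, which is one simple way to keep PII out of logs.

```python
import io
import json
import logging
import uuid


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "message": record.getMessage(),
            # Fields passed through `extra=`; absent fields become null.
            "request_id": getattr(record, "request_id", None),
            "route": getattr(record, "route", None),
            "latency_ms": getattr(record, "latency_ms", None),
        }
        return json.dumps(payload)


def build_logger(stream) -> logging.Logger:
    """Create a logger that writes JSON lines to `stream`."""
    logger = logging.getLogger("inference_api_sketch")
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def resolve_request_id(headers: dict) -> str:
    """Reuse the caller's request ID if present, otherwise mint a new one."""
    return headers.get("X-Request-ID") or str(uuid.uuid4())


buffer = io.StringIO()
log = build_logger(buffer)
request_id = resolve_request_id({"X-Request-ID": "req-abc"})
log.info("prediction served",
         extra={"request_id": request_id, "route": "/v1/predict",
                "latency_ms": 42})
entry = json.loads(buffer.getvalue())
```

Returning the same request ID in the response lets a client quote it in a support ticket, and the same value can be attached to metrics and trace spans. Real HTTP header lookups are case-insensitive, which the plain dictionary in this sketch does not model.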

Provide a concise OpenAPI 3.0 snippet for one endpoint showing parameters, schema, and error responses.
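
As a rough idea of the shape such a snippet might take, the sketch below builds an OpenAPI 3.0 document for GET /v1/jobs/{id}/status as a Python dictionary and serializes it to JSON, which OpenAPI accepts alongside YAML. The schema names (JobStatus, Error), the status values, and the chosen error codes are assumptions for illustration, not a reference answer.

```python
import json

ERROR_SCHEMA_REF = {"$ref": "#/components/schemas/Error"}


def error_response(description: str) -> dict:
    """Build a response object that reuses the shared Error schema."""
    return {
        "description": description,
        "content": {"application/json": {"schema": ERROR_SCHEMA_REF}},
    }


spec = {
    "openapi": "3.0.3",
    "info": {"title": "Image Inference API (sketch)", "version": "1.0.0"},
    "paths": {
        "/v1/jobs/{id}/status": {
            "get": {
                "summary": "Get the status of an async inference job",
                "parameters": [
                    {"name": "id", "in": "path", "required": True,
                     "schema": {"type": "string"}},
                ],
                "responses": {
                    "200": {
                        "description": "Current job status",
                        "content": {"application/json": {"schema": {
                            "$ref": "#/components/schemas/JobStatus"}}},
                    },
                    "401": error_response("Missing or invalid token"),
                    "404": error_response("Job not found"),
                    "429": error_response("Rate limit exceeded"),
                },
            }
        }
    },
    "components": {
        "schemas": {
            "JobStatus": {
                "type": "object",
                "required": ["job_id", "status"],
                "properties": {
                    "job_id": {"type": "string"},
                    # Assumed lifecycle states for an async job.
                    "status": {"type": "string",
                               "enum": ["queued", "running",
                                        "succeeded", "failed"]},
                },
            },
            "Error": {
                "type": "object",
                "required": ["code", "message"],
                "properties": {
                    "code": {"type": "string"},
                    "message": {"type": "string"},
                    "request_id": {"type": "string"},
                },
            },
        }
    },
}

spec_json = json.dumps(spec, indent=2)
```

Sharing one Error schema across all error responses keeps client-side error handling uniform, which is the main design choice this sketch is meant to show.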
