System Design: Random Number Service (REST + Streaming)
You are designing a high-scale service that generates random numbers and exposes both REST and streaming interfaces. The service must support secure and non-secure modes, per-tenant isolation, and strong observability. Assume internet-facing clients and multi-region deployment.
Specify the following:
- API Contracts and Versioning
  - Define REST endpoints for generating bytes, integers, and floats (bulk requests). Include request/response schemas, error codes, idempotency, and content types.
  - Define streaming endpoints (e.g., SSE, WebSocket, gRPC) for continuous random data. Include connection setup, backpressure, and chunking.
  - Describe versioning strategy and backward compatibility guarantees.
- Entropy Sources and RNG Choices
  - Choose CSPRNG(s) for secure mode and PRNG(s) for fast mode. Justify your choices and identify any FIPS-validated options.
  - Describe entropy sources (OS/HW) and reseed strategy.
- Seeding and Reproducibility
  - Define how clients can request deterministic sequences (e.g., seed + stream_id + offset).
  - Describe unbiased range mapping for integers, precision for floats, and guarantees for identical output across regions/instances.
- Rate Limiting, Quotas, and Multi-Tenant Isolation
  - Specify per-tenant and per-token rate limits and quotas, burst behavior, headers, and error handling.
  - Describe isolation of RNG state and keys so tenants cannot affect or infer each other’s output.
- Security and Abuse Prevention
  - Define authentication/authorization, transport security, request validation, and DDoS/WAF controls.
  - Address storage/handling of seeds and audit considerations.
- Observability
  - Define metrics, logs, and traces. Include RNG health checks, entropy pool telemetry, and quality testing (e.g., periodic Dieharder/NIST STS).
- Deployment and Scaling Strategy
  - Propose a multi-region architecture, autoscaling, failover, and zero-downtime rollout plan.
  - Include worker design (e.g., vectorized generation, prefetch buffers), state placement, and stream stickiness.
- SLAs/SLOs
  - Propose latency and availability targets for REST and streaming, including startup latency, sustained throughput, and error budgets.
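To make the "seed + stream_id + offset" and "unbiased range mapping" items concrete, here is one possible shape such a scheme could take. It is a minimal sketch, not a required design: it assumes an HMAC-SHA256 counter construction, a 64-bit stream_id, and byte-granular offsets, and the names stream_bytes, uniform_below, and unit_float are hypothetical. Because every output block depends only on (seed, stream_id, block counter), any region or instance that holds the seed produces identical bytes.

```python
import hashlib
import hmac
import struct
from typing import Tuple

BLOCK = 32  # bytes produced by one HMAC-SHA256 call


def stream_bytes(seed: bytes, stream_id: int, offset: int, length: int) -> bytes:
    """Return `length` bytes starting at byte `offset` of the (seed, stream_id) stream.

    Assumption: stream_id fits in an unsigned 64-bit integer.
    """
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")
    first = offset // BLOCK                          # first block touched
    last = (offset + length + BLOCK - 1) // BLOCK    # one past the last block
    out = bytearray()
    for counter in range(first, last):
        msg = struct.pack(">QQ", stream_id, counter)
        out += hmac.new(seed, msg, hashlib.sha256).digest()
    start = offset - first * BLOCK
    return bytes(out[start:start + length])


def uniform_below(n: int, seed: bytes, stream_id: int, offset: int = 0) -> Tuple[int, int]:
    """Map stream bytes to an integer in [0, n) without modulo bias.

    Uses rejection sampling on 64-bit words and returns (value, next_offset)
    so the caller can continue from where this draw stopped.
    """
    if not 1 <= n <= 1 << 64:
        raise ValueError("n must be in [1, 2**64]")
    limit = (1 << 64) - ((1 << 64) % n)  # largest multiple of n that is <= 2**64
    while True:
        (x,) = struct.unpack(">Q", stream_bytes(seed, stream_id, offset, 8))
        offset += 8
        if x < limit:                    # reject the biased tail, then reduce
            return x % n, offset


def unit_float(seed: bytes, stream_id: int, offset: int = 0) -> Tuple[float, int]:
    """Return a float in [0, 1) built from the top 53 bits of one 64-bit word."""
    (x,) = struct.unpack(">Q", stream_bytes(seed, stream_id, offset, 8))
    return (x >> 11) * 2.0 ** -53, offset + 8
```

A client that asks for bytes 10 through 49 of a stream receives the same bytes as a client that reads the first 64 bytes and slices them, which is the property that makes offset-based resume and cross-region reproducibility testable. A real answer would still need to decide whether deterministic mode is allowed in secure mode at all and how seeds are stored.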
Assume a peak scale of 100k REST requests/sec per region and up to 5 Gbps of streaming throughput per region. Note any assumptions you make.
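As an illustration of how the peak figures might be turned into a first sizing estimate, the sketch below does the back-of-envelope arithmetic. Only the 100k requests/sec and 5 Gbps numbers come from the brief. The average payload size, per-worker capacities, target utilization, and the decimal reading of Gbps (10^9 bits/sec) are all assumptions chosen for the example, and estimate_region is a hypothetical helper name.

```python
import math

# From the brief (per region, at peak).
REST_RPS = 100_000
STREAM_GBPS = 5

# Assumptions for illustration only; replace with measured values.
AVG_REST_PAYLOAD_BYTES = 1024   # assumed mean REST response body
WORKER_RPS = 5_000              # assumed REST requests/sec one worker sustains
WORKER_STREAM_MBPS = 1_000      # assumed streaming egress per worker, Mbit/s
TARGET_UTILIZATION = 0.5        # assumed: run workers at 50% at peak for headroom


def workers_needed(load: float, per_worker: float, utilization: float) -> int:
    """Smallest worker count that carries `load` at the target utilization."""
    return math.ceil(load / (per_worker * utilization))


def estimate_region() -> dict:
    """Return derived per-region figures under the assumptions above."""
    return {
        # 5 Gbit/s divided by 8 bits per byte
        "stream_bytes_per_sec": STREAM_GBPS * 1_000_000_000 // 8,
        "rest_bytes_per_sec": REST_RPS * AVG_REST_PAYLOAD_BYTES,
        "rest_workers": workers_needed(REST_RPS, WORKER_RPS, TARGET_UTILIZATION),
        "stream_workers": workers_needed(
            STREAM_GBPS * 1_000, WORKER_STREAM_MBPS, TARGET_UTILIZATION
        ),
    }


if __name__ == "__main__":
    for name, value in estimate_region().items():
        print(f"{name}: {value}")
```

Under these assumptions the streaming path has to produce 625 MB of random data per second per region, which is the number that constrains generator choice and prefetch buffer sizing far more than the REST request rate does.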