Design a Key Management Service (KMS)
You are asked to design a production-grade, multi-tenant Key Management Service. A client provides a key identifier (ID) and receives the corresponding key material or a cryptographic service backed by that key.
Assume this is for a large-scale environment with strict security, availability, and latency requirements.
Requirements
- APIs
  - Define APIs to create/manage keys and to use keys (e.g., encrypt/decrypt, sign/verify, generate data keys).
  - Support key identifiers, versions, and idempotency.
- Key Generation and Storage
  - Describe how keys are generated (entropy, algorithms) and where they are stored (HSM vs. software).
  - Address import/bring-your-own-key (BYOK) and export policy.
- Access Control
  - Specify authentication and authorization models (per-tenant, per-key), including least privilege and policy expressiveness.
- Auditing
  - Provide comprehensive audit logging that is tamper-evident.
- Key Rotation
  - Support automatic and manual rotation; define key states (enabled, disabled, scheduled for destruction) and versioning semantics.
- Usage Limits
  - Define per-key and per-principal quotas, rate limits, and safe usage patterns.
- Security and Threat Model
  - Enumerate critical threats and mitigations (insider, external attacker, HSM compromise, replay, downgrade, side-channels).
- Latency and Availability
  - State targets/expectations and techniques (caching, envelope encryption, multi-region, failover) to meet them.
- Blast Radius of Compromise
  - Discuss isolation boundaries and how to limit impact if a key, service component, region, or HSM is compromised.
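
As one illustration of the tamper-evident audit logging requirement listed above, a hash chain is a common starting point: each entry stores the hash of the previous entry, so editing or removing an earlier record invalidates every later hash. The sketch below is a minimal in-memory version, not a prescribed design. The names `append_event`, `verify_chain`, and `GENESIS_HASH`, and the event fields used in the example, are hypothetical.

```python
import hashlib
import json

# Hypothetical anchor value used as the "previous hash" of the first entry.
GENESIS_HASH = "0" * 64


def _entry_hash(prev_hash: str, event: dict) -> str:
    # Canonical JSON (sorted keys, fixed separators) keeps the hash deterministic.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_event(log: list, event: dict) -> dict:
    """Append an audit event, linking it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    entry = {
        "event": event,
        "prev_hash": prev_hash,
        "hash": _entry_hash(prev_hash, event),
    }
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Recompute every link; return False if any entry was altered or removed."""
    prev_hash = GENESIS_HASH
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


if __name__ == "__main__":
    log = []
    append_event(log, {"tenant": "t1", "key_id": "k1", "op": "Encrypt", "principal": "svc-a"})
    append_event(log, {"tenant": "t1", "key_id": "k1", "op": "Decrypt", "principal": "svc-b"})
    print(verify_chain(log))
```

A chain like this detects edits and mid-log deletions but not truncation of the tail, so a real design would also need to anchor the latest hash somewhere the log writer cannot modify (for example, a separate system or periodic signed checkpoints).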
Note: Best practice is that master keys never leave the HSM. If requirements demand returning key material, constrain it to explicitly exportable, ephemeral data keys with additional controls.
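
The constraint in the note can be expressed as a policy check that runs before any key material is returned. The sketch below is one possible shape, assuming a hypothetical `KeyRecord` with `kind`, `exportable`, and `expires_at` fields. It shows only the guard and does not handle the key material itself.

```python
import time
from dataclasses import dataclass
from typing import Optional


class ExportDenied(Exception):
    """Raised when policy forbids returning key material."""


@dataclass(frozen=True)
class KeyRecord:
    key_id: str
    kind: str          # "master" or "data" (hypothetical labels)
    exportable: bool   # must be set explicitly when the key is created
    expires_at: float  # Unix timestamp after which the data key is no longer valid


def check_export_allowed(key: KeyRecord, now: Optional[float] = None) -> None:
    """Raise ExportDenied unless the key is an explicitly exportable, unexpired data key."""
    now = time.time() if now is None else now
    if key.kind != "data":
        # Master keys stay inside the HSM regardless of any other flag.
        raise ExportDenied(f"{key.key_id}: only data keys may be exported")
    if not key.exportable:
        raise ExportDenied(f"{key.key_id}: key is not marked exportable")
    if now >= key.expires_at:
        raise ExportDenied(f"{key.key_id}: ephemeral data key has expired")


if __name__ == "__main__":
    data_key = KeyRecord("dk-1", "data", True, expires_at=time.time() + 300)
    check_export_allowed(data_key)  # passes silently
    try:
        check_export_allowed(KeyRecord("mk-1", "master", True, expires_at=time.time() + 300))
    except ExportDenied as exc:
        print("denied:", exc)
```

The "additional controls" mentioned in the note (such as authorization checks, audit entries, and rate limits) would sit alongside this guard and are not shown here.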