Design an ACL authorization checking service
Company: Snowflake
Role: Software Engineer
Category: System Design
Difficulty: hard
Interview Round: Technical Screen
Design a centralized **authorization (ACL) checking service** used by other internal services to decide whether a principal can perform an action on a resource.
### Context
Multiple microservices (e.g., `Orders`, `Docs`, `Billing`) need consistent access control. Instead of each service implementing authorization logic, they call an internal service to evaluate policies.
### Requirements
**Core functionality**
- Provide a decision API: given `(principal, action, resource, context)` return **ALLOW/DENY**.
- Support common ACL semantics:
  - Principals: users and service accounts.
  - Resources: hierarchical resources (e.g., `/orgs/{id}/projects/{id}/docs/{id}`) and non-hierarchical resources.
  - Actions: `read`, `write`, `delete`, `admin`, etc.
  - Groups/roles (optional) and explicit per-resource grants.
  - Default deny; explicit deny should override allow (if you choose to support denies).
- Policy management:
  - Create/update/delete policies and group memberships.
  - Changes should propagate to authorization decisions quickly.
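To make the decision semantics above concrete, here is a minimal in-memory sketch in Python of one way a check could be evaluated: default deny, explicit deny overriding allow, and grants inherited down a hierarchical resource path. The names `Grant`, `Effect`, `ancestors`, and `check`, and the `user:` / `group:` ID prefixes, are illustrative assumptions rather than part of the question. The sketch ignores `context`, and it assumes the caller has already resolved the principal's group memberships.

```python
from dataclasses import dataclass
from enum import Enum


class Effect(Enum):
    ALLOW = "ALLOW"
    DENY = "DENY"


@dataclass(frozen=True)
class Grant:
    """One ACL entry (hypothetical shape, for illustration only)."""
    subject: str    # a principal ID or a group name
    action: str     # e.g. "read", "write"
    resource: str   # a resource path; applies to that path and its descendants
    effect: Effect


def ancestors(resource: str) -> list[str]:
    """Return the resource path and every ancestor path, most specific first."""
    parts = resource.strip("/").split("/")
    return ["/" + "/".join(parts[:i]) for i in range(len(parts), 0, -1)]


def check(principal: str, groups: set[str], action: str, resource: str,
          grants: list[Grant]) -> Effect:
    """Evaluate a check: explicit DENY wins, then ALLOW, otherwise default DENY."""
    subjects = {principal} | groups
    paths = set(ancestors(resource))
    matched = [
        g for g in grants
        if g.subject in subjects and g.action == action and g.resource in paths
    ]
    if any(g.effect is Effect.DENY for g in matched):
        return Effect.DENY      # explicit deny overrides any allow
    if any(g.effect is Effect.ALLOW for g in matched):
        return Effect.ALLOW
    return Effect.DENY          # default deny when nothing matches


# Example data: an org-wide group grant plus a per-document deny for one user.
EXAMPLE_GRANTS = [
    Grant("group:eng", "read", "/orgs/1", Effect.ALLOW),
    Grant("user:bob", "read", "/orgs/1/projects/2/docs/3", Effect.DENY),
]
```

With `EXAMPLE_GRANTS`, `check("user:alice", {"group:eng"}, "read", "/orgs/1/projects/2/docs/3", EXAMPLE_GRANTS)` returns `Effect.ALLOW` through the inherited group grant. The same call for `user:bob` returns `Effect.DENY` because of the explicit deny, and any `write` check returns `Effect.DENY` by default. A real service would likely replace the linear scan with indexed lookups per tenant.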
**Non-functional** (make reasonable assumptions and state them)
- Low latency for checks (e.g., single-digit ms p99 inside a region).
- High availability (e.g., 99.99%+).
- High read-to-write ratio (checks are frequent; policy updates are relatively infrequent).
- Multi-tenant support and isolation.
- Strong security (authn/authz for callers, auditability).
### Deliverables
- High-level architecture and main components.
- API design (check + management APIs).
- Data model for ACLs/roles/groups.
- Caching strategy and consistency/invalidation approach.
- Handling scale, multi-region, failure modes, and security.
- Observability: key metrics and logs/audit trails.
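As one possible starting point for the caching and invalidation deliverable, the sketch below shows a decision cache that combines a TTL with a per-tenant policy version. Bumping the version on a policy write makes older entries unreachable without enumerating them. `DecisionCache`, `bump_version`, and the key layout are assumptions made for illustration. The sketch also does not evict unreachable entries, which a real cache would need to bound (for example with an LRU).

```python
import time
from typing import Callable, Optional


class DecisionCache:
    """In-process decision cache keyed by tenant policy version (illustrative)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock                # injectable so tests can control time
        self._versions: dict[str, int] = {}
        self._entries: dict[tuple, tuple[bool, float]] = {}

    def bump_version(self, tenant: str) -> None:
        """Call after any policy or membership change for the tenant."""
        self._versions[tenant] = self._versions.get(tenant, 0) + 1

    def _key(self, tenant: str, principal: str, action: str, resource: str) -> tuple:
        # The tenant is part of the key, so tenants never share entries.
        version = self._versions.get(tenant, 0)
        return (tenant, version, principal, action, resource)

    def get(self, tenant: str, principal: str, action: str,
            resource: str) -> Optional[bool]:
        """Return the cached decision, or None on a miss or an expired entry."""
        entry = self._entries.get(self._key(tenant, principal, action, resource))
        if entry is None:
            return None
        allowed, expires_at = entry
        if self._clock() >= expires_at:
            return None
        return allowed

    def put(self, tenant: str, principal: str, action: str,
            resource: str, allowed: bool) -> None:
        """Store a decision under the tenant's current policy version."""
        key = self._key(tenant, principal, action, resource)
        self._entries[key] = (allowed, self._clock() + self._ttl)
```

The TTL bounds staleness if a version bump is missed, while the version bump gives fast invalidation in the common case. In a multi-node deployment, the version would have to be distributed to every node (for instance over a change stream), which this single-process sketch does not cover.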
Quick Answer: This question evaluates system design and security architecture skills, focusing on centralized authorization/ACL models, API and data modeling, caching and consistency strategies, scalability, multi-tenant isolation, and operational concerns for low-latency, highly available services in a microservices environment.