Secure Multitenant SaaS Architecture

What's being tested

Interviewers are probing whether you can design a secure multitenant SaaS system where many enterprise customers share infrastructure without sharing data, permissions, or operational blast radius. For Harvey, this matters because legal workflows involve highly sensitive documents, privileged communications, matter-level permissions, and enterprise audit requirements. A strong Software Engineer answer balances tenant isolation, authorization correctness, data protection, operability, and practical tradeoffs like shared versus dedicated infrastructure. The interviewer is not looking for hand-wavy “use encryption” answers; they want to see where isolation is enforced, how failures are contained, and how you prove the system is safe under real production paths.

Core knowledge

Multitenancy models usually fall into three patterns: shared database/shared schema, shared database/separate schema, and separate database per tenant. Shared schema is cheapest and easiest to operate, but requires flawless tenant scoping; separate databases improve isolation and noisy-neighbor control but increase migration effort, connection-management overhead, and operational complexity.
Tenant isolation must be enforced at multiple layers, not just in application code. Typical layers include API authentication, authorization middleware, database predicates like tenant_id = ?, Postgres row-level security policies, object-store path scoping in S3, search-index filters, cache key prefixes, queue routing, and observability access controls.
Authentication answers “who are you?” while authorization answers “what can you access?” Enterprise SaaS commonly supports SAML or OIDC single sign-on, maps identity-provider groups into application roles, and uses short-lived session tokens or JWTs with claims such as sub, org_id, roles, and token expiry.
RBAC works well for coarse permissions like admin, member, and viewer; ABAC is better for legal workflows where access may depend on attributes like tenant_id, matter_id, document_classification, jurisdiction, or ethical_wall_group. Many systems combine both: roles grant capabilities, attributes constrain resources.
Authorization checks should happen close to every resource access. A common pattern is authorize(actor, action, resource) backed by policy code or a service like Open Policy Agent. Avoid checking only at the route level if downstream jobs, batch exports, search, or document-preview endpoints can fetch data independently.
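Database design should make unsafe queries hard to write. In shared tables, include tenant_id in every tenant-owned table, add composite indexes such as (tenant_id, matter_id, created_at), and consider UNIQUE (tenant_id, external_id) rather than globally unique business identifiers. Postgres RLS can enforce a policy such as tenant_id = current_setting('app.tenant_id'); because current_setting returns text, cast it when tenant_id is a uuid or integer column, and remember that table owners bypass RLS unless FORCE ROW LEVEL SECURITY is set.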
Object storage isolation needs the same rigor as relational data. Store files under namespaced keys such as tenant/{tenant_id}/matter/{matter_id}/doc/{doc_id}, but do not rely on naming alone; use service-side authorization, scoped signed URLs with short TTLs, bucket policies where possible, and metadata validation before serving bytes.
Search and vector indexes are common leakage points. If using OpenSearch, Elasticsearch, pgvector, or a vector database, every query must include a tenant and permission filter before returning chunks. For sensitive legal documents, retrieval should filter by tenant_id, matter_id, and user-accessible document IDs, not merely post-filter after top-k retrieval.
Encryption includes data in transit via TLS, data at rest via storage encryption, and sometimes per-tenant keys through AWS KMS, GCP KMS, or HashiCorp Vault. Per-tenant envelope encryption improves crypto isolation and supports tenant-specific key rotation or deletion, but adds latency, key-management paths, and failure modes.
Audit logging should capture security-relevant events: login, SSO group sync, permission changes, document upload/download, search, export, admin impersonation, and failed authorization. Use append-only storage semantics, include actor_id, tenant_id, resource_id, action, decision, ip, user_agent, and timestamp, and avoid logging document contents or secrets.
Noisy-neighbor controls protect availability across tenants. Apply per-tenant rate limits, quota checks, worker-pool isolation, queue partitioning, and query timeouts. For example, a large document import for one tenant should not exhaust all background workers or saturate shared Postgres IOPS for other tenants.
Defense in depth assumes one layer will eventually fail. Use secure defaults, centralized middleware, typed tenant context, automated tests for cross-tenant access, static checks for unscoped queries where possible, canary tenants, alerting on authorization denials, and production guardrails like “break glass” admin access with mandatory audit records.
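The combination of RBAC and ABAC behind a single authorize(actor, action, resource) call, as described in the points above, can be sketched as follows. This is a minimal illustration, not a definitive policy engine: the ROLE_CAPABILITIES map, the Actor and Resource types, and the matter_ids attribute are hypothetical names chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical role-to-capability map; a real system would load this from policy.
ROLE_CAPABILITIES = {
    "admin": {"read", "write", "share"},
    "member": {"read", "write"},
    "viewer": {"read"},
}


@dataclass(frozen=True)
class Actor:
    user_id: str
    tenant_id: str
    roles: frozenset
    matter_ids: frozenset  # matters this actor is assigned to (assumed attribute)


@dataclass(frozen=True)
class Resource:
    tenant_id: str
    matter_id: str
    doc_id: str


def authorize(actor: Actor, action: str, resource: Resource) -> bool:
    """Deny by default: tenant must match, a role must grant the action,
    and the resource's matter must be one the actor can access."""
    # Tenant boundary is checked first and is never overridden by a role.
    if actor.tenant_id != resource.tenant_id:
        return False
    # RBAC: union of the capabilities granted by each of the actor's roles.
    granted = set()
    for role in actor.roles:
        granted |= ROLE_CAPABILITIES.get(role, set())
    if action not in granted:
        return False
    # ABAC: attributes constrain which resources the capability applies to.
    return resource.matter_id in actor.matter_ids
```

Checking the tenant boundary before any role logic keeps an over-broad role from ever crossing tenants, which is one way to make the "roles grant capabilities, attributes constrain resources" split explicit.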

Worked example

Design a secure multitenant document management system for enterprise legal teams

A strong candidate starts by clarifying the tenancy model, sensitivity level, and access patterns: “Are tenants law firms or companies? Do users belong to multiple tenants? Are documents scoped to matters? Do we need enterprise SSO, audit logs, and data residency?” Then they declare assumptions: use shared application services, shared Postgres for metadata, object storage like S3 for files, and strict logical isolation by tenant_id and matter_id.

The answer can be organized around four pillars. First, identity and access: OIDC/SAML login, user-to-tenant membership, RBAC for admin/member/viewer, and ABAC constraints for matter-level access. Second, data isolation: every metadata table has tenant_id, database queries are scoped through a tenant context, Postgres RLS is enabled for critical tables, and object keys are namespaced with signed URLs generated only after authorization. Third, operational security: per-tenant rate limits, audit logs for document access and permission changes, encrypted storage, and key management through KMS. Fourth, background processing: document ingestion jobs carry tenant context explicitly, validate permissions at enqueue time and processing time, and write derived artifacts like OCR text or embeddings into tenant-filtered stores.
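The fourth pillar, background jobs that carry tenant context and re-check permissions, might look like the sketch below. It assumes in-memory stand-ins (MATTER_ACCESS, DERIVED_STORE, a plain list as the queue) and hypothetical names throughout; a real system would use its own queue, permission service, and storage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IngestJob:
    # Tenant context travels with the job instead of being inferred by the worker.
    tenant_id: str
    matter_id: str
    doc_id: str
    requested_by: str


class PermissionDenied(Exception):
    pass


# Hypothetical in-memory stand-ins for a permission service and a derived-data store.
MATTER_ACCESS = {("t1", "alice"): {"m1"}}
DERIVED_STORE = {}  # keyed by (tenant_id, matter_id, doc_id)


def can_ingest(tenant_id: str, user_id: str, matter_id: str) -> bool:
    return matter_id in MATTER_ACCESS.get((tenant_id, user_id), set())


def enqueue(job_queue: list, job: IngestJob) -> None:
    # First check: reject the request before any work is scheduled.
    if not can_ingest(job.tenant_id, job.requested_by, job.matter_id):
        raise PermissionDenied("enqueue denied")
    job_queue.append(job)


def process(job: IngestJob, extract_text) -> None:
    # Second check: access may have been revoked between enqueue and processing.
    if not can_ingest(job.tenant_id, job.requested_by, job.matter_id):
        raise PermissionDenied("processing denied")
    # Derived artifacts are written under a tenant- and matter-scoped key.
    key = (job.tenant_id, job.matter_id, job.doc_id)
    DERIVED_STORE[key] = extract_text(job.doc_id)
```

Keying the derived store by tenant and matter means OCR text or embeddings inherit the same scoping as the source document rather than landing in a global namespace.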

One tradeoff to flag is shared database versus database-per-tenant. For most tenants, shared Postgres with RLS and strong indexing is simpler and cost-effective; for very large or regulated tenants, the design can support dedicated databases or dedicated storage buckets as an enterprise isolation tier. The close should show maturity: “If I had more time, I’d detail migration strategy, cross-tenant security tests, disaster recovery, and how we verify that search/vector retrieval never leaks chunks across matters.”
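One way to keep the shared and dedicated tiers behind the same application code is a small routing function that resolves a tenant to its database. The sketch below is an assumption-laden illustration: the connection strings, the DEDICATED_DSNS registry, and the tenant name are all hypothetical.

```python
# Hypothetical connection strings; real values would come from configuration.
SHARED_DSN = "postgresql://shared-cluster/app"

# Hypothetical registry of tenants placed on the dedicated isolation tier.
DEDICATED_DSNS = {
    "big-firm": "postgresql://big-firm-cluster/app",
}


def resolve_dsn(tenant_id: str) -> str:
    """Return the database a tenant's queries should run against."""
    if not tenant_id:
        # Refuse to guess: a missing tenant must never fall through to shared.
        raise ValueError("tenant_id is required to resolve a database")
    return DEDICATED_DSNS.get(tenant_id, SHARED_DSN)
```

Because callers only see resolve_dsn, moving a tenant between tiers becomes a registry change plus a data migration rather than an application rewrite.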

A second angle

Design an audit logging and access-control layer for a multitenant SaaS platform

This framing shifts from storage architecture to correctness and traceability of security decisions. The key move is to define a centralized authorization API such as authorize(actor, action, resource) and require every service path (REST endpoints, background jobs, exports, search, and admin tools) to call it before accessing tenant resources. Audit events should be emitted for both successful and denied sensitive actions, but the log pipeline must avoid leaking document contents, prompts, access tokens, or privileged metadata. Compared with the document-management design, the harder tradeoff is consistency: synchronous audit writes improve compliance confidence but add latency and failure coupling, while asynchronous writes improve availability but need durable queues and retry semantics. A strong answer proposes a hybrid: block on authorization, emit audit events to a durable append-only stream, and use monitoring to detect missing or delayed audit records.
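The hybrid described above (block on the authorization decision, then hand the audit event to a stream) can be sketched roughly as follows. The in-process queue.Queue is only a stand-in for a durable append-only stream, and SENSITIVE_FIELDS, emit_audit, and guarded_access are hypothetical names for this example.

```python
import json
import queue
import time

# Fields that must never reach the audit pipeline (assumed list for illustration).
SENSITIVE_FIELDS = {"content", "prompt", "access_token"}

# Stand-in for a durable append-only stream; an in-process queue is NOT durable.
audit_stream = queue.Queue()


def emit_audit(actor_id, tenant_id, action, resource_id, decision, **extra):
    """Build an audit event, dropping sensitive extras, and append it to the stream."""
    event = {
        "actor_id": actor_id,
        "tenant_id": tenant_id,
        "action": action,
        "resource_id": resource_id,
        "decision": decision,
        "timestamp": time.time(),
    }
    for key, value in extra.items():
        if key not in SENSITIVE_FIELDS:
            event[key] = value
    audit_stream.put(json.dumps(event, sort_keys=True))


def guarded_access(actor_id, tenant_id, action, resource_id, is_allowed, fetch):
    """Authorize synchronously, record the decision either way, then fetch."""
    allowed = is_allowed(actor_id, tenant_id, action, resource_id)
    emit_audit(actor_id, tenant_id, action, resource_id,
               "allow" if allowed else "deny")
    if not allowed:
        raise PermissionError("access denied")
    return fetch(resource_id)
```

Recording the decision before the resource is fetched means a denied attempt still leaves a trace, and a crash during the fetch cannot hide an allowed access.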

Common pitfalls

Pitfall: Treating tenant_id as a UI filter instead of a security boundary.

A tempting answer is “we’ll add WHERE tenant_id = tenant.id to queries.” That is necessary but not sufficient; strong candidates explain how tenant context is propagated, how unscoped queries are prevented, how RLS or policy checks backstop application mistakes, and how search, caches, object storage, and background jobs are also scoped.
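One way to make tenant scoping structural rather than a per-query habit is a data-access layer that cannot be called without a typed tenant context. The sketch below uses Python's built-in sqlite3 purely as a stand-in for Postgres; TenantContext, DocumentRepo, and the documents table are hypothetical names, and a database-level backstop such as RLS would still sit underneath in a real deployment.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantContext:
    """Typed carrier for the tenant, built once from the authenticated session."""
    tenant_id: str


class DocumentRepo:
    """Every read path requires a TenantContext, so no unscoped query is exposed."""

    def __init__(self, conn: sqlite3.Connection):
        self._conn = conn

    def list_documents(self, ctx: TenantContext, matter_id: str) -> list:
        if not isinstance(ctx, TenantContext) or not ctx.tenant_id:
            raise ValueError("tenant context is required")
        rows = self._conn.execute(
            "SELECT doc_id FROM documents "
            "WHERE tenant_id = ? AND matter_id = ? ORDER BY doc_id",
            (ctx.tenant_id, matter_id),
        ).fetchall()
        return [row[0] for row in rows]


def make_demo_db() -> sqlite3.Connection:
    """In-memory demo data: two tenants that happen to share a matter_id value."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents ("
        "tenant_id TEXT NOT NULL, matter_id TEXT NOT NULL, doc_id TEXT NOT NULL, "
        "PRIMARY KEY (tenant_id, doc_id))"
    )
    conn.executemany(
        "INSERT INTO documents VALUES (?, ?, ?)",
        [("t1", "m1", "d1"), ("t1", "m1", "d2"), ("t2", "m1", "d3")],
    )
    return conn
```

The same shape doubles as a cross-tenant regression test: seed two tenants with overlapping identifiers and assert that each context only ever sees its own rows.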

Pitfall: Over-indexing on encryption while ignoring authorization.

“Encrypt everything with KMS” sounds secure but does not prevent an authenticated user from reading the wrong matter if authorization is wrong. Encryption protects against storage compromise and supports key lifecycle controls; access-control correctness protects against the most likely application-layer leaks.

Pitfall: Designing only the happy-path API.

Cross-tenant leaks often happen through secondary paths: CSV export, admin impersonation, document preview, search snippets, OCR jobs, embeddings, cached responses, webhook retries, or support tooling. A better answer explicitly walks through at least one asynchronous or derived-data flow and shows where tenant and permission checks are enforced.
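For a derived-data flow such as embedding search, the enforcement point can be shown with a toy retrieval function that filters by tenant and permitted document IDs before ranking. This is a simplified sketch with a brute-force dot-product ranking; the Chunk type and search function are hypothetical, and a real vector store would apply the same filter through its own query API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    tenant_id: str
    matter_id: str
    doc_id: str
    embedding: tuple
    text: str


def dot(a, b) -> float:
    return sum(x * y for x, y in zip(a, b))


def search(chunks, query, tenant_id, allowed_doc_ids, k=3):
    """Pre-filter: only chunks the caller may read are ever scored or returned."""
    candidates = [
        c for c in chunks
        if c.tenant_id == tenant_id and c.doc_id in allowed_doc_ids
    ]
    ranked = sorted(candidates, key=lambda c: dot(c.embedding, query), reverse=True)
    return ranked[:k]
```

Filtering before ranking also avoids the post-filter failure mode where another tenant's chunks occupy the top-k slots and the caller gets fewer results than requested.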

Connections

Interviewers may pivot from this topic into distributed authorization, database schema design, secure background job processing, search/vector retrieval isolation, or observability and incident response. They may also ask how the architecture changes for enterprise requirements like data residency, dedicated infrastructure, customer-managed keys, or legal hold.
