System Design: Multi‑Layer VPC, IP Planning for 2,000 Servers, Fleet Management, Shared Storage, and SSO
Context
Design a production‑grade, multi‑AZ network and platform foundation in a major public cloud that uses VPC constructs (subnets, route tables, security groups). The environment will host a mixed Linux/Windows fleet (~2,000 servers) and multiple services requiring shared file storage and single sign‑on.
Requirements
- Network segmentation and routing
  - Segment networks into public, private (application), and management layers.
  - Define subnets, route tables, internet/NAT gateways, and inter‑tier routing.
  - Enforce security with security groups and (optionally) network ACLs.
- CIDR and IP planning for 2,000 servers
  - Plan VPC and subnet CIDR blocks across at least 3 Availability Zones.
  - Estimate required IPs (include per‑subnet reserved addresses) and future growth.
- Centralized fleet management (Linux and Windows)
  - Propose tools/processes for configuration management, patching, access control, and inventory.
- Shared storage for concurrent access
  - Recommend a solution (e.g., NFS/EFS, SMB/FSx variants) and compare performance, consistency, and cost.
- Single sign‑on (SSO)
  - Design SSO so users authenticate once to access multiple services.
  - Include identity provider choice, trust relationships, and token/assertion flows.
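
As a starting point for the CIDR and IP planning requirement, the arithmetic can be sketched in a few lines of Python using the standard `ipaddress` module. Everything numeric here is an assumption for illustration, not part of the requirements: the `10.0.0.0/16` VPC range, the 2x growth factor, the per-AZ host counts for the public and management tiers, and the figure of 5 reserved addresses per subnet (which matches AWS; other clouds reserve a different number). The helper names `prefix_for_hosts`, `allocate`, and `build_plan` are hypothetical.

```python
import ipaddress

RESERVED_PER_SUBNET = 5  # assumption: matches AWS; adjust for other clouds
AZS = ["a", "b", "c"]    # at least 3 Availability Zones, per the requirements


def prefix_for_hosts(hosts, reserved=RESERVED_PER_SUBNET):
    """Longest IPv4 prefix whose block fits `hosts` plus the reserved addresses."""
    needed = hosts + reserved
    # (needed - 1).bit_length() equals ceil(log2(needed)) for needed >= 1
    return 32 - (needed - 1).bit_length()


def allocate(vpc, requests):
    """Carve non-overlapping subnets out of `vpc`, largest blocks first.

    `requests` is a list of (name, prefix_length) pairs.
    """
    cursor = int(vpc.network_address)
    plan = {}
    for name, prefix in sorted(requests, key=lambda r: r[1]):
        size = 1 << (32 - prefix)
        start = (cursor + size - 1) // size * size  # align up to the block size
        net = ipaddress.ip_network((start, prefix))
        if not net.subnet_of(vpc):
            raise ValueError(f"VPC {vpc} is too small for {name}")
        plan[name] = net
        cursor = start + size
    return plan


def build_plan(servers=2000, growth=2.0, vpc_cidr="10.0.0.0/16"):
    """Size one subnet per tier per AZ and place them inside the VPC."""
    vpc = ipaddress.ip_network(vpc_cidr)
    per_az = -(-int(servers * growth) // len(AZS))  # ceiling division
    # Hypothetical per-AZ host counts; the whole server fleet is assumed
    # to live in the private (application) tier.
    tier_hosts = {"private": per_az, "management": 100, "public": 50}
    requests = [
        (f"{tier}-{az}", prefix_for_hosts(hosts))
        for tier, hosts in tier_hosts.items()
        for az in AZS
    ]
    return allocate(vpc, requests)


if __name__ == "__main__":
    for name, net in build_plan().items():
        usable = net.num_addresses - RESERVED_PER_SUBNET
        print(f"{name:14} {str(net):18} usable={usable}")
```

Under these assumptions, each AZ gets a /21 for the private tier (2,043 usable addresses), a /25 for management, and a /26 for public, and most of the /16 remains free for additional tiers or AZs. Allocating the largest blocks first keeps every subnet aligned on its natural boundary, which avoids fragmenting the address space.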