System Design: Multi-Tenant Calendar at Massive Scale
You are designing a multi-tenant calendar platform used by hundreds of millions of users across organizations. Assume a cloud deployment with multiple regions and mobile/web clients. Design for privacy-preserving sharing across tenants, enterprise admin controls, and interoperability with external calendars.
Core Functional Requirements
- Events: create, update, delete; single and all-day events; locations, attachments, notes.
- Attendees: internal users, external emails, groups; organizer vs. attendee roles; RSVP states.
- Invitations: send, accept/decline/tentative; tracking.
- Reminders/Notifications: email/push/SMS at configured offsets.
- Recurrence: RFC 5545 RRULE support (FREQ, INTERVAL, BYDAY, COUNT, UNTIL), exceptions (EXDATE), overrides (instance-level changes).
- Time zones: store canonical TZ identifiers; DST-safe; local/UTC handling.
- Shared calendars: personal calendars, team calendars; subscription to others; free-busy visibility.
- Access control: owner/editor/reader/free-busy-only; tenant boundaries; audit.
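To make the recurrence requirement concrete, here is a minimal sketch of expanding a weekly rule with exceptions and overrides inside a query window. It is an illustration under stated assumptions, not a full RFC 5545 implementation: `WeeklyRule` and `expand` are hypothetical names, only FREQ=WEEKLY is handled, the week is assumed to start on Monday, and all datetimes are naive wall-clock values in the event's own time zone (conversion to UTC via the stored TZ identifier is assumed to happen separately). COUNT is applied to the instances the rule generates before EXDATE removes any, and the window is matched against each instance's original start.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

WEEKDAYS = {"MO": 0, "TU": 1, "WE": 2, "TH": 3, "FR": 4, "SA": 5, "SU": 6}


@dataclass
class WeeklyRule:
    """Hypothetical, simplified stand-in for an RRULE with FREQ=WEEKLY."""
    interval: int = 1                  # INTERVAL, assumed >= 1
    byday: tuple = ()                  # e.g. ("MO", "WE"); empty means DTSTART's weekday
    count: Optional[int] = None        # COUNT
    until: Optional[datetime] = None   # UNTIL, inclusive, local wall-clock


def expand(dtstart, rule, window_start, window_end, exdates=frozenset(), overrides=None):
    """Return (original_start, effective_start) pairs whose original start
    falls in the half-open window [window_start, window_end)."""
    overrides = overrides or {}
    days = sorted(WEEKDAYS[d] for d in rule.byday) or [dtstart.weekday()]
    # Monday of DTSTART's week, keeping DTSTART's time of day.
    week0 = dtstart - timedelta(days=dtstart.weekday())
    out, generated, week = [], 0, 0
    while True:
        for wd in days:
            occ = week0 + timedelta(weeks=week * rule.interval, days=wd)
            if occ < dtstart:
                continue  # BYDAY slots earlier in DTSTART's week are not instances
            if rule.count is not None and generated >= rule.count:
                return out
            if rule.until is not None and occ > rule.until:
                return out
            if occ >= window_end:
                return out  # later instances cannot fall inside the window
            generated += 1  # counts toward COUNT even if excluded below
            if occ in exdates or occ < window_start:
                continue
            # An override replaces the start of one instance (instance-level change).
            out.append((occ, overrides.get(occ, occ)))
        week += 1


if __name__ == "__main__":
    rule = WeeklyRule(byday=("MO", "WE"), count=6)
    start = datetime(2024, 3, 4, 9, 0)  # a Monday
    for original, effective in expand(
        start, rule, datetime(2024, 3, 1), datetime(2024, 4, 1),
        exdates={datetime(2024, 3, 11, 9, 0)},
        overrides={datetime(2024, 3, 13, 9, 0): datetime(2024, 3, 13, 14, 0)},
    ):
        print(original.isoformat(), "->", effective.isoformat())
```

Keying exceptions and overrides by the original start of the instance is one common design choice: it keeps the identity of an occurrence stable even after its time has been moved.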
Non-Functional Goals
- Low latency: P50 < 100 ms for reads, P95 < 300 ms globally.
- Availability: ≥ 99.99% for critical reads; graceful degradation for search.
- Consistency: strong within a shard for event writes; eventual for search/notifications.
- Cost efficiency: hot vs. cold storage tiers; precomputation windowing; caching.
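As a quick sanity check on the availability goal, the arithmetic below converts an availability target into an allowed-downtime budget. The helper name `downtime_budget_minutes` and the choice of a 365-day year are assumptions made for illustration only.

```python
def downtime_budget_minutes(availability: float, period_days: float = 365.0) -> float:
    """Allowed downtime, in minutes, for a target availability over a period."""
    return (1.0 - availability) * period_days * 24 * 60


if __name__ == "__main__":
    # 99.99% leaves roughly 52.6 minutes per year, or about 4.3 minutes per 30 days.
    print(round(downtime_budget_minutes(0.9999), 2))
    print(round(downtime_budget_minutes(0.9999, period_days=30), 2))
```

A budget this small suggests that critical reads should be servable from more than one region, since a single regional failover can consume a large share of it.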
Deliverables
- APIs: REST/gRPC for CRUD on calendars/events, list in time range, invite/RSVP, search (text/attendees/time), manage shares. Include idempotency, pagination, filtering, rate limiting.
- Data model: Relational schemas (Users, Calendars, CalendarMembers, Events, Attendees, RecurrenceRules, EventExceptions/Overrides, Reminders). Keys, indexes, constraints. Representation of recurrence and time zones. Soft deletes and audit.
- SQL:
  - List visible events for a user between [start, end].
  - Detect time conflicts for a candidate event.
  - Materialize a single occurrence from RRULE + exceptions.
  - Search by attendee and keyword with proper indexes.
- Scalability/storage: Partitioning/sharding, secondary indexes, caching (read-through for instances), search indexing, background expansion vs. on-the-fly.
- Consistency/concurrency: concurrent edits, idempotent retries, outbox pattern, eventual consistency for search/notifications, DR/backups, GDPR/retention.
- Notifications/integrations: reminder delivery, ICS import/export, external calendar sync, abuse prevention and quotas.
- Database choices: When to use relational vs. document vs. time-series; justify primary store and any polyglot components.
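For the conflict-detection query in the SQL deliverable, one possible shape is sketched below using SQLite from Python so that it runs as-is. Everything here is an assumption for illustration: the table and column names (`events`, `attendees`, `start_utc`, `end_utc`, `deleted_at`, `rsvp`), storing times as UTC epoch seconds, treating events as half-open intervals [start, end), and ignoring declined or soft-deleted events. Recurring events are assumed to be already materialized into instance rows.

```python
import sqlite3

SCHEMA = """
CREATE TABLE events (
    id          INTEGER PRIMARY KEY,
    calendar_id INTEGER NOT NULL,
    title       TEXT    NOT NULL,
    start_utc   INTEGER NOT NULL,                     -- epoch seconds, UTC
    end_utc     INTEGER NOT NULL CHECK (end_utc > start_utc),
    deleted_at  INTEGER                               -- soft delete marker
);
CREATE TABLE attendees (
    event_id INTEGER NOT NULL REFERENCES events(id),
    user_id  INTEGER NOT NULL,
    rsvp     TEXT    NOT NULL DEFAULT 'needs_action',
    PRIMARY KEY (event_id, user_id)
);
CREATE INDEX idx_attendees_user ON attendees(user_id, event_id);
CREATE INDEX idx_events_time ON events(start_utc, end_utc);
"""

# Two half-open intervals overlap iff each starts before the other ends.
CONFLICTS = """
SELECT e.id, e.title
FROM events e
JOIN attendees a ON a.event_id = e.id
WHERE a.user_id = :user_id
  AND a.rsvp != 'declined'
  AND e.deleted_at IS NULL
  AND e.start_utc < :cand_end
  AND e.end_utc   > :cand_start
ORDER BY e.start_utc
"""


def find_conflicts(conn, user_id, cand_start, cand_end):
    """Return (id, title) rows for events that overlap the candidate interval."""
    params = {"user_id": user_id, "cand_start": cand_start, "cand_end": cand_end}
    return conn.execute(CONFLICTS, params).fetchall()


def demo_db():
    """Build a small in-memory database with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?, ?, ?)", [
        (1, 10, "Standup",        1000, 2000, None),
        (2, 10, "Design review",  2000, 3000, None),
        (3, 10, "Cancelled sync", 1500, 2500, 1234),  # soft-deleted
        (4, 10, "Optional talk",  1500, 2500, None),  # declined below
    ])
    conn.executemany("INSERT INTO attendees VALUES (?, ?, ?)", [
        (1, 7, "accepted"), (2, 7, "tentative"),
        (3, 7, "accepted"), (4, 7, "declined"),
    ])
    return conn


if __name__ == "__main__":
    conn = demo_db()
    print(find_conflicts(conn, 7, 1500, 2000))  # back-to-back with event 2 is not a conflict
    print(find_conflicts(conn, 7, 1900, 2100))
```

The half-open convention is a deliberate choice in this sketch: a meeting that ends at the moment another begins does not count as a conflict, which matches how back-to-back meetings are usually scheduled.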