PracHub

Design Slack-like multi-tenant global messaging system

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a candidate's ability to design large-scale, multi-tenant real-time messaging systems, testing competencies in distributed systems, data modeling, multi-tenancy, replication, latency optimization, and operational isolation.


Company: OpenAI

Role: Software Engineer

Category: System Design

Difficulty: medium

Interview Round: Onsite


Dec 1, 2025, 12:00 AM

Design a team messaging platform similar to Slack that supports multiple organizations (multi-tenancy) and is deployed globally.

Functional requirements

  • Users can belong to one or more workspaces (tenants/organizations).
  • Each workspace has multiple channels (public and private) and direct messages (DMs).
  • Users can:
    • Send and receive real-time text messages in channels and DMs.
    • See message history in channels and DMs.
    • See basic presence (online/away) for other users in the same workspace.
  • Messages must be delivered with low latency (e.g., p95 < 200 ms) for active users.
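
One way to make the latency requirement checkable is to compute the percentile from measured delivery times. The sketch below uses the nearest-rank definition of a percentile; the function names and the 200 ms default are illustrative assumptions taken from the example figure above, not part of the question.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least
    pct percent of all samples at or below it."""
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def meets_latency_target(samples_ms, target_ms=200.0, pct=95):
    """True if the pct-th percentile delivery latency is under target_ms.
    (Hypothetical helper; the target mirrors the 'p95 < 200 ms' example.)"""
    return percentile(samples_ms, pct) < target_ms
```

Note that a p95 target tolerates a slow tail: up to 5% of deliveries can exceed the target without violating it.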

Non-functional & multi-tenant requirements

  • The service must support millions of users across tens of thousands of workspaces.
  • Multi-tenancy:
    • Strict data isolation between workspaces: users in one workspace must never see data from another workspace.
    • Different workspaces can have different configurations and limits (e.g., message retention, file size limits).
    • The system should defend against noisy neighbors (one tenant over-consuming shared resources).
  • Global deployment:
    • Users are geographically distributed (e.g., Americas, Europe, Asia).
    • Users should connect to a nearby region for good latency.
    • Many large organizations have employees in multiple regions in the same workspace.
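
The noisy-neighbor and per-workspace-limits requirements above are commonly approached with a per-tenant token bucket. The following is a minimal in-memory sketch under that assumption; `TenantLimits`, `TenantRateLimiter`, and their fields are hypothetical names, and a real deployment would likely need shared state across servers.

```python
import time
from dataclasses import dataclass


@dataclass
class TenantLimits:
    rate_per_sec: float  # sustained requests per second (hypothetical setting)
    burst: float         # maximum tokens a bucket can hold


class TenantRateLimiter:
    """One token bucket per tenant, so a busy tenant drains only its own budget."""

    def __init__(self, default_limits, clock=time.monotonic):
        self._default = default_limits
        self._overrides = {}  # tenant_id -> TenantLimits
        self._buckets = {}    # tenant_id -> (tokens, last_refill_time)
        self._clock = clock

    def set_limits(self, tenant_id, limits):
        """Give one workspace its own configuration (e.g., a large tenant)."""
        self._overrides[tenant_id] = limits

    def allow(self, tenant_id, cost=1.0):
        limits = self._overrides.get(tenant_id, self._default)
        now = self._clock()
        # A tenant seen for the first time starts with a full bucket.
        tokens, last = self._buckets.get(tenant_id, (limits.burst, now))
        # Refill in proportion to elapsed time, capped at the burst size.
        tokens = min(limits.burst, tokens + (now - last) * limits.rate_per_sec)
        allowed = tokens >= cost
        if allowed:
            tokens -= cost
        self._buckets[tenant_id] = (tokens, now)
        return allowed
```

Because each bucket is keyed by tenant, exhausting one workspace's budget leaves every other workspace's budget untouched, and `set_limits` models the per-workspace configuration requirement.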

Design tasks

Describe a design that covers at least the following aspects:

  1. API and high-level architecture
    • Key services (e.g., gateway/API layer, auth, workspace/channel management, messaging, presence, search, notification).
    • How clients (web/desktop/mobile) connect to the system for real-time messaging (e.g., WebSockets, long polling).
  2. Data model and storage
    • Core entities: Workspace (Tenant), User, Membership, Channel, Message.
    • What storage technologies you would use for:
      • Metadata (users, workspaces, channels, memberships).
      • Messages and their history.
    • How you would partition/shard data to scale to many tenants and users.
  3. Multi-tenant architecture
    • How you will represent tenant boundaries in the data model and APIs (e.g., tenant_id everywhere).
    • Options for physically storing tenant data: fully shared DB with a tenant_id column, separate DB per tenant, or a hybrid; discuss pros/cons.
    • How you enforce security and isolation across all layers (auth, services, storage).
    • Handling noisy neighbors (rate limiting, quotas, priority or dedicated resources for large tenants).
  4. Global deployment and replication
    • How you would deploy the system into multiple regions.
    • How users get routed to the closest region (e.g., DNS, anycast, global load balancers).
    • How data for a single global workspace is handled when users are in multiple regions:
      • Where is the source of truth for messages of a workspace?
      • How are messages replicated across regions (e.g., asynchronous replication, regional caches)?
      • What consistency guarantees do you provide (e.g., eventual consistency across regions vs strong consistency within a region)?
    • Strategies for regional failover and disaster recovery.
  5. Scalability and performance
    • How you would scale:
      • WebSocket / real-time connections.
      • Message fan-out to many subscribers in a busy channel.
      • Message storage and retrieval.
    • Caching strategies and indexing for recent history vs deep history.
  6. Other considerations (at a high level)
    • Search and message indexing across channels in a workspace.
    • File attachments (storage and access controls) if you have time.
    • Security (encryption in transit/at rest, per-tenant encryption keys, audit logging).
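
As a starting point for the data-model and multi-tenancy tasks, the sketch below shows one possible reading of "tenant_id everywhere": every message carries its tenant, the shard is chosen from a stable hash of (tenant_id, channel_id), and every read must name the tenant. The `Message` fields, `shard_for`, and `MessageStore` are illustrative assumptions, and the in-memory dicts stand in for real databases.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    tenant_id: str   # workspace that owns the message
    channel_id: str
    message_id: int
    sender_id: str
    text: str


def shard_for(tenant_id, channel_id, num_shards):
    """Stable shard choice: the same channel always maps to the same shard."""
    key = f"{tenant_id}:{channel_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class MessageStore:
    """In-memory stand-in for a sharded message store (assumption: a shared
    store where tenant_id is part of every key)."""

    def __init__(self, num_shards):
        self._shards = [dict() for _ in range(num_shards)]

    def _shard(self, tenant_id, channel_id):
        return self._shards[shard_for(tenant_id, channel_id, len(self._shards))]

    def append(self, msg):
        shard = self._shard(msg.tenant_id, msg.channel_id)
        shard.setdefault((msg.tenant_id, msg.channel_id), []).append(msg)

    def history(self, tenant_id, channel_id, limit=50):
        """Reads are always scoped by tenant_id; there is no unscoped query."""
        shard = self._shard(tenant_id, channel_id)
        return shard.get((tenant_id, channel_id), [])[-limit:]
```

Sharding by (tenant_id, channel_id) rather than tenant_id alone spreads a very large workspace over several shards, at the cost of making whole-workspace queries touch more than one shard. That is one of the trade-offs the question asks candidates to discuss.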

Explain the trade-offs you are making (e.g., consistency vs availability, shared vs isolated tenant storage) and justify your choices in terms of reliability, cost, and operational complexity.
