What difficulty level is this interview question?

This is a medium difficulty System Design question, commonly asked during Technical Screen rounds at OpenAI.

What role is this question designed for?

This question is commonly asked for Software Engineer candidates at OpenAI during technical interviews.

Design a Slack-Like Messaging System | OpenAI Interview Question

Q: Design a Slack-Like Messaging System

This System Design question evaluates a candidate's ability to architect a scalable, durable real-time messaging system, testing skills in distributed systems, data modeling, API design, persistence versus low-latency delivery, ordering guarantees, missed-message recovery, notifications, and operational scaling.

Design a Slack-like team messaging system focused on sending and receiving messages in real time. Your design should support workspaces with channels and direct messages, deliver new messages to online clients with low latency, and let a client that was disconnected (or freshly launched) recover everything it missed without dropping or duplicating messages.

Walk through the core data model, the send/receive APIs, the real-time delivery path, missed-message recovery, per-conversation ordering, notifications, cold start, and how the design scales. Be explicit about where durability lives versus where the real-time path is just an optimization.

Constraints & Assumptions

Scale target: on the order of 10M daily active users, 10M+ workspaces, and large public channels with up to ~100K members. Peak send traffic on the order of 100K messages/sec across the fleet.
Latency: new messages should reach an online recipient in well under 1 second end to end.
Durability: an accepted message must never be lost. Once the server acknowledges a send, the message is recoverable even if every WebSocket drops.
Ordering: ordering is required within a single conversation (channel or DM). Global ordering across the whole product is not required.
Clients: web, desktop, and mobile; a single user may be connected from several devices at once, and devices go offline frequently. Assume traffic is read-heavy: history loads and scrollback dominate over sends.
Out of scope: voice/video calls, file storage internals, search ranking, message editing/threading (mention them only as extensions).
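
To make the scale targets concrete, a rough back-of-envelope calculation can help. The sketch below uses only the stated peak send rate and largest channel size; the average message size and the share of members online at once are assumptions chosen for illustration, not figures from the problem statement.

```python
# Back-of-envelope sizing from the stated targets.
# Values marked ASSUMED are illustrative guesses, not given in the problem.
PEAK_SENDS_PER_SEC = 100_000   # stated: peak messages/sec across the fleet
LARGEST_CHANNEL = 100_000      # stated: members in the largest public channels
AVG_MESSAGE_BYTES = 1_000      # ASSUMED: message body plus metadata
ONLINE_PERCENT = 20            # ASSUMED: share of channel members connected at once


def peak_write_mb_per_sec() -> float:
    """Ingest bandwidth at peak send rate, in MB/s."""
    return PEAK_SENDS_PER_SEC * AVG_MESSAGE_BYTES / 1_000_000


def big_channel_pushes_per_message() -> int:
    """Live deliveries triggered by one message in the largest channel."""
    return LARGEST_CHANNEL * ONLINE_PERCENT // 100


def storage_upper_bound_tb_per_day() -> float:
    """Daily storage if the peak rate were sustained all day (an upper bound)."""
    return PEAK_SENDS_PER_SEC * 86_400 * AVG_MESSAGE_BYTES / 1e12


if __name__ == "__main__":
    print(peak_write_mb_per_sec(), "MB/s written at peak")
    print(big_channel_pushes_per_message(), "pushes per message in the largest channel")
    print(storage_upper_bound_tb_per_day(), "TB/day upper bound")
```

Under these assumptions the write bandwidth is modest, while a single message in a very large channel fans out to tens of thousands of connections, which suggests that fanout rather than ingest is the dimension to watch.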

The Problem

Produce a complete design that addresses the following. Treat each as a dimension the interviewer will probe, and make the durability-versus-delivery boundary explicit throughout.

Core entities — users, workspaces, channels, direct messages, memberships, and messages, plus the keys/IDs that tie them together.
Send & receive APIs — the write path for sending a message and the read paths for live delivery and history.
Real-time delivery — persistent connections (e.g. WebSockets), a gateway that tracks connections, and how a new message reaches connected members.
Missed-message recovery — how a client that was disconnected catches up on reconnect without gaps or duplicates.
Ordering guarantees — what ordering you promise and the mechanism that enforces it.
Notifications — when to push vs. suppress, and the latency/correctness/UX tradeoffs.
Cold start — what happens when a user opens the app after being offline (or on a new device) so it loads fast without downloading everything.
Storage, queues, fanout, and scaling — data stores, the event bus, fanout strategy, and the bottlenecks at the scale above.
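
One possible shape for the message entity and the send path is sketched below. It assumes messages are partitioned by conversation and ordered by a per-conversation sequence number, and that the server writes durably before attempting any live push. `MessageStore`, `send_message`, and `client_msg_id` are hypothetical names for illustration, and the in-memory dictionaries stand in for a real durable store.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Message:
    conversation_id: str  # partition key: a channel or a DM
    seq: int              # ordering key: per-conversation, starts at 1
    sender_id: str
    client_msg_id: str    # idempotency key chosen by the sending client
    body: str


@dataclass
class MessageStore:
    """In-memory stand-in for a durable log partitioned by conversation_id."""
    logs: Dict[str, List[Message]] = field(default_factory=dict)
    seen: Dict[Tuple[str, str, str], Message] = field(default_factory=dict)

    def append(self, conversation_id: str, sender_id: str,
               client_msg_id: str, body: str) -> Message:
        key = (conversation_id, sender_id, client_msg_id)
        if key in self.seen:
            # A retried send returns the message that was already accepted.
            return self.seen[key]
        log = self.logs.setdefault(conversation_id, [])
        msg = Message(conversation_id, len(log) + 1, sender_id, client_msg_id, body)
        log.append(msg)
        self.seen[key] = msg
        return msg

    def read_after(self, conversation_id: str, after_seq: int,
                   limit: int = 100) -> List[Message]:
        """History read path: messages with seq > after_seq, oldest first."""
        log = self.logs.get(conversation_id, [])
        return log[after_seq:after_seq + limit]


def send_message(store: MessageStore, fanout: Callable[[Message], None],
                 conversation_id: str, sender_id: str,
                 client_msg_id: str, body: str) -> Message:
    # 1. Durable write first: this is the point at which the send is accepted.
    msg = store.append(conversation_id, sender_id, client_msg_id, body)
    # 2. Best-effort live push: a failure here never un-accepts the message,
    #    because recipients can still recover it through read_after.
    try:
        fanout(msg)
    except Exception:
        pass
    return msg
```

In this sketch a retried send may be pushed twice, so recipients are expected to drop any pushed message whose `seq` they have already applied.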

Clarifying Questions to Ask

What ordering semantics are required — strict per-conversation order, or is best-effort with client-side reordering acceptable?
What is the read/write ratio, and how large can a single channel get (10s vs. 100K members)?
Do we need exactly-once delivery end to end, or is at-least-once delivery acceptable as long as the client dedups so that each message is displayed exactly once?
How many simultaneous devices per user, and must all devices stay consistent (read cursors, unread counts)?
What are the latency and durability SLAs, and is any cross-region/geo-replication requirement in scope?
Are threads, edits, reactions, and presence in scope now, or future extensions?

What a Strong Answer Covers

A clear data model with a justified partition key and an ordering key, and reasoning for why those choices fit the access patterns.
An explicit, defended position on the relationship between durability and delivery — which one must happen first, and why that ordering is load-bearing.
A send path and a separate live-delivery path, with a connection-tracking tier that knows where each user/device is connected.
A correct, gap-free missed-message recovery scheme, including how the client efficiently discovers which of many conversations changed before pulling each one.
A concrete ordering mechanism, plus an honest discussion of the bottleneck or failure mode that mechanism introduces and how to mitigate it.
A fanout strategy that distinguishes small conversations from very large channels, and reasons about cost as a function of membership and connectivity.
Notifications modeled as a separate, user-state-aware concern, with stated latency / correctness / UX tradeoffs.
A cold-start flow that prioritizes what the user sees first and defers the rest, rather than downloading everything.
Failure handling: how duplicates are suppressed, how gaps are detected and repaired, how retries stay idempotent, and what happens when each component fails.
A sense of scale: where the bottlenecks are and how the chosen partitioning/caching addresses them.
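
The recovery and failure-handling points above can be illustrated with a small client-side sketch. It assumes each conversation has gap-free sequence numbers starting at 1 and a history endpoint that returns messages after a given sequence number. `ConversationCursor`, `changed_conversations`, and `fetch_after` are hypothetical names, not part of any real client library.

```python
from typing import Callable, Dict, List, Tuple

Event = Tuple[int, str]  # (seq, body)


def changed_conversations(cursors: Dict[str, int],
                          latest: Dict[str, int]) -> List[str]:
    """Conversations whose server-side latest seq is ahead of the client cursor."""
    return sorted(c for c, seq in latest.items() if seq > cursors.get(c, 0))


class ConversationCursor:
    """Tracks the last applied seq for one conversation on one device."""

    def __init__(self, fetch_after: Callable[[int], List[Event]]):
        self.last_seq = 0
        self.displayed: List[Event] = []
        self._fetch_after = fetch_after  # history read: events with seq > argument

    def on_push(self, seq: int, body: str) -> None:
        if seq <= self.last_seq:
            return                      # duplicate push: already applied
        if seq == self.last_seq + 1:
            self._apply(seq, body)      # the expected next message
            return
        self.catch_up()                 # gap detected: repair from durable history

    def catch_up(self) -> None:
        """Page through history until no further progress is made."""
        while True:
            before = self.last_seq
            for seq, body in self._fetch_after(self.last_seq):
                if seq == self.last_seq + 1:
                    self._apply(seq, body)
            if self.last_seq == before:
                return

    def _apply(self, seq: int, body: str) -> None:
        self.displayed.append((seq, body))
        self.last_seq = seq
```

On reconnect, a client would first call something like `changed_conversations` with a compact map of latest sequence numbers, then run `catch_up` only for the conversations it returns. A pushed message that arrives out of sequence is treated as a hint to re-read history rather than as data to display directly.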

Follow-up Questions

How do you keep multiple devices for the same user consistent on read cursors and unread counts?
How would you add message edits and deletes while preserving the ordering and recovery model based on sequence numbers?
How do you guarantee a client never permanently misses a message if it disconnects in the middle of a fanout — and how does it detect a sequence gap?
How would you extend this to support threaded replies, or reactions, without breaking per-conversation ordering?
How would you handle a single extremely hot channel (e.g. a company-wide announcement channel) where the single-sequencer approach becomes a bottleneck?
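
For the edits-and-deletes follow-up, one common direction is to treat each edit or delete as its own sequenced event in the conversation log, so that the same "fetch everything after my cursor" recovery also picks up changes to older messages. The sketch below is a minimal illustration of that idea under stated assumptions; the `Event` shape and the `fold` function are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Event:
    seq: int                    # per-conversation sequence number of this event
    kind: str                   # "post", "edit", or "delete"
    target_seq: int             # for "post" equals seq; otherwise the post being changed
    body: Optional[str] = None  # new text for "post" and "edit"


def fold(events: List[Event]) -> List[Tuple[int, str]]:
    """Replay events in seq order into the visible messages, keyed by original seq."""
    view: Dict[int, str] = {}
    for ev in sorted(events, key=lambda e: e.seq):
        if ev.kind == "post":
            view[ev.seq] = ev.body or ""
        elif ev.kind == "edit" and ev.target_seq in view:
            view[ev.target_seq] = ev.body or ""
        elif ev.kind == "delete":
            view.pop(ev.target_seq, None)
    return sorted(view.items())
```

Because an edit keeps the original post's position and only consumes a new sequence number for the change itself, display order stays stable while a reconnecting client still learns about the edit through normal catch-up.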
