Slack-Like Messaging Systems

What's being tested

Interviewers are probing whether you can design a real-time multi-tenant messaging system with clear data models, APIs, delivery semantics, storage strategy, and failure handling. A strong answer balances low-latency fanout, durable message history, permissions, search, notifications, and operational concerns without overbuilding every subsystem. OpenAI cares because many products involve collaborative, streaming, user-facing systems where correctness, privacy, latency, and graceful degradation all matter. The interviewer is not looking for “use WebSockets and Kafka” as a slogan; they want to see how you reason through tradeoffs like online vs offline delivery, channel fanout, ordering, tenant isolation, and backpressure.

Core knowledge

Core entities usually include User, Workspace, Channel, Membership, Message, Thread, Reaction, Attachment, and ReadReceipt. Model workspace-scoped IDs and permissions explicitly; multi-tenant systems fail when access checks are treated as an afterthought instead of part of every read/write path.
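
One way to make the workspace scoping concrete is sketched below. The field names and the `can_access` and `read_message` helpers are illustrative assumptions, not a prescribed schema; the point is only that the workspace ID travels with every record and the membership check sits in front of the read.

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class Membership:
    workspace_id: str
    channel_id: str
    user_id: str


@dataclass(frozen=True)
class Message:
    workspace_id: str
    channel_id: str
    message_id: int
    sender_id: str
    body: str


def can_access(memberships: Set[Membership], user_id: str,
               workspace_id: str, channel_id: str) -> bool:
    # Access is decided on the (workspace, channel, user) triple,
    # so a channel ID alone is never enough to read across tenants.
    return Membership(workspace_id, channel_id, user_id) in memberships


def read_message(memberships: Set[Membership], user_id: str, message: Message) -> str:
    # The check runs on the read path, not only when the message is written.
    if not can_access(memberships, user_id, message.workspace_id, message.channel_id):
        raise PermissionError("user is not a member of this channel")
    return message.body
```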
API design should separate durable writes from real-time delivery. Typical endpoints: POST /messages, GET /channels/{id}/messages?before=..., POST /channels/{id}/join, and a persistent connection endpoint like /realtime. POST /messages should return after persistence, not after every recipient receives the message.
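
The `before=` cursor on the history endpoint can be sketched as follows. The `page_before` helper and its in-memory list are hypothetical stand-ins for a range query against the message store.

```python
from typing import List, Optional, Tuple


def page_before(message_ids: List[int], before: Optional[int],
                limit: int) -> Tuple[List[int], Optional[int]]:
    """Return (page, next_cursor) for IDs sorted in ascending order.

    The page is newest-first. next_cursor is the value to pass as
    `before` on the next call, or None when no older messages remain.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    eligible = [m for m in message_ids if before is None or m < before]
    page = eligible[-limit:][::-1]
    has_more = len(eligible) > len(page)
    next_cursor = page[-1] if has_more else None
    return page, next_cursor
```

Cursor-based paging stays stable while new messages arrive, which offset-based paging does not.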
Persistent connections are commonly implemented with WebSockets, Server-Sent Events, or long polling. WebSocket supports bidirectional events for typing indicators and presence; SSE is simpler for server-to-client streams. For mobile and unreliable networks, clients need reconnect tokens, heartbeats, and “resume from sequence number.”
Message durability belongs in a primary store such as DynamoDB, Cassandra, ScyllaDB, MySQL, or Postgres, depending on scale. A common schema partitions by channel_id and sorts by message_ts or a monotonically increasing message_id. Hot channels can overload a single partition, so consider bucketed partitions like (channel_id, day) or (channel_id, shard), at the cost of reads that may span several buckets.
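
The two bucketing schemes can be sketched as key-derivation functions. The function names, the millisecond timestamp, and the modulo sharding rule are assumptions for illustration; a real store would define its own key layout.

```python
from datetime import datetime, timezone
from typing import Tuple


def day_bucket_key(channel_id: str, ts_ms: int) -> Tuple[str, str]:
    # (channel_id, day): bounds partition growth over time.
    day = datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc).strftime("%Y-%m-%d")
    return (channel_id, day)


def shard_bucket_key(channel_id: str, message_id: int, num_shards: int) -> Tuple[str, int]:
    # (channel_id, shard): spreads concurrent writes for one hot channel.
    return (channel_id, message_id % num_shards)
```

Day buckets keep recent history in one partition, so they help with partition size more than with write hotspots. Shard buckets spread writes but require history reads to merge results from every shard.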
Ordering semantics should be stated precisely. Global total ordering is expensive and usually unnecessary; Slack-like systems commonly need per-channel ordering. Use server-assigned Snowflake-style IDs or ULIDs when roughly time-ordered identifiers are enough, or a per-channel sequencer when clients need gap-free sequence numbers to detect missed messages. If clients send messages concurrently, show optimistic local rendering but reconcile against server order.
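
A per-channel sequencer can be sketched in a few lines. This version is in-process only and is an assumption for illustration; a real deployment would typically back it with an atomic counter in the database or route each channel to a single writer.

```python
import threading
from collections import defaultdict


class ChannelSequencer:
    """Hands out gap-free, strictly increasing sequence numbers per channel."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._last = defaultdict(int)  # channel_id -> last issued seq

    def next_seq(self, channel_id: str) -> int:
        # The lock makes increment-and-read atomic across threads.
        with self._lock:
            self._last[channel_id] += 1
            return self._last[channel_id]
```

Each channel counts independently, so ordering is per channel and no global coordination is implied.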
Fanout strategy depends on channel size. For small channels, fanout-on-write pushes an event to each online member’s connection server and notification pipeline. For very large channels, fanout-on-read or hybrid fanout avoids writing millions of inbox rows. A rough rule of thumb: direct messages and small groups fan out eagerly, while channels with 100k+ members need pull-based consumption and pagination. The exact cutoff depends on the workload and should be tuned rather than assumed.
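
The size-based choice can be sketched as a small policy function. The 100,000 cutoff is the illustrative figure from the text, and `choose_fanout` and `per_message_inbox_writes` are hypothetical names.

```python
from enum import Enum


class Fanout(Enum):
    PUSH = "fanout_on_write"
    PULL = "fanout_on_read"


LARGE_CHANNEL_MEMBERS = 100_000  # illustrative threshold, tune per workload


def choose_fanout(member_count: int, threshold: int = LARGE_CHANNEL_MEMBERS) -> Fanout:
    # Small channels push eagerly; very large ones are read on demand.
    return Fanout.PULL if member_count >= threshold else Fanout.PUSH


def per_message_inbox_writes(member_count: int,
                             threshold: int = LARGE_CHANNEL_MEMBERS) -> int:
    # Write amplification of one message: one inbox row per member under
    # push, none under pull (readers page through channel history instead).
    return member_count if choose_fanout(member_count, threshold) is Fanout.PUSH else 0
```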
Real-time delivery architecture often uses connection gateways plus an internal event bus. A request service persists the message, publishes MessageCreated(channel_id, seq) to Kafka, Pulsar, Redis Streams, or NATS, and gateway servers subscribed to relevant channels deliver to connected clients. Gateways should be stateless except for ephemeral connection mappings.
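
The gateway's role can be sketched with an in-memory stand-in. The `Gateway` class, its `outbox` lists (in place of real sockets), and the event dictionary shape are assumptions for illustration; the only state held is the ephemeral mapping from channels to live connections.

```python
from collections import defaultdict
from typing import Dict, List, Set


class Gateway:
    def __init__(self) -> None:
        self._subs: Dict[str, Set[str]] = defaultdict(set)    # channel_id -> connection IDs
        self.outbox: Dict[str, List[dict]] = defaultdict(list)  # connection ID -> sent events

    def subscribe(self, conn_id: str, channel_id: str) -> None:
        self._subs[channel_id].add(conn_id)

    def disconnect(self, conn_id: str) -> None:
        # Connection state is ephemeral: dropping it loses nothing durable.
        for conns in self._subs.values():
            conns.discard(conn_id)
        self.outbox.pop(conn_id, None)

    def on_bus_event(self, event: dict) -> int:
        # Deliver a bus event to every local connection subscribed to its
        # channel and return how many connections received it.
        targets = self._subs.get(event["channel_id"], set())
        for conn_id in targets:
            self.outbox[conn_id].append(event)
        return len(targets)
```

Because the message is already persisted before the event is published, a gateway crash costs only a reconnect and a history fetch.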
Delivery guarantees should be practical: usually at-least-once delivery with client-side de-duplication by message_id. Exactly-once end-to-end is rarely worth claiming. Clients should maintain last_seen_seq per channel and call a history API to fill gaps after reconnect or missed events.
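
Client-side de-duplication and gap filling can be sketched as below. The `ChannelClientState` class and the `fetch_history(after_seq, up_to_seq)` callback are hypothetical, and the sketch assumes per-channel sequence numbers are gap-free and start at 1.

```python
from typing import Callable, List, Set, Tuple

# fetch_history(after_seq, up_to_seq) returns (message_id, seq) pairs in order.
HistoryFetcher = Callable[[int, int], List[Tuple[str, int]]]


class ChannelClientState:
    def __init__(self) -> None:
        self.last_seen_seq = 0
        self.seen_ids: Set[str] = set()
        self.rendered: List[str] = []

    def on_event(self, message_id: str, seq: int, fetch_history: HistoryFetcher) -> None:
        # A jump in seq means events were missed: backfill from history first.
        if seq > self.last_seen_seq + 1:
            for missed_id, missed_seq in fetch_history(self.last_seen_seq, seq - 1):
                self._apply(missed_id, missed_seq)
        self._apply(message_id, seq)

    def _apply(self, message_id: str, seq: int) -> None:
        # At-least-once delivery means duplicates are expected; drop them.
        if message_id in self.seen_ids:
            return
        self.seen_ids.add(message_id)
        self.rendered.append(message_id)
        self.last_seen_seq = max(self.last_seen_seq, seq)
```

A long-lived client would bound `seen_ids`, for example by keeping only IDs near `last_seen_seq`.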
Presence and typing indicators are ephemeral, not durable messages. Store presence in Redis with TTLs and heartbeat updates, e.g., presence:user_id -> online until t. Avoid writing every typing event to durable storage; throttle events and treat them as best-effort to reduce load.
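
The TTL-plus-heartbeat pattern can be sketched in memory. The `PresenceStore` class and the injectable clock are assumptions for illustration; with Redis, the same effect is usually achieved by setting a key with an expiry on each heartbeat.

```python
import time
from typing import Callable, Dict


class PresenceStore:
    def __init__(self, ttl_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._expires_at: Dict[str, float] = {}  # user_id -> expiry time

    def heartbeat(self, user_id: str) -> None:
        # Each heartbeat pushes the expiry forward by one TTL.
        self._expires_at[user_id] = self._clock() + self._ttl

    def is_online(self, user_id: str) -> bool:
        expires = self._expires_at.get(user_id)
        if expires is None:
            return False
        if self._clock() >= expires:
            # No heartbeat within the TTL: treat the user as offline.
            del self._expires_at[user_id]
            return False
        return True
```

A user who disconnects without saying goodbye simply ages out, so no explicit offline write is needed.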
Read receipts and unread counts can be modeled as last_read_message_id per (user_id, channel_id). Unread count can be computed as messages after the marker for small channels, but at scale you may maintain counters or approximate badges. Be careful with edits, deletes, hidden messages, and per-user visibility.
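
For small channels, the count after the marker can be sketched with a binary search. The `unread_count` helper is hypothetical and assumes the caller passes only the IDs visible to this user, sorted ascending, with deleted and hidden messages already excluded.

```python
from bisect import bisect_right
from typing import List


def unread_count(sorted_visible_ids: List[int], last_read_message_id: int) -> int:
    # Everything strictly after the read marker counts as unread. The marker
    # does not need to exist in the list, so a deleted marker message still works.
    return len(sorted_visible_ids) - bisect_right(sorted_visible_ids, last_read_message_id)
```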
Search indexing is a separate read path. Persist messages first, then asynchronously index into Elasticsearch, OpenSearch, or a dedicated search service. Search documents should include workspace, channel, sender, timestamp, permissions metadata, and tokenized content; results must be filtered by current membership and retention policy.
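
The post-query filter can be sketched as follows. `SearchHit` and `filter_hits` are hypothetical names, and a real system would usually also push these constraints into the search query itself rather than rely on post-filtering alone.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class SearchHit:
    workspace_id: str
    channel_id: str
    message_id: int
    ts: float  # epoch seconds


def filter_hits(hits: List[SearchHit], workspace_id: str,
                member_channels: Set[str], retention_cutoff_ts: float) -> List[SearchHit]:
    # Membership is evaluated at query time, so a user who has left a
    # channel stops seeing its messages even if the index is stale.
    return [
        h for h in hits
        if h.workspace_id == workspace_id
        and h.channel_id in member_channels
        and h.ts >= retention_cutoff_ts
    ]
```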
Security and compliance include authentication, authorization, tenant isolation, audit logs, encryption, retention, and deletion. Use workspace-scoped authorization checks on every message fetch and publish path. Encrypt in transit with TLS; encrypt at rest with managed keys, and discuss enterprise features like legal holds only at a high level unless prompted.

Worked example

For “Design a Slack-like messaging platform”, start by clarifying scope: “Are we designing team chat with workspaces, channels, DMs, message history, search, notifications, and presence? What scale should I assume: 10M daily users, 100k messages/sec peak, and p99 send-to-display under 500ms for online users?” Then declare your assumptions: per-channel ordering is required, offline users can catch up via history, and message persistence is the source of truth. Organize the answer around four pillars: data model, write/read APIs, real-time delivery, and storage/indexing/notifications. For the write path, say the client calls POST /messages, the message service validates membership, assigns message_id and channel sequence, writes to the message store, then publishes an event to an internal bus. For the read path, online clients receive events over WebSocket, while reconnecting clients use GET /channels/{id}/messages?after_seq=... to fill gaps. For storage, use a channel-partitioned message table, but call out hot partitions for giant channels and propose bucketing or hybrid fanout. A concrete tradeoff to flag: fanout-on-write gives lower latency for small groups but explodes for large public channels, so use a hybrid strategy based on member count and online subscriber count. Close by saying: “If I had more time, I’d drill into search indexing, retention/deletion semantics, notification ranking, and operational metrics like p99 delivery latency, reconnect gap rate, and message send error rate.”
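
The write and catch-up paths described above can be sketched end to end with in-memory stand-ins. The `MessageService` class, the dictionary used as a message store, and the list used as an event bus are all assumptions for illustration, not a reference implementation.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


class MessageService:
    def __init__(self, memberships: Set[Tuple[str, str]]) -> None:
        self._memberships = memberships          # (user_id, channel_id) pairs
        self._last_seq = defaultdict(int)        # channel_id -> last issued seq
        self.store: Dict[str, List[dict]] = {}   # stand-in for the message table
        self.bus: List[dict] = []                # stand-in for Kafka, NATS, etc.

    def _require_member(self, user_id: str, channel_id: str) -> None:
        if (user_id, channel_id) not in self._memberships:
            raise PermissionError("user is not a member of this channel")

    def post_message(self, user_id: str, channel_id: str, body: str) -> dict:
        self._require_member(user_id, channel_id)            # 1. validate
        self._last_seq[channel_id] += 1                      # 2. assign sequence
        seq = self._last_seq[channel_id]
        message = {"message_id": "%s:%d" % (channel_id, seq), "channel_id": channel_id,
                   "seq": seq, "sender_id": user_id, "body": body}
        self.store.setdefault(channel_id, []).append(message)  # 3. persist
        self.bus.append({"type": "MessageCreated",             # 4. publish
                         "channel_id": channel_id, "seq": seq})
        return message  # acknowledged after persistence, not after delivery

    def messages_after(self, user_id: str, channel_id: str, after_seq: int) -> List[dict]:
        # Catch-up read used by reconnecting clients; authorized like a write.
        self._require_member(user_id, channel_id)
        return [m for m in self.store.get(channel_id, []) if m["seq"] > after_seq]
```

Publishing after the write means a crash between steps 3 and 4 loses only the real-time event; the message itself is still recoverable through the history read.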

A second angle

For “Design an AI chatbot with browser storage”, the same messaging concepts apply, but the constraints shift toward client-side state, streaming, and privacy. Instead of multi-user channels and workspace permissions, the core entities are local conversations, messages, model responses, and session metadata stored in browser storage such as IndexedDB. Real-time delivery becomes token streaming from a backend relay using SSE or WebSocket, with the client appending partial assistant messages as chunks arrive. The main design decision is whether conversation history is purely local or synced to a server; browser-only storage improves privacy but complicates cross-device continuity, backup, and quota handling. You should also discuss not exposing provider API keys in the browser, using a stateless relay, and handling refresh/reconnect without duplicating assistant responses.
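
Avoiding duplicated assistant text after a reconnect can be sketched as below. The sketch assumes the relay numbers each chunk with a zero-based index, which is a design choice of this example rather than something a particular provider guarantees; `StreamingMessage` is a hypothetical name.

```python
class StreamingMessage:
    """Accumulates streamed chunks for one assistant message."""

    def __init__(self) -> None:
        self.text = ""
        self.next_index = 0
        self.done = False

    def on_chunk(self, index: int, delta: str, final: bool = False) -> bool:
        # Chunks replayed after a reconnect have already been applied: ignore.
        if self.done or index < self.next_index:
            return False
        # A skipped index means data was lost; the caller should ask the
        # relay to resume from next_index instead of appending out of order.
        if index > self.next_index:
            raise ValueError("missing chunk %d" % self.next_index)
        self.text += delta
        self.next_index += 1
        self.done = final
        return True
```

The partial text and `next_index` can be persisted to browser storage so a page refresh resumes the same message instead of starting a second one.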

Common pitfalls

Pitfall: Jumping straight to Kafka and WebSockets without defining guarantees.

A weak answer lists technologies before explaining semantics. A better answer says, “We provide durable persistence before acknowledgement, at-least-once real-time delivery, client de-duplication by message_id, and history replay after reconnect,” then chooses tools that support those properties.

Pitfall: Treating all channels as if they were the same size.

Designs that fan out every message to every member work for DMs and small teams but collapse for huge announcement channels. Segment the problem: small channels get eager push, large channels get subscription-based delivery for online users and pull-based history for everyone else.

Pitfall: Ignoring authorization on read paths.

Many candidates remember to check membership on POST /messages but forget search, history pagination, attachments, notifications, and WebSocket subscriptions. Strong answers make authorization a cross-cutting invariant: every event and query is scoped by workspace, channel membership, retention policy, and user visibility.

Connections

Interviewers may pivot from this into notification systems, search indexing, distributed ID generation, rate limiting, or multi-tenant authorization. They may also ask you to zoom into client behavior: offline sync, local caching, optimistic UI, retry logic, and streaming responses for AI chat interfaces.
