PracHub

Design a Group Chat / Messaging System

Last updated: Jun 24, 2026

Quick Overview

This question evaluates a candidate's ability to architect a real-time distributed messaging system, testing competency in data modeling, storage partitioning, and push delivery at scale. It covers core system design concepts including WebSocket-based fan-out, presence registries, and trade-offs between durability and low-latency delivery commonly assessed in senior software engineering interviews.


Company: Airbnb

Role: Software Engineer

Category: System Design

Difficulty: hard

Interview Round: Onsite

# Design a Group Chat / Messaging System

Design the backend for a real-time **group chat** feature embedded in a large consumer product (think of the in-app messaging between guests and hosts, including multi-party group conversations). Users can create conversations with one or more participants, send text messages, and see messages from others appear in near real time. Each participant should see an accurate unread-message count and be able to scroll back through the full history of any conversation.

```hint Where to start
Separate the two core sub-problems early: (1) **durable storage + retrieval** of the message log, and (2) **real-time fan-out** of a new message to the other online participants. They have very different read/write characteristics and usually use different infrastructure (a database vs. a push/gateway layer).
```

```hint Data modeling
Model a chat as three entities: `conversation`, `conversation_member` (the participant set, which makes group chat work), and `message`. Partition/shard the `message` table by `conversation_id` so that "load the latest N messages in this conversation" is a single-partition range scan ordered by a monotonic message id or timestamp.
```

```hint Real-time delivery
A WebSocket gateway holds the persistent connections. The hard part is routing: when a message lands, you must find which gateway nodes currently hold connections for the other members. A presence/session registry (e.g. user → gateway-node mapping in a fast store) plus a pub/sub channel keyed by `conversation_id` is the usual decomposition.
```

### Constraints & Assumptions

State your own numbers, but a reasonable target for this exercise:

- ~50M monthly active users, ~5M concurrent online users at peak.
- ~1B messages/day (~12K writes/sec average, assume 5x peak ≈ 60K writes/sec).
- Group conversations are small-to-medium: up to ~100 members per conversation (not broadcast/Discord-scale of 100K+).
- Read-heavy: opening a conversation reads the latest ~30 messages; history scroll is paginated.
- Messages are text plus optional small attachments (image/file by reference); delivery should feel real-time (sub-second p95) for online users.
- Messages must be durable (never silently lost) and ordered consistently within a conversation.

### Clarifying Questions to Ask

- What is the maximum group size — bounded small groups (≤100) or broadcast-scale channels? This decides whether naive fan-out is acceptable or whether we need a fan-out-on-read model.
- Do we need delivery/read receipts and typing indicators, or just message send + unread counts?
- What ordering and consistency guarantees are required — is per-conversation total order enough, or do we need cross-conversation global ordering (almost never)?
- What is the message retention policy (forever vs. TTL), and are edits/deletions/recall in scope?
- Is end-to-end encryption required, or is transport + at-rest encryption sufficient?
- What platforms must we support (mobile with offline sync + push notifications, web), since offline delivery changes the design substantially?

### Part 1 — Data Model and Write Path

Define the data model and walk through what happens when a user sends a message: how it is persisted, how ordering is guaranteed within a conversation, and how the system acknowledges the sender.

```hint Ordering
Don't rely on wall-clock timestamps for ordering (clock skew, ties). Assign each message a per-conversation monotonically increasing sequence number, or use a server-generated time-ordered id (e.g. a Snowflake-style id) so the message log is totally ordered within a conversation.
```

```hint Storage engine
This is a high-write, append-heavy, range-scan-by-conversation workload — a wide-column store (Cassandra/HBase/DynamoDB/Bigtable) with partition key `conversation_id` and clustering key `message_seq` fits well. A relational DB works at smaller scale but you must shard by `conversation_id`.
```

### Part 2 — Real-Time Fan-Out and Presence

Design how a newly persisted message reaches the other **online** participants in near real time. Cover the connection layer, how you locate the recipients' connections, and how this scales to millions of concurrent connections.

```hint Connection layer
Clients hold a persistent WebSocket to a fleet of stateless gateway nodes behind an L4 load balancer. Gateways are dumb pipes; the routing intelligence lives in a presence registry plus a message bus.
```

```hint Routing the message
Maintain a presence store mapping `user_id → {gateway_node, connection_id}` (e.g. in Redis). On a new message, look up each member's current node and deliver via a pub/sub topic per gateway node (or per conversation). This avoids every gateway subscribing to every conversation.
```

#### Clarifying Questions for this Part

- For a 100-member group where 80 are offline, do offline members get a push notification + sync-on-reconnect, or is real-time delivery only a best-effort for online users?

### Part 3 — Unread Counts, History Pagination, and Read Path

Design how each user gets an accurate **unread count** per conversation and how clients page through message **history** efficiently. Discuss the read-path caching and how unread counts stay correct as messages are read.

```hint Unread counts
Store a per-(user, conversation) "last read sequence number." Unread = (latest conversation seq − last-read seq). This is cheap to compute and avoids per-message read-state rows. A user's conversation-list badge is the sum of unread across their conversations, which you can cache and update incrementally.
```

### Follow-up Questions

- How do you guarantee a message is never lost if the server crashes after acking the client but before fan-out completes? (Discuss the write-ahead/durable-log ordering and sync-on-reconnect.)
- How would the design change for **broadcast-scale** channels (100K+ members) where fan-out-on-write becomes infeasible?
- How do you implement message edit, delete, and "unsend/recall" while keeping every participant's view consistent?
- Where would you place a database index, and what is the cost of indexing on the hot write path versus serving reads from the conversation partition key alone?

Related Interview Questions

  • Design a Scalable Job Scheduler - Airbnb
  • Design a Rental Marketplace Backend - Airbnb (hard)
  • Design a booking system - Airbnb (medium)
  • Design a group chat system - Airbnb (medium)
  • Design a real-time chat system with hot groups - Airbnb (hard)
Airbnb
Mar 4, 2026, 12:00 AM

Design a Group Chat / Messaging System

Design the backend for a real-time group chat feature embedded in a large consumer product (think of the in-app messaging between guests and hosts, including multi-party group conversations). Users can create conversations with one or more participants, send text messages, and see messages from others appear in near real time. Each participant should see an accurate unread-message count and be able to scroll back through the full history of any conversation.

Constraints & Assumptions

State your own numbers, but a reasonable target for this exercise:

  • ~50M monthly active users, ~5M concurrent online users at peak.
  • ~1B messages/day (~12K writes/sec average, assume 5x peak ≈ 60K writes/sec).
  • Group conversations are small-to-medium: up to ~100 members per conversation (not broadcast/Discord-scale of 100K+).
  • Read-heavy: opening a conversation reads the latest ~30 messages; history scroll is paginated.
  • Messages are text plus optional small attachments (image/file by reference); delivery should feel real-time (sub-second p95) for online users.
  • Messages must be durable (never silently lost) and ordered consistently within a conversation.
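
As a sanity check on the figures above, the short sketch below reproduces the write-rate arithmetic. The 1 KiB average stored message size and the worst-case fan-out bound (every message delivered to 99 online recipients) are assumptions made for illustration; they are not numbers from the prompt.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

MESSAGES_PER_DAY = 1_000_000_000
PEAK_FACTOR = 5
MAX_GROUP_SIZE = 100
ASSUMED_AVG_MESSAGE_BYTES = 1024  # assumption, not given in the prompt

# Average and peak write rates implied by the stated daily volume.
avg_writes_per_sec = MESSAGES_PER_DAY / SECONDS_PER_DAY
peak_writes_per_sec = avg_writes_per_sec * PEAK_FACTOR

# Upper bound on push deliveries: every message goes to a full group
# whose other members are all online. Real traffic should be far lower.
worst_case_deliveries_per_sec = peak_writes_per_sec * (MAX_GROUP_SIZE - 1)

# Raw message bytes written per day, before replication or indexes.
daily_storage_gib = MESSAGES_PER_DAY * ASSUMED_AVG_MESSAGE_BYTES / 2**30

if __name__ == "__main__":
    print(f"average writes/sec:       {avg_writes_per_sec:,.0f}")
    print(f"peak writes/sec:          {peak_writes_per_sec:,.0f}")
    print(f"worst-case deliveries/s:  {worst_case_deliveries_per_sec:,.0f}")
    print(f"raw storage per day, GiB: {daily_storage_gib:,.0f}")
```

The average works out to roughly 11.6K writes/sec, which the prompt rounds to ~12K, and the 5x peak to roughly 58K, rounded to ~60K.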

Clarifying Questions to Ask

  • What is the maximum group size — bounded small groups (≤100) or broadcast-scale channels? This decides whether naive fan-out is acceptable or whether we need a fan-out-on-read model.
  • Do we need delivery/read receipts and typing indicators, or just message send + unread counts?
  • What ordering and consistency guarantees are required — is per-conversation total order enough, or do we need cross-conversation global ordering (almost never)?
  • What is the message retention policy (forever vs. TTL), and are edits/deletions/recall in scope?
  • Is end-to-end encryption required, or is transport + at-rest encryption sufficient?
  • What platforms must we support (mobile with offline sync + push notifications, web), since offline delivery changes the design substantially?

Part 1 — Data Model and Write Path

Define the data model and walk through what happens when a user sends a message: how it is persisted, how ordering is guaranteed within a conversation, and how the system acknowledges the sender.
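
One possible shape for an answer is sketched below: a minimal in-memory model in which the server assigns a gap-free, per-conversation sequence number, appends the message to that conversation's log, and only then returns the acknowledgement. `InMemoryChatStore` and its method names are hypothetical; a real system would keep this state in a database partitioned by conversation id.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Message:
    conversation_id: str
    seq: int        # per-conversation, gap-free, assigned by the server
    sender_id: str
    body: str


class InMemoryChatStore:
    """Hypothetical stand-in for a store partitioned by conversation_id."""

    def __init__(self) -> None:
        self._members: Dict[str, Set[str]] = defaultdict(set)
        self._log: Dict[str, List[Message]] = defaultdict(list)

    def add_member(self, conversation_id: str, user_id: str) -> None:
        self._members[conversation_id].add(user_id)

    def send(self, conversation_id: str, sender_id: str, body: str) -> Message:
        # Only members of the conversation may post to it.
        if sender_id not in self._members[conversation_id]:
            raise PermissionError(f"{sender_id} is not in {conversation_id}")
        log = self._log[conversation_id]
        # The next sequence number is the current log length plus one.
        message = Message(conversation_id, len(log) + 1, sender_id, body)
        log.append(message)  # "persist" first ...
        return message       # ... then ack the sender with the assigned seq

    def latest(self, conversation_id: str, limit: int = 30) -> List[Message]:
        # Newest `limit` messages, oldest first: a single-partition range read.
        return self._log[conversation_id][-limit:]
```

In this sketch ordering is trivially safe because everything runs in one process. In a distributed store the sequence assignment would need to be atomic per conversation, for example through a single writer per partition or a conditional write.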

Part 2 — Real-Time Fan-Out and Presence

Design how a newly persisted message reaches the other online participants in near real time. Cover the connection layer, how you locate the recipients' connections, and how this scales to millions of concurrent connections.

Clarifying Questions for this Part

  • For a 100-member group where 80 are offline, do offline members get a push notification + sync-on-reconnect, or is real-time delivery only a best-effort for online users?
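
To make the routing step concrete, here is a minimal sketch of planning fan-out for one message. It assumes a presence map from user id to gateway node id (a plain dict here, standing in for a fast key-value store) and treats any user missing from the map as offline. `plan_fanout` is a hypothetical helper, and it ignores the sender's other devices for simplicity.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def plan_fanout(
    members: Iterable[str],
    sender_id: str,
    presence: Dict[str, str],
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Group online recipients by gateway node and list offline recipients.

    `presence` maps user_id -> gateway node id; users absent from it are
    treated as offline and would be handled by push notification plus
    sync-on-reconnect.
    """
    per_gateway: Dict[str, List[str]] = defaultdict(list)
    offline: List[str] = []
    for user_id in sorted(members):  # sorted only to make output deterministic
        if user_id == sender_id:
            continue  # the sender already has the message via the ack
        gateway = presence.get(user_id)
        if gateway is None:
            offline.append(user_id)
        else:
            per_gateway[gateway].append(user_id)
    return dict(per_gateway), offline
```

Grouping by gateway means one publish per gateway node rather than one per recipient, which matters when many members of a group are connected to the same node.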

Part 3 — Unread Counts, History Pagination, and Read Path

Design how each user gets an accurate unread count per conversation and how clients page through message history efficiently. Discuss the read-path caching and how unread counts stay correct as messages are read.
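
A minimal sketch of both pieces follows, assuming messages carry gap-free per-conversation sequence numbers (the subtraction does not work with sparse, Snowflake-style ids). `unread_count`, `mark_read`, and `page_before` are hypothetical helper names, and the list of sequence numbers stands in for a range scan on the message table.

```python
from bisect import bisect_left
from typing import List, Optional, Tuple


def unread_count(latest_seq: int, last_read_seq: int) -> int:
    # Clamp at zero in case the read pointer is ahead of a stale cached latest_seq.
    return max(0, latest_seq - last_read_seq)


def mark_read(last_read_seq: int, seen_seq: int) -> int:
    # The read pointer only moves forward, so late or reordered
    # read events from other devices cannot raise the unread count.
    return max(last_read_seq, seen_seq)


def page_before(
    seqs: List[int],
    before_seq: Optional[int],
    limit: int = 30,
) -> Tuple[List[int], Optional[int]]:
    """Keyset pagination over an ascending list of sequence numbers.

    Returns up to `limit` entries with seq < before_seq (or the newest
    entries when before_seq is None), plus the cursor for the next older
    page, or None when history is exhausted.
    """
    end = len(seqs) if before_seq is None else bisect_left(seqs, before_seq)
    start = max(0, end - limit)
    page = seqs[start:end]
    next_cursor = page[0] if start > 0 else None
    return page, next_cursor
```

Paging by "seq less than cursor" rather than by offset keeps each page a bounded range scan and stays stable while new messages arrive at the head of the conversation.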

Follow-up Questions

  • How do you guarantee a message is never lost if the server crashes after acking the client but before fan-out completes? (Discuss the write-ahead/durable-log ordering and sync-on-reconnect.)
  • How would the design change for broadcast-scale channels (100K+ members) where fan-out-on-write becomes infeasible?
  • How do you implement message edit, delete, and "unsend/recall" while keeping every participant's view consistent?
  • Where would you place a database index, and what is the cost of indexing on the hot write path versus serving reads from the conversation partition key alone?
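
For the first follow-up, one common line of reasoning is that the durable log is the source of truth and the push is only a hint. The sketch below illustrates that idea under the assumption of gap-free per-conversation sequence numbers: the client applies a pushed message only if it is the next expected one, and otherwise fills the gap from the log. `ClientConversationState` and the `fetch_after` callback are hypothetical names.

```python
from typing import Callable, Iterable, List, Tuple

LogEntry = Tuple[int, str]  # (seq, body)


class ClientConversationState:
    """Client-side view of one conversation, keyed by the last applied seq."""

    def __init__(self) -> None:
        self.last_seq = 0
        self.messages: List[LogEntry] = []

    def on_push(self, seq: int, body: str) -> bool:
        """Apply a pushed message only if it is the next one in order.

        Returns False for duplicates (seq <= last_seq) and for gaps
        (seq > last_seq + 1); on a gap the caller should run sync().
        """
        if seq != self.last_seq + 1:
            return False
        self.messages.append((seq, body))
        self.last_seq = seq
        return True

    def sync(self, fetch_after: Callable[[int], Iterable[LogEntry]]) -> None:
        """Catch up from the durable log, e.g. after a reconnect or a gap."""
        for seq, body in fetch_after(self.last_seq):
            self.on_push(seq, body)
```

With this shape, a crash between acknowledging the sender and completing fan-out costs latency rather than data: recipients that missed the push pick the message up on their next sync.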
