Design a Real-Time Chat System

Last updated: Jun 24, 2026

Quick Overview

This system design question tests a candidate's ability to architect a scalable real-time messaging platform, focusing on transport protocol selection and the trade-offs between live delivery and durable offline catch-up. It evaluates practical knowledge of distributed systems concepts including WebSocket vs. polling strategies, stateful connection management, and message persistence at scale.

Company: Uber

Role: Software Engineer

Category: System Design

Difficulty: medium

Interview Round: Onsite

Design the backend for a real-time one-to-one and group messaging application (think a WhatsApp- or Slack-style messenger). Users can send text messages to other users or to a group, see messages appear in near real time, see delivery and read state, and read their full message history when they come back online, including any messages that arrived while they were disconnected.

The interview deliberately drills into the **communication layer**: what network protocol and transport you use between client and server to push messages in real time, and why you chose it over the alternatives. Be prepared to justify the transport choice, how you maintain the live connection, and how a sender's message reaches an offline recipient who later reconnects.

```hint Where to start
Separate the two hard sub-problems: (1) the *real-time delivery* path, meaning how a message gets pushed to an online recipient with low latency, and (2) the *durability / catch-up* path, meaning how messages are stored so an offline recipient can fetch what they missed. They are answered by different components.
```

```hint Transport choice
The interviewer wants the trade-off between a long-lived bidirectional connection (e.g. **WebSocket**) and the HTTP-based alternatives: short polling, long polling, and Server-Sent Events (SSE, which is server-to-client push only). Anchor your answer in: server-initiated push, latency, connection overhead, and how each behaves through proxies/load balancers and on mobile radios.
```

```hint Offline delivery
An online recipient is reached over their live connection; an offline one is not. You need a per-user durable inbox or a message log keyed by conversation plus a "last delivered/last read" cursor, so a reconnecting client can pull the gap. Think about which datastore gives you cheap append + range-read by conversation and time.
```

### Constraints & Assumptions

State and defend your own numbers; reasonable working assumptions for this exercise:

- ~50M daily active users; a few hundred thousand to ~1M concurrent live connections at peak.
- Each user sends on the order of tens of messages per day; system peak on the order of 100k messages/second.
- Messages are small (text, a few hundred bytes); media is uploaded out-of-band to blob storage and only a reference travels through chat.
- Target end-to-end delivery latency for online users in the low hundreds of milliseconds (p99).
- Messages must be durable and ordered within a conversation; no message may be silently lost.
- Groups are small-to-medium (up to a few hundred members), not broadcast-scale channels.
- Single-region reasoning is acceptable as a baseline; note what changes for multi-region.

### Clarifying Questions to Ask

- What is the read/write balance and the target latency: is this optimized for live delivery, history fetch, or both equally?
- Do we need 1:1 only, or also group chat, and how large can a group get? (Fan-out cost scales with group size.)
- What delivery semantics are required: at-least-once with client-side dedup, or exactly-once? Do we need ordering guarantees per conversation?
- Which presence/receipt features are in scope: online/last-seen, "delivered", "read", and typing indicators?
- Is end-to-end encryption a requirement, or is transport-layer (TLS) encryption with server-side storage acceptable?
- What is the device model: one device per user, or multiple devices that must all stay in sync?

### Follow-up Questions

- Walk through exactly what happens, component by component, when user A sends a message to user B while B is offline and then B reconnects 10 minutes later.
- The interviewer pushed on the transport: defend WebSocket over HTTP long polling and over Server-Sent Events for this workload. When would you actually prefer SSE or long polling?
- How do you guarantee per-conversation ordering and deduplicate retried sends so a flaky mobile client never shows a message twice or out of order?
- How does the design change for a group with 200 members: do you fan out on write or on read, and what is the cost of each?
- How do you scale the stateful connection tier and route a message to the specific server that currently holds the recipient's socket?
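One way to ground the transport follow-up is to estimate what short polling would cost at the stated ~1M concurrent connections. The sketch below is only an illustration: the 2-second poll interval and the function names are assumptions, not figures from the question.

```python
# Rough cost of short polling at the stated connection count.
# CONCURRENT_CLIENTS comes from the constraints; POLL_INTERVAL_S is a
# hypothetical value chosen for illustration.

CONCURRENT_CLIENTS = 1_000_000
POLL_INTERVAL_S = 2.0  # assumption


def polling_requests_per_second(clients: int, interval_s: float) -> float:
    """Every client issues one request per interval, message or not."""
    return clients / interval_s


def mean_polling_delay_s(interval_s: float) -> float:
    """A message lands at a uniformly random point in the interval,
    so on average it waits half an interval for the next poll."""
    return interval_s / 2


poll_rps = polling_requests_per_second(CONCURRENT_CLIENTS, POLL_INTERVAL_S)
poll_delay = mean_polling_delay_s(POLL_INTERVAL_S)

print(f"short polling: {poll_rps:,.0f} requests/s, ~{poll_delay:.1f} s mean added delay")
```

Under these assumptions polling alone generates several times the stated peak message rate in mostly empty requests, and the added delay already exceeds the low-hundreds-of-milliseconds target. A persistent connection avoids both, at the price of a stateful connection tier.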
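The constraints can be turned into a quick sizing check. In the sketch below, `MSGS_PER_USER_PER_DAY` and `BYTES_PER_MSG` are assumed values picked from the stated ranges ("tens of messages", "a few hundred bytes"); the storage figure is raw payload only, before replication, indexes, or metadata.

```python
# Back-of-envelope sizing from the stated constraints.
# Values marked "assumption" are illustrative picks, not given numbers.

DAU = 50_000_000
MSGS_PER_USER_PER_DAY = 40   # assumption
BYTES_PER_MSG = 300          # assumption
PEAK_MSGS_PER_S = 100_000
SECONDS_PER_DAY = 86_400

daily_msgs = DAU * MSGS_PER_USER_PER_DAY
avg_msgs_per_s = daily_msgs / SECONDS_PER_DAY
peak_to_avg = PEAK_MSGS_PER_S / avg_msgs_per_s
daily_storage_gb = daily_msgs * BYTES_PER_MSG / 1e9
peak_ingest_mb_per_s = PEAK_MSGS_PER_S * BYTES_PER_MSG / 1e6

print(f"messages/day:        {daily_msgs:,}")
print(f"average messages/s:  {avg_msgs_per_s:,.0f}")
print(f"peak-to-average:     {peak_to_avg:.1f}x")
print(f"payload stored/day:  {daily_storage_gb:,.0f} GB")
print(f"peak ingest:         {peak_ingest_mb_per_s:,.0f} MB/s")
```

With these picks the average rate is roughly a quarter of the stated peak, which is a plausible daily skew; if your own assumptions put the average above the peak, revisit one of the two numbers before defending them.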
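The offline-delivery hint and the first follow-up (B reconnects after 10 minutes) can be sketched as a per-conversation log plus a per-user cursor. This is a minimal in-memory stand-in, assuming dense 1-based sequence numbers; `ConversationLog`, `fetch_after`, and `ack` are hypothetical names, and a real system would back this with a durable store that supports append and range reads.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredMessage:
    seq: int      # position within the conversation, starting at 1
    sender: str
    body: str


class ConversationLog:
    """In-memory stand-in for a durable log keyed by conversation."""

    def __init__(self) -> None:
        self._messages: dict[str, list[StoredMessage]] = defaultdict(list)
        self._cursors: dict[tuple[str, str], int] = {}

    def append(self, conversation_id: str, sender: str, body: str) -> StoredMessage:
        log = self._messages[conversation_id]
        msg = StoredMessage(seq=len(log) + 1, sender=sender, body=body)
        log.append(msg)
        return msg

    def fetch_after(self, conversation_id: str, cursor: int) -> list[StoredMessage]:
        # seq is 1-based and dense, so everything after `cursor`
        # starts at list index `cursor`.
        return self._messages[conversation_id][cursor:]

    def ack(self, user_id: str, conversation_id: str, seq: int) -> None:
        # Cursors only move forward; a late or duplicate ack is ignored.
        key = (user_id, conversation_id)
        self._cursors[key] = max(self._cursors.get(key, 0), seq)

    def cursor(self, user_id: str, conversation_id: str) -> int:
        return self._cursors.get((user_id, conversation_id), 0)


# Alice sends while Bob is offline; Bob later reconnects and pulls the gap.
log = ConversationLog()
log.append("alice:bob", "alice", "hi")
log.append("alice:bob", "alice", "you there?")

missed = log.fetch_after("alice:bob", log.cursor("bob", "alice:bob"))
log.ack("bob", "alice:bob", missed[-1].seq)
print([m.body for m in missed])
```

The design choice worth stating in the interview is that the write to the log happens regardless of whether the recipient is online, so the live push is an optimization on top of the durable path rather than a separate source of truth.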
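For the ordering and deduplication follow-up, one common approach (assumed here, not prescribed by the question) is a client-generated message ID used as an idempotency key, plus a server-assigned sequence number per conversation. `IdempotentSequencer` and `apply_in_order` are hypothetical names for this sketch.

```python
from __future__ import annotations

from itertools import count


class IdempotentSequencer:
    """Assigns per-conversation sequence numbers. A retry carrying the
    same client_msg_id gets back the number it was assigned the first time."""

    def __init__(self) -> None:
        self._counters: dict[str, count] = {}
        self._seen: dict[tuple[str, str], int] = {}

    def accept(self, conversation_id: str, client_msg_id: str) -> tuple[int, bool]:
        """Return (seq, is_new)."""
        key = (conversation_id, client_msg_id)
        if key in self._seen:
            return self._seen[key], False
        counter = self._counters.setdefault(conversation_id, count(1))
        seq = next(counter)
        self._seen[key] = seq
        return seq, True


def apply_in_order(last_seq: int, incoming: list[int]) -> tuple[int, list[int]]:
    """Client side: show each seq once and in order, stopping at the first gap.
    Anything past a gap is not shown here; a real client would buffer it
    or refetch the missing range."""
    pending = set(incoming)
    shown = []
    while last_seq + 1 in pending:
        last_seq += 1
        shown.append(last_seq)
    return last_seq, shown


sequencer = IdempotentSequencer()
first = sequencer.accept("conv-1", "msg-a")
retry = sequencer.accept("conv-1", "msg-a")   # flaky client resends
print(first, retry)
```

The server side makes retried sends harmless, and the client side makes duplicated or reordered deliveries harmless; together they give at-least-once delivery that looks exactly-once to the user.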
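The 200-member group follow-up comes down to where the multiplication happens. The sketch below compares stored rows per message and stores read per sync for the two strategies; `FanOutCost` is a hypothetical type, and the figure of 50 conversations per user is an assumption for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FanOutCost:
    stored_rows_per_message: int
    logs_read_per_sync: int


def fan_out_on_write(members: int) -> FanOutCost:
    # One inbox row per recipient (everyone but the sender);
    # a reconnecting user reads a single per-user inbox.
    return FanOutCost(stored_rows_per_message=members - 1, logs_read_per_sync=1)


def fan_out_on_read(conversations_per_user: int) -> FanOutCost:
    # One row in the shared conversation log; a reconnecting user
    # must check every conversation they belong to.
    return FanOutCost(stored_rows_per_message=1, logs_read_per_sync=conversations_per_user)


GROUP_SIZE = 200
CONVERSATIONS_PER_USER = 50  # assumption

on_write = fan_out_on_write(GROUP_SIZE)
on_read = fan_out_on_read(CONVERSATIONS_PER_USER)
print("fan-out on write:", on_write)
print("fan-out on read: ", on_read)
```

Either way, online members still receive a live push each, so the choice mainly affects storage write amplification versus catch-up read cost.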
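For the last follow-up, routing to the server that holds the recipient's socket is often handled with a shared session registry. The sketch below assumes one device per user and uses an in-memory dict where a real deployment would use a shared store; `SessionRegistry`, `route`, and the gateway IDs are hypothetical.

```python
from __future__ import annotations


class SessionRegistry:
    """Maps user_id to the gateway server currently holding that user's socket."""

    def __init__(self) -> None:
        self._gateway_of: dict[str, str] = {}

    def connect(self, user_id: str, gateway_id: str) -> None:
        self._gateway_of[user_id] = gateway_id

    def disconnect(self, user_id: str, gateway_id: str) -> None:
        # Clear the entry only if it still points at the gateway reporting
        # the close, so a stale close cannot erase a newer connection.
        if self._gateway_of.get(user_id) == gateway_id:
            del self._gateway_of[user_id]

    def lookup(self, user_id: str) -> str | None:
        return self._gateway_of.get(user_id)


def route(registry: SessionRegistry, recipient_id: str) -> str:
    """Decide the live-delivery step. The message is assumed to be
    persisted before this is called, whichever branch is taken."""
    gateway = registry.lookup(recipient_id)
    if gateway is None:
        return "store only; deliver on reconnect"
    return f"push via {gateway}"


registry = SessionRegistry()
registry.connect("bob", "gw-7")
print(route(registry, "bob"))

registry.connect("bob", "gw-2")      # Bob reconnects elsewhere
registry.disconnect("bob", "gw-7")   # late close from the old gateway
print(route(registry, "bob"))
```

The conditional delete in `disconnect` is the detail interviewers tend to probe: mobile clients reconnect quickly, and the old gateway's close event can arrive after the new registration.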

Jun 19, 2026, 12:00 AM