Design a Synchronized Watch Party System (LLD)
Company: Salesforce
Role: Software Engineer
Category: System Design
Interview Round: Onsite
Design a **Watch Party** system that lets a host create a virtual room and have multiple users join — via a unique Room ID — to watch the same video content together with their players kept in sync.
The system must support real-time playback controls (Play, Pause, Seek, and Change Playback Speed) issued from the room, and it must guarantee that every participant sees a consistent, synchronized playback state. This is a **low-level / object-oriented design** question: the interviewer cares about your class model, the public API/methods, the synchronization protocol, and how you handle real-world timing and failure edge cases. You are not expected to design the video CDN or the encoding pipeline.
### Constraints & Assumptions
- A single room typically has **2–50 participants**; the system overall may host **tens of thousands of concurrent rooms**.
- Each participant runs a client player that can `play`, `pause`, `seek(positionMs)`, and `setPlaybackRate(rate)` locally; the client applies the commands it receives from the server and reports its own playback state back.
- Target end-to-end sync drift between any two participants is on the order of **a few hundred milliseconds**.
- Communication uses a persistent bidirectional channel (e.g., WebSocket); assume the media itself is fetched by each client independently from a CDN.
- Clocks on clients are unreliable and may drift; assume access to a server timestamp.
- Treat this primarily as in-memory session state per room; durable storage of room metadata is secondary.
### Clarifying Questions to Ask
- Who is allowed to issue playback controls — only the host, or any participant (and can the host delegate control)?
- Is the content the same fixed media for everyone, or can the host change the video mid-session?
- What is the expected tolerance for sync drift, and is occasional re-sync (a small jump) acceptable to correct drift?
- Do we need to support chat, reactions, or presence, or strictly playback synchronization?
- What happens to the room when the host disconnects — does it pause, end, or transfer host?
- Do we need persistence/replay (resume a room later) or is the session purely ephemeral?
### Part 1 — Domain model and public API
Define the core classes/objects and their public methods. At minimum, model the room, the participants, the host role, and the playback state, and specify the operations a client and the server expose to drive and observe playback.
```hint Where to start
Separate **session/identity** entities (Room, Participant, Host) from the **state** object (PlaybackState). Make `PlaybackState` a small immutable-ish value: `status` (PLAYING/PAUSED), an anchor `positionMs`, a `playbackRate`, and the **server timestamp** at which that position was true.
```
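One way to read the session/identity half of this hint is sketched below. It is a minimal illustration, not a reference design: `Role`, `Participant`, `Room`, and `RoomRegistry` are hypothetical names, the Room ID is just a random URL-safe token, and playback state is deliberately left out so the identity entities stay separate from it.

```python
import secrets
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class Role(Enum):
    HOST = "HOST"
    VIEWER = "VIEWER"


@dataclass
class Participant:
    user_id: str
    role: Role = Role.VIEWER


@dataclass
class Room:
    room_id: str
    host_id: str
    participants: Dict[str, Participant] = field(default_factory=dict)

    def join(self, user_id: str) -> Participant:
        # Joining twice is idempotent: the existing Participant is returned.
        role = Role.HOST if user_id == self.host_id else Role.VIEWER
        return self.participants.setdefault(user_id, Participant(user_id, role))

    def leave(self, user_id: str) -> None:
        self.participants.pop(user_id, None)


class RoomRegistry:
    """In-memory lookup of rooms by Room ID (assumption: one process, no persistence)."""

    def __init__(self) -> None:
        self._rooms: Dict[str, Room] = {}

    def create_room(self, host_id: str) -> Room:
        room = Room(room_id=secrets.token_urlsafe(6), host_id=host_id)
        room.join(host_id)  # the host is the first participant
        self._rooms[room.room_id] = room
        return room

    def join_room(self, room_id: str, user_id: str) -> Participant:
        room = self._rooms.get(room_id)
        if room is None:
            raise KeyError(f"no such room: {room_id}")
        return room.join(user_id)


registry = RoomRegistry()
room = registry.create_room("alice")
guest = registry.join_room(room.room_id, "bob")
```

Keeping the host as a role on `Participant` rather than a separate subclass is one design choice among several; it makes a later host transfer a field update instead of an object swap.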
```hint Encode time, not just position
A raw `positionMs` goes stale the moment playback starts. Store `(anchorPositionMs, anchorServerTimeMs, rate, status)` so any client can compute the *current* expected position as `anchorPositionMs + (now - anchorServerTimeMs) * rate` while PLAYING, where `now` is the current server time. While PAUSED, the position is simply `anchorPositionMs`.
```
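The anchor formula from the hint can be sketched as a small frozen value object. This is an illustrative sketch only: field names follow Python's snake_case rather than the camelCase used above, and the `version` field is an assumed extra for ordering updates.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PLAYING = "PLAYING"
    PAUSED = "PAUSED"


@dataclass(frozen=True)
class PlaybackState:
    status: Status
    anchor_position_ms: int     # media position that was true at the anchor time
    anchor_server_time_ms: int  # server clock reading when the anchor was taken
    rate: float = 1.0
    version: int = 0            # hypothetical counter, bumped on every change

    def position_at(self, server_now_ms: int) -> int:
        """Expected media position at the given server time."""
        if self.status is Status.PAUSED:
            return self.anchor_position_ms  # a paused player does not advance
        elapsed_ms = server_now_ms - self.anchor_server_time_ms
        return self.anchor_position_ms + round(elapsed_ms * self.rate)


# Playing at 1.5x from 10:00, anchored at server time 1_000_000 ms.
state = PlaybackState(Status.PLAYING, 600_000, 1_000_000, rate=1.5)
four_seconds_later = state.position_at(1_004_000)  # 606_000 ms
```

Because the object is frozen, every control command produces a new state with a fresh anchor instead of mutating the old one, which keeps broadcast snapshots safe to share.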
### Part 2 — Real-time synchronization protocol
Describe how a control command (say, the host hits Pause, or Seeks to 10:00) propagates so that all participants converge to the same playback state, accounting for variable network latency and client clock drift.
```hint Server as source of truth
Route every command through the server, which stamps it with an authoritative server time and rebroadcasts the new `PlaybackState` to all participants. Clients reconcile their local player to the server state rather than trusting each other peer-to-peer.
```
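A stripped-down version of that flow might look like the following. It is a sketch under assumptions: only the host may issue commands (one possible answer to the clarifying question), `send` callbacks stand in for WebSocket connections, and `clock_ms` is injectable so the server clock can be faked in tests. All class and method names are hypothetical.

```python
import time
from dataclasses import dataclass, replace
from typing import Callable, Dict


@dataclass(frozen=True)
class PlaybackState:
    status: str = "PAUSED"  # "PLAYING" or "PAUSED"
    anchor_position_ms: int = 0
    anchor_server_time_ms: int = 0
    rate: float = 1.0
    version: int = 0

    def position_at(self, now_ms: int) -> int:
        if self.status != "PLAYING":
            return self.anchor_position_ms
        elapsed_ms = now_ms - self.anchor_server_time_ms
        return self.anchor_position_ms + round(elapsed_ms * self.rate)


class Room:
    def __init__(self, room_id: str, host_id: str,
                 clock_ms: Callable[[], int] = lambda: int(time.time() * 1000)):
        self.room_id = room_id
        self.host_id = host_id
        self._clock_ms = clock_ms
        self.state = PlaybackState(anchor_server_time_ms=clock_ms())
        self._senders: Dict[str, Callable[[PlaybackState], None]] = {}

    def join(self, participant_id: str, send: Callable[[PlaybackState], None]) -> None:
        self._senders[participant_id] = send
        send(self.state)  # new joiner gets the current snapshot immediately

    def handle_command(self, sender_id: str, action: str, value=None) -> PlaybackState:
        if sender_id != self.host_id:
            raise PermissionError("only the host may control playback")
        now = self._clock_ms()  # authoritative server timestamp
        # Re-anchor at the position that is true right now, then apply the change.
        changes = {
            "anchor_position_ms": self.state.position_at(now),
            "anchor_server_time_ms": now,
        }
        if action == "play":
            changes["status"] = "PLAYING"
        elif action == "pause":
            changes["status"] = "PAUSED"
        elif action == "seek":
            changes["anchor_position_ms"] = int(value)
        elif action == "set_rate":
            changes["rate"] = float(value)
        else:
            raise ValueError(f"unknown action: {action}")
        self.state = replace(self.state, version=self.state.version + 1, **changes)
        for send in self._senders.values():  # rebroadcast to everyone, sender included
            send(self.state)
        return self.state
```

Re-anchoring on every command matters most for `set_rate`: without it, the new rate would be applied retroactively to time that already elapsed at the old rate.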
```hint Latency + clock handling
Have each client estimate its offset to server time (a lightweight ping/NTP-style exchange). On receiving a PLAYING state, the client computes the target position for *its* current time and either seeks (if drift is large) or nudges `playbackRate` slightly to ease back into sync (if drift is small).
```
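The two client-side steps in this hint, estimating the clock offset and choosing between a seek and a rate nudge, could be sketched as below. The thresholds (`seek_threshold_ms`, `deadband_ms`, `nudge`) are illustrative assumptions rather than recommended values, and the offset estimate assumes roughly symmetric network latency.

```python
def estimate_offset_ms(client_send_ms: float, server_time_ms: float,
                       client_recv_ms: float) -> float:
    """Single-sample, NTP-style estimate of (server clock - client clock).

    Assumes the server stamped its reply halfway through the round trip.
    """
    rtt_ms = client_recv_ms - client_send_ms
    return server_time_ms - (client_send_ms + rtt_ms / 2)


def plan_correction(target_ms: int, actual_ms: int, rate: float,
                    seek_threshold_ms: int = 1000,
                    deadband_ms: int = 50,
                    nudge: float = 0.05):
    """Decide how the local player should converge on the server's position."""
    drift_ms = target_ms - actual_ms  # positive: local player is behind
    if abs(drift_ms) >= seek_threshold_ms:
        return ("seek", target_ms)    # too far off, jump
    if abs(drift_ms) <= deadband_ms:
        return ("set_rate", rate)     # close enough, play at the room rate
    factor = 1 + nudge if drift_ms > 0 else 1 - nudge
    return ("set_rate", rate * factor)  # ease back in without a visible jump


# Ping sent at client time 1000, reply received at 1100, server stamped 5050.
offset = estimate_offset_ms(1000, 5050, 1100)  # 4000.0
```

In practice a client would likely take several samples and keep the one with the smallest round trip, since that sample has the least room for asymmetric delay.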
#### Clarifying Questions for this Part
- Is strict consistency required (everyone pauses at the exact same frame) or is eventual convergence within a small window acceptable?
- Can the server reject or serialize conflicting commands issued nearly simultaneously?
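If the answer to the second question is yes, one possible approach is optimistic versioning: each command carries the state version the sender last saw, and the server processes commands one at a time and rejects any that were issued against an outdated version. The sketch below assumes that policy; `CommandSequencer` and `base_version` are hypothetical names. Simply accepting commands in server arrival order (last writer wins) is a simpler alternative.

```python
import threading
from typing import List, Tuple


class CommandSequencer:
    """Serializes commands for one room and rejects ones based on stale state."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.version = 0
        self.log: List[Tuple[int, str, str]] = []  # (version, sender_id, action)

    def submit(self, sender_id: str, action: str, base_version: int) -> Tuple[bool, int]:
        """Return (accepted, current_version)."""
        with self._lock:  # one command at a time per room
            if base_version != self.version:
                # The sender had not yet seen the latest state; ask it to retry.
                return False, self.version
            self.version += 1
            self.log.append((self.version, sender_id, action))
            return True, self.version


seq = CommandSequencer()
first = seq.submit("alice", "pause", base_version=0)   # (True, 1)
second = seq.submit("bob", "seek", base_version=0)     # (False, 1): bob raced alice
retry = seq.submit("bob", "seek", base_version=1)      # (True, 2)
```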
### Part 3 — Edge cases and failure handling
Walk through the important edge cases: a participant joining late (mid-playback), the host disconnecting, conflicting near-simultaneous commands, a slow/buffering client, and a participant whose connection drops and reconnects.
```hint Late joiners and reconnects
On join (or reconnect), the server sends the current `PlaybackState` snapshot plus its version; the client computes the live position from the anchor timestamp and seeks there before resuming — no special replay needed.
```
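On the client, applying a join or reconnect snapshot might look like this sketch. The snapshot is modeled as a plain dict with assumed keys (`version`, `status`, `anchor_position_ms`, `anchor_server_time_ms`, `rate`), `ClientPlayer` is a hypothetical stand-in for the real video player, and `server_now_ms` is the client's own estimate of server time (local clock plus estimated offset).

```python
class ClientPlayer:
    """Minimal stand-in for a local video player."""

    def __init__(self) -> None:
        self.position_ms = 0
        self.rate = 1.0
        self.playing = False
        self.last_version = -1

    def apply_snapshot(self, snap: dict, server_now_ms: int) -> bool:
        """Seek to the live position described by the snapshot.

        Returns False if the snapshot is older than one already applied.
        An equal version is re-applied, so a reconnect can re-sync safely.
        """
        if snap["version"] < self.last_version:
            return False  # out-of-order delivery, ignore
        position = snap["anchor_position_ms"]
        if snap["status"] == "PLAYING":
            elapsed_ms = server_now_ms - snap["anchor_server_time_ms"]
            position += round(elapsed_ms * snap["rate"])
        self.position_ms = position  # seek first ...
        self.rate = snap["rate"]
        self.playing = snap["status"] == "PLAYING"  # ... then resume or stay paused
        self.last_version = snap["version"]
        return True


# A viewer joins 30 s after the host pressed play at 10:00 with rate 2.0.
player = ClientPlayer()
player.apply_snapshot(
    {"version": 7, "status": "PLAYING", "anchor_position_ms": 600_000,
     "anchor_server_time_ms": 1_000_000, "rate": 2.0},
    server_now_ms=1_030_000,
)
# player.position_ms is now 660_000 (11:00 into the video)
```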
### Follow-up Questions
- How would you scale to hundreds of thousands of concurrent rooms across multiple servers? How do you route a room's participants to the same authoritative node, and what happens on failover?
- If you allowed any participant (not just the host) to issue commands, how would you prevent "control fights" and ensure a deterministic ordering?
- How would you add synchronized chat and emoji reactions without letting them interfere with playback-state ordering?
- How would you measure and alert on sync quality in production (what metric tells you participants are drifting apart)?
Quick Answer: This question evaluates a candidate's ability to design a low-level, object-oriented system for real-time state synchronization across multiple clients. It tests object/class modeling, API design, and reasoning about network latency, clock drift, and failure handling, commonly probed in system design interviews to gauge practical engineering judgment beyond theoretical knowledge.