You are asked to design the backend for a real-time collaborative document editor similar to Google Docs. The interviewer is specifically interested in how you handle parallel (concurrent) editing of the same document by multiple users.
Design and explain a system that supports the following:
- Many users can open the same document and edit it at the same time.
- Each user should see other users' changes in near real-time (e.g., within a few hundred milliseconds).
- The document should remain logically consistent: no characters are randomly lost or duplicated, and all users eventually converge to the same content.
- The system should be resilient to network latency and temporary disconnections.
In your answer, cover:
- High-level architecture
  - How clients connect to the service and receive real-time updates.
  - The main backend services involved and storage choices for documents and edit history.
- Data model for edits
  - How you represent document content (e.g., as a string, sequence of operations, or a more complex data structure).
  - How individual editing operations (insert, delete, formatting changes) are modeled.
- Handling concurrent edits (core part)
  - How you serialize or merge edits coming from multiple users.
  - Discuss at least one of the following approaches in some depth:
    - Centralized global sequence of operations assigned by the server.
    - Operational Transformation (OT).
    - Conflict-Free Replicated Data Types (CRDTs).
  - How your chosen approach ensures that all users eventually see the same document, even when operations arrive out of order due to network delays.
- Consistency and performance trade-offs
  - What kind of consistency guarantees you provide (e.g., eventual vs. strong consistency at the document level).
  - How you keep latency low while avoiding conflicts.
- Scalability and reliability (briefly)
  - How you scale to millions of documents and thousands of concurrent users per document.
  - How you handle server failures and recovery without losing edits.
Assume a typical web or mobile client that connects over WebSockets or a similar technology. Focus on the design of the collaboration/concurrency mechanism rather than on the UI.
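To make the first two approaches named above more concrete, the following is a minimal Python sketch of a server that assigns a global order to operations and applies a very small operational-transformation step to edits that were made against an older revision. Everything in it is an assumption made for illustration: `Op`, `transform`, and `DocServer` are hypothetical names, deletes remove exactly one character, formatting changes are left out, and inserts at the same position are ordered by client id. A real design would also need client-side transformation, acknowledgements, and persistence.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Op:
    kind: str        # "insert" or "delete" (a delete removes one character)
    pos: int         # index in the document at the client's base revision
    client: str      # hypothetical client id, used only to break ties
    text: str = ""   # inserted text; unused for deletes


def transform(op: Op, applied: Op) -> Optional[Op]:
    """Rewrite `op` so it can run after `applied`; None means it became a no-op."""
    if applied.kind == "insert":
        if op.kind == "insert":
            # Same position: the lower client id keeps its text in front.
            stays = op.pos < applied.pos or (
                op.pos == applied.pos and op.client < applied.client
            )
        else:
            stays = op.pos < applied.pos
        return op if stays else replace(op, pos=op.pos + len(applied.text))
    # `applied` is a delete.
    if op.kind == "delete" and op.pos == applied.pos:
        return None  # both clients deleted the same character
    if op.pos > applied.pos:
        return replace(op, pos=op.pos - 1)
    return op


class DocServer:
    def __init__(self, text: str = "") -> None:
        self.text = text
        self.log: List[Op] = []  # log[i] took the document from revision i to i + 1

    def submit(self, op: Op, base_rev: int) -> int:
        """Transform `op` past everything committed since `base_rev`, apply it,
        and return the new revision number."""
        current: Optional[Op] = op
        for committed in self.log[base_rev:]:
            if current is None:
                break
            current = transform(current, committed)
        if current is not None:
            p = current.pos
            if current.kind == "insert":
                self.text = self.text[:p] + current.text + self.text[p:]
            else:
                self.text = self.text[:p] + self.text[p + 1:]
            self.log.append(current)
        return len(self.log)


def run(ops: List[Op]) -> str:
    """Apply ops that were all created against revision 0 of "abc"."""
    server = DocServer("abc")
    for op in ops:
        server.submit(op, base_rev=0)
    return server.text


a = Op("insert", 1, "A", "X")  # client A inserts "X" at index 1
b = Op("insert", 1, "B", "Y")  # client B inserts "Y" at index 1
c = Op("delete", 2, "C")       # client C deletes the character at index 2

print(run([a, b]), run([b, a]))  # aXYbc aXYbc
print(run([a, c]), run([c, a]))  # aXb aXb
```

In this sketch the server's log is the single source of order, and the transform step is what lets two edits made against the same revision produce the same text regardless of which one reaches the server first.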