Adobe Creative Cloud Real-Time Collaboration And Offline Sync
Asked of: Software Engineer

What's being tested
Interviewers probe your ability to design a low-latency, scalable collaboration system that also supports robust offline sync and recovery. They expect concrete distributed-systems tradeoffs: replication/consistency choices, conflict-resolution algorithms, client sync protocols, storage/compaction strategies, and how the design meets latency, bandwidth, and operational constraints at Adobe scale.
Core knowledge
- CRDT (Conflict-free Replicated Data Types) vs Operational Transformation (OT): CRDTs guarantee eventual convergence without coordination, which suits offline-first clients; OT usually relies on a central transform server to preserve intention and carries less metadata, but it complicates offline merging.
- Op-based vs state-based (delta) replication: op-based replication (send operations) minimizes bandwidth for small ops but requires causal delivery; state-based and delta-state CRDTs send compacted deltas and help with late joiners. Per-document memory and metadata grow as O(#ops) unless you snapshot and compact.
- Causality & metadata: use vector clocks or compact version vectors to detect concurrent operations; Lamport timestamps can order events but don't detect concurrency. Metadata size is typically O(#replicas) unless you use dotted version vectors or summarized clocks.
- Client architecture: local-first store + change log: store edits locally in IndexedDB/SQLite, expose optimistic updates, and maintain a durable operation log (changepack) with checkpointing (last-acked sequence number) for efficient resync.
- Sync protocol: clients send a changepack and a checkpoint (e.g., last-seen-op-id); the server replies with the missing ops or a snapshot diff. Bandwidth ≈ ops/sec * avg_op_size; for estimation use B ≈ r × s, where r is the op rate (ops/sec) and s is the serialized op size.
- Real-time transport & scale: use WebSocket/gRPC for low latency; scale to hundreds of millions of connections via connection gateways, sticky routing, and per-document channels. For fan-out, prefer ephemeral pub/sub (e.g., Redis Pub/Sub or Streams, custom gossip) over heavy use of Kafka when you need sub-100ms delivery.
- Persistence strategy: event sourcing (append-only op store) + periodic snapshots for fast recovery. Compaction (merging ops into a snapshot and garbage-collecting tombstones) reduces storage and read latency but requires consistent snapshotting.
- Large assets and partial replication: large binary files (images, PSD layers) should be stored in object storage (e.g., S3) with metadata/annotations as CRDTs; sync metadata only and fetch blobs on demand to avoid full-file transfer during edits.
- Conflict resolution patterns: merge policies (last-writer-wins, CRDT merge, application-specific merge functions), tombstone handling, and undo/redo maintenance. Choose policies based on user expectations (intention preservation vs convergence).
- Operational concerns: idempotency (idempotency-key), retry/backoff with jitter, rate limiting, p99 latency SLOs, and observability (ops/sec, op-lag, divergence rate). For long offline windows, expect large changepacks and plan incremental checkpoints.
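To make the causality point concrete, here is a minimal sketch of comparing two vector clocks to tell ordered edits from concurrent ones. The names compare and merge and the dict-based clock layout are illustrative assumptions, not taken from any particular library.

```python
from typing import Dict

# Assumed representation: replica id -> number of events seen from that replica.
VectorClock = Dict[str, int]


def compare(a: VectorClock, b: VectorClock) -> str:
    """Return 'equal', 'before', 'after', or 'concurrent' for a relative to b."""
    replicas = set(a) | set(b)
    a_le_b = all(a.get(r, 0) <= b.get(r, 0) for r in replicas)
    b_le_a = all(b.get(r, 0) <= a.get(r, 0) for r in replicas)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # a happened-before b
    if b_le_a:
        return "after"       # b happened-before a
    return "concurrent"      # neither dominates: a merge policy is needed


def merge(a: VectorClock, b: VectorClock) -> VectorClock:
    """Element-wise maximum: the clock after a replica has seen both histories."""
    return {r: max(a.get(r, 0), b.get(r, 0)) for r in set(a) | set(b)}
```

A single Lamport counter could not return "concurrent" here, which is why the per-replica entries (and their O(#replicas) cost) are needed.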
Worked example — "Design a real-time collaboration system for Creative Cloud with offline sync"
First 30s: clarify scale (concurrent editors per document, total docs), offline window (minutes, hours, days), strong vs eventual consistency needs, and what parts must merge automatically (text/annotations) vs require manual conflict resolution (binary image edits). Skeleton answer pillars: (1) Client local-first model with durable change log in IndexedDB and optimistic UI; (2) Sync protocol using op-based CRDTs with checkpoints and causal metadata; (3) Real-time layer with WebSocket pub/sub for ops and presence; (4) Server storage combining event store + periodic snapshots + object store for blobs. Key tradeoff: pick CRDT for offline convergence and straightforward client merging vs OT for potentially smaller metadata but more server complexity; explicitly state metadata growth and plan compaction/snapshots to bound storage. Close by listing tests (automated property tests for convergence), metrics (op-applied-lag, divergence incidents), and follow-ups: "If I had more time I'd prototype op sizes and run network simulation for 24-hour offline resyncs."
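Pillar (2) can be sketched as a toy server-side sync endpoint. This is a simplified illustration under stated assumptions: Op, DocumentLog, and sync are hypothetical names, the checkpoint is a plain server sequence number, and op_id doubles as the idempotency key so that retried uploads are harmless. CRDT merge logic and causal metadata are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Op:
    op_id: str    # assumed globally unique, e.g. "<client>:<counter>"
    payload: str  # opaque serialized edit


@dataclass
class DocumentLog:
    ops: List[Op] = field(default_factory=list)          # server seq = index + 1
    seen: Dict[str, int] = field(default_factory=dict)   # op_id -> server seq

    def sync(self, changepack: List[Op], checkpoint: int) -> Tuple[List[Op], int]:
        """Append unseen ops, then return the ops the client is missing
        (everything after its checkpoint that it did not send itself)
        together with the new checkpoint."""
        own_ids = {op.op_id for op in changepack}
        for op in changepack:
            if op.op_id not in self.seen:   # duplicate uploads are ignored
                self.ops.append(op)
                self.seen[op.op_id] = len(self.ops)
        missing = [op for op in self.ops[checkpoint:] if op.op_id not in own_ids]
        return missing, len(self.ops)
```

For example, a client that reconnects after a long offline window sends its whole local changepack with its last checkpoint; if the connection drops mid-request it can resend the same changepack without creating duplicate ops on the server.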
A second angle — "Support collaborative annotations and offline sync for very large PSD files"
Same core concept but different constraints: large binaries make op-granularity on pixels impractical. Use chunked file storage with immutable blobs in S3, and surface a CRDT-managed metadata layer for annotations, layer ordering, and selection. For heavy edits (e.g., filter apply), use server-side transforms with a short-lived lease or optimistic locking to avoid expensive merges. Offline clients sync annotation ops cheaply and fetch or upload updated blobs asynchronously. Explicitly trade immediate local preview (generate low-res proxies client-side) against bandwidth and latency.
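The chunked, immutable blob idea can be illustrated with a small content-addressed store. This is a sketch under assumptions: the in-memory dict stands in for an object store such as S3, the function names are hypothetical, and the fixed 4-byte chunk size is only there to keep the demo readable.

```python
import hashlib
from typing import Dict, List, Tuple

CHUNK_SIZE = 4  # bytes; deliberately tiny for the demo (an assumption)


def upload(data: bytes, store: Dict[str, bytes],
           chunk_size: int = CHUNK_SIZE) -> Tuple[List[str], int]:
    """Split data into chunks keyed by SHA-256 digest and store only the
    chunks that are not already present. Returns (manifest, chunks_uploaded)."""
    manifest: List[str] = []
    uploaded = 0
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:     # immutable blobs: never overwritten
            store[digest] = chunk
            uploaded += 1
        manifest.append(digest)
    return manifest, uploaded


def download(manifest: List[str], store: Dict[str, bytes]) -> bytes:
    """Reassemble a file version from its manifest."""
    return b"".join(store[d] for d in manifest)
```

Each file version is then just a manifest (a list of digests), which is small enough to sync alongside the annotation metadata. Fixed-size chunks only deduplicate well when edits do not shift byte offsets; content-defined chunking is one alternative worth mentioning in an interview.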
Common pitfalls
Pitfall: Designing for strict strong consistency by default. Strong consistency across global clients requires synchronous coordination and kills offline experience; articulate when you’ll accept eventual convergence versus when you require locks or transactions.
Pitfall: Ignoring metadata growth. Naively storing every operation metadata leads to unbounded storage and slow sync; propose snapshotting, tombstone compaction, and periodic delta checkpoints.
Pitfall: Not defining clear conflict semantics. Saying "we merge conflicts" is insufficient — give concrete merge policies for text, layers, annotations, and large binaries, and explain user-visible outcomes (automatic merge vs conflict resolution UI).
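For the metadata-growth pitfall, the snapshot-plus-compaction mitigation can be sketched as follows. The DocStore class and its key/value op format are illustrative assumptions, and stable_seq is assumed to be the highest sequence number that every replica has acknowledged, which is what makes it safe to fold ops into the snapshot and forget tombstones.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# (seq, key, value); value None marks a delete (tombstone). Assumed format.
Op = Tuple[int, str, Optional[str]]


@dataclass
class DocStore:
    snapshot: Dict[str, str] = field(default_factory=dict)
    snapshot_seq: int = 0                       # last seq folded into snapshot
    log: List[Op] = field(default_factory=list)

    def append(self, key: str, value: Optional[str]) -> int:
        seq = self.snapshot_seq + len(self.log) + 1
        self.log.append((seq, key, value))
        return seq

    def materialize(self) -> Dict[str, str]:
        """Current state = snapshot + replay of the remaining log."""
        state = dict(self.snapshot)
        for _, key, value in self.log:
            if value is None:
                state.pop(key, None)
            else:
                state[key] = value
        return state

    def compact(self, stable_seq: int) -> None:
        """Fold every op with seq <= stable_seq into the snapshot and drop it."""
        keep: List[Op] = []
        for seq, key, value in self.log:
            if seq > stable_seq:
                keep.append((seq, key, value))
                continue
            if value is None:
                self.snapshot.pop(key, None)    # tombstone applied, then forgotten
            else:
                self.snapshot[key] = value
            self.snapshot_seq = seq
        self.log = keep
```

After compaction, a late joiner downloads the snapshot plus the short remaining log instead of replaying the document's whole history.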
Connections
Interviewers may pivot to related areas: designing presence and cursor scalability, consistency models and consensus (Raft/Paxos) where strong coordination is needed, or client reliability and offline UX tradeoffs. Be prepared to discuss testing strategies (property testing, chaos/network partition simulations).
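As one possible shape for a convergence property test, the sketch below merges randomly generated replica states in every delivery order, with duplicated deliveries, and checks that all orders reach the same result. It uses a last-writer-wins map as the data type and only the standard library; the names merge, random_state, and check_convergence are hypothetical, and a property-testing library such as Hypothesis could generate the inputs instead.

```python
import itertools
import random
from typing import Dict, Tuple

# Assumed entry layout: (lamport timestamp, replica id, value).
Entry = Tuple[int, str, str]
State = Dict[str, Entry]


def merge(a: State, b: State) -> State:
    """Last-writer-wins merge: keep the larger (timestamp, replica, value) per key."""
    out = dict(a)
    for key, entry in b.items():
        if key not in out or entry > out[key]:
            out[key] = entry
    return out


def random_state(rng: random.Random, replica: str) -> State:
    """A random set of writes made by one replica."""
    state: State = {}
    for key in rng.sample(["title", "layer", "note"], k=rng.randint(0, 3)):
        state[key] = (rng.randint(1, 5), replica, f"{replica}-{rng.randint(0, 99)}")
    return state


def check_convergence(trials: int = 200, seed: int = 0) -> bool:
    """True if every delivery order (with duplicates) yields the same state."""
    rng = random.Random(seed)
    for _ in range(trials):
        states = [random_state(rng, r) for r in ("a", "b", "c")]
        results = []
        for order in itertools.permutations(states):
            merged: State = {}
            for s in order:
                merged = merge(merged, s)
                merged = merge(merged, s)   # duplicate delivery must be harmless
            results.append(merged)
        if any(r != results[0] for r in results):
            return False
    return True
```

The same harness can be pointed at richer data types (text sequences, layer ordering) by swapping out merge and the state generator.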
Further reading
- Designing Data-Intensive Applications, Martin Kleppmann: solid background on replication, logs, and CRDTs in distributed systems.
- A comprehensive study of Convergent and Commutative Replicated Data Types, Shapiro et al. (2011): formal foundations for CRDT design and tradeoffs.
- Automerge / Yjs docs (project repositories): practical implementations and API-level considerations for client-side CRDTs.
Related concepts
- Adobe Creative Cloud Offline Sync And Conflict Resolution
- Adobe Document Cloud real-time collaboration and offline sync
- Adobe Creative Cloud Asset Sync And Conflict Resolution
- Adobe Real-Time Collaboration WebSockets
- Adobe Real-Time Collaboration And WebSockets