
Design a GPT chat UI with snapshots and sharing

Last updated: Mar 29, 2026

Quick Overview

This question evaluates system-design and engineering competencies for building a multi-tenant LLM chat web application, covering real-time token streaming, snapshot/versioned data modeling, full-text and metadata search, API and schema design, role-based sharing, security, and frontend state management.


Company: OpenAI

Role: Software Engineer

Category: System Design

Difficulty: hard

Interview Round: Technical Screen

Design an end-to-end web application for interacting with a GPT-like model. It must: (1) let users enter a prompt, submit, establish a session/connection to the model, and stream tokens in real time to the UI; (2) save a snapshot of the current state, including conversation messages, system prompt, model/version, and tuning parameters (e.g., temperature, top_p) with timestamps; (3) support full-text and metadata search over saved snapshots (by content, tags, creator, date); and (4) allow sharing snapshots with others via links and role-based access (public/unlisted/team; view/comment/duplicate). Describe the high-level architecture (frontend, backend, data stores, search/indexing), API design, data schemas, and how you would implement real-time transport (SSE vs WebSocket), streaming backpressure, idempotency, retries, and error handling. Cover authentication/authorization, multi-tenant isolation, rate limiting, auditing, cost tracking, and privacy/compliance considerations. Explain frontend state management for live streaming (pause/resume, partial updates, optimistic UI), snapshot versioning, and how to prevent data loss on network interruptions. Discuss scalability (stateless services, session affinity, caching), consistency choices, observability (logs, traces, metrics), deployment strategy, and capacity estimation. Outline testing strategies (unit, contract, E2E) and how you would enable offline drafts that later sync.


Sep 6, 2025, 12:00 AM

System Design: End-to-End Web App for Interacting with a GPT-like Model

Context

You are designing a multi-tenant, browser-based SaaS application that allows users to interact with a GPT-like LLM. The app must support real-time token streaming, snapshotting conversations, search over saved artifacts, and sharing with role-based access. Assume you can call out to an external model provider and you control the rest of the stack.

Requirements

  1. Real-time chat
    • Let users enter a prompt, submit, establish a session/connection to the model, and stream tokens to the UI in real time.
  2. Snapshots
    • Save a snapshot of the current state, including: conversation messages, system prompt, model/version, tuning parameters (temperature, top_p), with timestamps.
  3. Search
    • Support full-text and metadata search over saved snapshots by content, tags, creator, and date.
  4. Sharing and access control
    • Share snapshots via links and role-based access: public/unlisted/team; roles include view/comment/duplicate.
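To make requirements 2 and 4 concrete, here is a minimal TypeScript sketch of one possible snapshot record and an access check. Every name in it (Snapshot, Viewer, canAccess, linkToken, linkRole) is hypothetical and not part of the question. It also assumes two things the question does not state: that the roles form a ladder (view < comment < duplicate), and that an unlisted snapshot is reachable only by someone holding its link token.

```typescript
// Hypothetical types: the question names the fields, not this exact shape.
type Role = "view" | "comment" | "duplicate";
type Visibility = "public" | "unlisted" | "team";

interface Message {
  role: "system" | "user" | "assistant";
  content: string;
  createdAt: string; // ISO 8601 timestamp
}

interface Snapshot {
  id: string;
  tenantId: string; // team/tenant that owns the snapshot
  creatorId: string;
  version: number; // incremented on each re-save (assumed versioning scheme)
  systemPrompt: string;
  model: string; // model name and version
  params: { temperature: number; topP: number };
  messages: Message[];
  tags: string[];
  createdAt: string;
  visibility: Visibility;
  linkToken: string; // secret embedded in the share link
  linkRole: Role; // highest role granted to non-owners
}

interface Viewer {
  userId: string | null; // null for anonymous visitors
  tenantId: string | null;
  linkToken?: string; // token taken from the URL, if any
}

// Assumption: each role includes the ones below it.
const ROLE_RANK: Record<Role, number> = { view: 0, comment: 1, duplicate: 2 };

function canAccess(s: Snapshot, v: Viewer, needed: Role): boolean {
  // The creator always has full access.
  if (v.userId !== null && v.userId === s.creatorId) return true;

  let reachable = false;
  if (s.visibility === "public") {
    reachable = true;
  } else if (s.visibility === "unlisted") {
    reachable = v.linkToken === s.linkToken;
  } else {
    // "team": the viewer must belong to the owning tenant.
    reachable = v.tenantId !== null && v.tenantId === s.tenantId;
  }
  return reachable && ROLE_RANK[needed] <= ROLE_RANK[s.linkRole];
}
```

In a real design the check would run server-side on every read, and the tenant comparison would typically also be enforced at the data layer so that a bug in this function cannot leak another tenant's snapshots.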

Deliverables

Describe:

  • High-level architecture (frontend, backend, data stores, search/indexing)
  • API design (key endpoints and contracts)
  • Data schemas (core entities and relationships)
  • Real-time transport choice (SSE vs WebSocket), streaming backpressure, idempotency, retries, error handling
  • Authentication/authorization, multi-tenant isolation, rate limiting, auditing, cost tracking, privacy/compliance
  • Frontend state management for live streaming (pause/resume, partial updates, optimistic UI), snapshot versioning, preventing data loss on network interruptions
  • Scalability (stateless services, session affinity, caching), consistency choices, observability (logs, traces, metrics), deployment strategy, capacity estimation
  • Testing strategies (unit, contract, E2E) and how to enable offline drafts that later sync
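As one illustration of the transport and data-loss items above, the following TypeScript sketch assumes SSE is chosen and shows how tokens might be framed on the server, parsed incrementally on the client, and deduplicated after a reconnect. The TokenEvent shape and the names encodeSse, parseSse, and StreamState are hypothetical. The `id:`, `event:`, and `data:` fields and the blank-line frame terminator are standard SSE; a browser EventSource sends the last seen id back as `Last-Event-ID` when it reconnects, and a server could use that to replay missed tokens.

```typescript
// Hypothetical event shape; not any real provider's API.
interface TokenEvent {
  id: number; // monotonically increasing per response stream
  token: string;
}

// Server side: encode one token as an SSE frame.
// JSON.stringify escapes newlines inside the token, so data stays on one line.
function encodeSse(e: TokenEvent): string {
  return `id: ${e.id}\nevent: token\ndata: ${JSON.stringify({ token: e.token })}\n\n`;
}

// Client side: pull complete frames out of a buffer that may end mid-frame.
// Returns the parsed events plus the unconsumed remainder to keep buffering.
function parseSse(buffer: string): { events: TokenEvent[]; rest: string } {
  const frames = buffer.split("\n\n");
  const rest = frames.pop() ?? ""; // last piece is incomplete (or empty)
  const events: TokenEvent[] = [];
  for (const frame of frames) {
    let id: number | null = null;
    let data = "";
    for (const line of frame.split("\n")) {
      if (line.startsWith("id: ")) id = Number(line.slice(4));
      else if (line.startsWith("data: ")) data += line.slice(6);
    }
    if (id !== null && data !== "") {
      const parsed = JSON.parse(data) as { token: string };
      events.push({ id, token: parsed.token });
    }
  }
  return { events, rest };
}

// Frontend state for one streaming message: partial text plus a resume cursor.
class StreamState {
  text = "";
  lastEventId = -1;

  // Applies an event once; replays at or below the cursor are ignored,
  // so a server replay after reconnect cannot duplicate tokens in the UI.
  apply(e: TokenEvent): boolean {
    if (e.id <= this.lastEventId) return false;
    this.text += e.token;
    this.lastEventId = e.id;
    return true;
  }
}
```

The same cursor idea extends to write requests: a client-generated idempotency key on each submit would let the server recognize a retried request instead of starting a second generation. That part is not shown here.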
