System Design: ChatGPT‑Style Homepage with Streaming
Goal
Design a ChatGPT‑style web homepage end to end. Users should type a prompt and see the model’s response stream token‑by‑token in the browser.
Requirements
- Frontend
  - Render a chat UI (messages, input box, streaming cursor, retry/stop, multi‑tab resilience).
  - Stream tokens to the UI with low latency and graceful reconnection.
  - Persist conversations and support pagination/search.
- Backend
  - Provide a server endpoint that calls a Chat Completions API with streaming.
  - Authenticate users and protect provider credentials.
  - Rate limit users and enforce token/concurrency budgets.
  - Store conversation state (messages, metadata, token counts).
  - Stream tokens to the browser (SSE or WebSockets) with backpressure, retries, and timeouts.
  - Log, emit metrics, and trace requests end‑to‑end; scale under load with cost controls.
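One small but concrete piece of the backend streaming requirement is framing each model token as an SSE event before writing it to the response. A minimal sketch (helper names `sseFrame`/`sseDone` and the `[DONE]` sentinel are our assumptions, not a fixed standard; only the `id:`/`data:` wire format comes from the SSE spec):

```typescript
// Illustrative SSE framing helpers. Each token becomes one SSE event;
// the `id:` field lets a reconnecting client resume via the
// Last-Event-ID request header.
function sseFrame(token: string, id: number): string {
  // SSE data lines must not contain raw newlines, so multi-line
  // payloads are split across multiple `data:` lines.
  const dataLines = token
    .split("\n")
    .map((line) => `data: ${line}`)
    .join("\n");
  return `id: ${id}\n${dataLines}\n\n`;
}

// Terminal sentinel so the client can distinguish a clean end of stream
// from a dropped connection.
function sseDone(): string {
  return "data: [DONE]\n\n";
}
```

The server would write `sseFrame(token, seq++)` for each upstream delta on a response with `Content-Type: text/event-stream`, then `sseDone()` when the model finishes.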
Streaming Transport Comparison
Describe when to use SSE vs WebSockets for streaming tokens, including trade‑offs in:
- Latency
- Reliability and ordering
- Backpressure handling
- Reconnection semantics
- Browser/proxy support and operational complexity
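On reconnection semantics: `EventSource` reconnects automatically (and resends `Last-Event-ID`), while WebSockets leave reconnection entirely to the application, which typically means capped exponential backoff with jitter. A sketch of that schedule, assuming "full jitter" (the function name and defaults are ours):

```typescript
// Capped exponential backoff for WebSocket reconnects.
// `random` is injectable so the schedule is deterministic in tests;
// it defaults to Math.random in real use.
function reconnectDelayMs(
  attempt: number, // 0-based reconnect attempt
  baseMs = 500,
  capMs = 30_000,
  random: () => number = Math.random,
): number {
  // Exponential growth, capped so long outages don't produce huge waits.
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  // "Full jitter": pick uniformly in [0, ceiling) so many clients
  // dropped at once don't reconnect in a thundering herd.
  return Math.floor(random() * ceiling);
}
```

A WebSocket client would call this on each `close` event and also reset `attempt` to 0 after a healthy connection, then replay from its last acknowledged sequence number to restore ordering.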
Integration Details
Show how you would integrate a Chat Completions API, including:
- Authentication (user identity and server‑to‑provider secrets)
- Rate limiting (per user/IP, concurrency, token budgets)
- Conversation state storage (schema, summarization, limits)
- Streaming tokenization path (upstream to backend to browser)
- Error handling and retries (transient vs. permanent)
- Observability (logs, metrics, traces)
- Scalability and cost considerations
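The rate-limiting item above is commonly answered with a per-user token bucket: capacity bounds bursts, the refill rate bounds sustained throughput. A minimal in-memory sketch (class and method names are ours; a real deployment would likely keep buckets in Redis so limits hold across instances):

```typescript
// Per-user token bucket. Time is passed in explicitly (milliseconds)
// so the logic is deterministic and easy to test.
class TokenBucket {
  private tokens: number;
  private lastMs: number;

  constructor(
    private capacity: number,     // max burst size
    private refillPerSec: number, // sustained allowance
    nowMs: number,
  ) {
    this.tokens = capacity;
    this.lastMs = nowMs;
  }

  // Try to spend `cost` tokens (e.g. 1 per request, or the prompt's
  // token count for token budgets); false means respond with HTTP 429.
  tryAcquire(cost: number, nowMs: number): boolean {
    const elapsedSec = (nowMs - this.lastMs) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSec * this.refillPerSec,
    );
    this.lastMs = nowMs;
    if (this.tokens < cost) return false;
    this.tokens -= cost;
    return true;
  }
}
```

The same structure covers concurrency budgets if `cost` is acquired at stream start and returned when the stream closes.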
Provide concrete architectural choices, a sequence of events, and concise code snippets or pseudocode illustrating the streaming path for both SSE and WebSockets.
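As a starting point for that comparison, the core of the streaming path can be sketched as one upstream token stream adapted to either transport. The shapes below are assumptions (an `AsyncIterable<string>` of token deltas standing in for the provider SDK's stream, and our own `[DONE]` sentinel and message schema), not a specific provider's API:

```typescript
type Token = string;

// Adapt upstream token deltas to SSE wire frames, with ids for resumption.
async function* toSse(upstream: AsyncIterable<Token>): AsyncGenerator<string> {
  let id = 0;
  for await (const tok of upstream) {
    yield `id: ${id++}\ndata: ${JSON.stringify(tok)}\n\n`;
  }
  yield "data: [DONE]\n\n"; // clean end-of-stream marker
}

// Adapt the same upstream to WebSocket JSON messages with explicit
// sequence numbers, since WS framing alone carries no ordering metadata
// the application can resume from.
async function* toWs(upstream: AsyncIterable<Token>): AsyncGenerator<string> {
  let seq = 0;
  for await (const tok of upstream) {
    yield JSON.stringify({ type: "token", seq: seq++, token: tok });
  }
  yield JSON.stringify({ type: "done", seq });
}
```

The server then drives whichever adapter matches the connection, e.g. `for await (const frame of toSse(upstream)) res.write(frame)` for SSE (awaiting slow writes is where backpressure enters) or `ws.send(msg)` per message for WebSockets.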