
Build a Reliable Streaming Chat UI

Last updated: Jun 24, 2026

Quick Overview

This question evaluates understanding of real-time front-end architecture and state management in React, focusing on transient streaming state, concurrency control, UI consistency during tokenized updates, and the appropriate use of hooks such as useRef versus useState in the Software Engineering Fundamentals domain.


Company: OpenAI

Role: Software Engineer

Category: Software Engineering Fundamentals

Difficulty: hard

Interview Round: HR Screen

Hints

  • Where to start: Frame the problem around identity and ownership of a chunk. Before any React detail, decide what the transport is (SSE, WebSocket, or a streamed fetch ReadableStream) and what stable IDs (conversationId, messageId, streamId, optional seq) ride on every chunk so it can always be routed to the correct message.
  • React state model: Separate render state (the message list and visible partial text) from operational bookkeeping that should NOT cause a re-render on every change. Think about updating messages by a stable messageId with a functional updater, and whether a useReducer over stream events (STREAM_STARTED / CHUNK_RECEIVED / STREAM_COMPLETED / STREAM_FAILED) is cleaner than ad-hoc setState calls from inside async callbacks.
  • Concurrency & stale updates: The classic bug is an async callback that closes over a stale messages snapshot, or a slow old stream that writes after a newer one has started. Consider per-stream metadata keyed by streamId, ignoring chunks whose stream is no longer active, and a "latest request id" guard to drop responses you no longer care about.
  • Idempotency: For the double-send case, think about a client-generated idempotency key (request UUID) sent with the request, plus server-side dedup, and a client-side guard so a duplicate response can't create a second assistant message.

Apr 4, 2026, 12:00 AM

You are building a React-based chat interface where assistant responses stream into the UI token by token in real time (the same UX as a typical LLM chat product). The product allows a user to send a new message while a previous response is still streaming, and may show more than one conversation thread.

Before writing any code, walk the interviewer through how you would design and build this feature. Treat it as a verbal design discussion: explain your architecture, the React state model, and how you keep the UI correct and smooth under concurrency and failure. Your answer should address:

  • The architecture and data flow for live updates (transport, identifiers, how a chunk reaches the right message).
  • Two pieces of state that exist only while a stream is active and must be cleaned up when it finishes.
  • How to manage React state so streamed tokens do not flicker or overwrite existing content.
  • What can go wrong when multiple responses stream at once, and how to prevent race conditions between concurrent streams.
  • When you would reach for useRef versus useState in this scenario.
  • How you would keep the feature reliable: if the same network request is accidentally sent twice (double-submit, retry, reconnect), how do you prevent it from being processed/rendered twice?
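
One way to make the "chunk reaches the right message" point concrete is a reducer that routes every event by messageId. The sketch below is only an illustration under assumptions: the event names come from the hints, while the ChatMessage shape, the status values, and the chatReducer name are hypothetical. It has no React dependency, so the same function could be passed to useReducer.

```typescript
// Hypothetical shapes; only the four event names appear in the question's hints.
type Status = "streaming" | "complete" | "error";

interface ChatMessage {
  messageId: string;
  role: "user" | "assistant";
  text: string;
  status: Status;
}

type StreamEvent =
  | { type: "STREAM_STARTED"; messageId: string }
  | { type: "CHUNK_RECEIVED"; messageId: string; delta: string }
  | { type: "STREAM_COMPLETED"; messageId: string }
  | { type: "STREAM_FAILED"; messageId: string };

function chatReducer(messages: ChatMessage[], event: StreamEvent): ChatMessage[] {
  if (event.type === "STREAM_STARTED") {
    // Optimistic placeholder; a duplicate start for the same id is ignored.
    if (messages.some((m) => m.messageId === event.messageId)) return messages;
    return [
      ...messages,
      { messageId: event.messageId, role: "assistant", text: "", status: "streaming" },
    ];
  }
  const ev = event; // narrowed to the three non-start events
  return messages.map((m): ChatMessage => {
    // Route by id, never by array index, and ignore writes to finished messages.
    if (m.messageId !== ev.messageId || m.status !== "streaming") return m;
    if (ev.type === "CHUNK_RECEIVED") return { ...m, text: m.text + ev.delta };
    const status: Status = ev.type === "STREAM_COMPLETED" ? "complete" : "error";
    return { ...m, status };
  });
}
```

Because each update is derived from the previous state rather than from a captured messages snapshot, interleaved chunks from two streams append to their own messages, and a late chunk for a completed message is dropped.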

Constraints & Assumptions

  • Frontend is React (function components + hooks); assume a modern React (18+) with concurrent rendering and StrictMode double-invocation of effects in development.
  • Responses arrive as a sequence of small text deltas; a single response can be hundreds to thousands of tokens, so naive "re-render per token" must be considered for cost.
  • The user can start a new message before the previous stream finishes, and can cancel/stop a streaming response.
  • The network is unreliable: requests can time out, the connection can drop mid-stream, and a retry or React StrictMode remount can cause the same request to be sent twice.
  • Assume a backend exists that can stream a response; you are not designing the model, only the client (you may state minimal contract requirements you need from the server).
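
The cost of a render per token can be reduced by coalescing deltas and flushing them in batches. The following is a minimal sketch, assuming a hypothetical DeltaBuffer class whose scheduler is injected; in a browser the scheduler would typically wrap requestAnimationFrame, and the instance would live in a ref rather than in state.

```typescript
// Hypothetical helper; names and shapes are illustrative, not from the question.
type Flush = (batches: Map<string, string>) => void;
type Schedule = (run: () => void) => void;

class DeltaBuffer {
  private pending = new Map<string, string>();
  private scheduled = false;
  private readonly flush: Flush;
  private readonly schedule: Schedule;

  constructor(flush: Flush, schedule: Schedule) {
    this.flush = flush;
    this.schedule = schedule;
  }

  // Accumulate text per message and request at most one flush at a time.
  push(messageId: string, delta: string): void {
    this.pending.set(messageId, (this.pending.get(messageId) ?? "") + delta);
    if (this.scheduled) return;
    this.scheduled = true;
    this.schedule(() => this.drain());
  }

  // Drop buffered text for a cancelled or failed stream.
  discard(messageId: string): void {
    this.pending.delete(messageId);
  }

  private drain(): void {
    this.scheduled = false;
    if (this.pending.size === 0) return;
    const batches = this.pending;
    this.pending = new Map();
    this.flush(batches);
  }
}
```

In a component, the scheduler could be (run) => requestAnimationFrame(() => run()), and flush would dispatch one state update per message per frame instead of one per token.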

Clarifying Questions to Ask

  • What is the streaming transport: Server-Sent Events, WebSocket, or a streamed HTTP body via fetch? Is it fixed or my choice?
  • Can a user have multiple responses streaming concurrently (e.g. across threads), or is it strictly one active response per conversation?
  • On a new submit mid-stream, should the previous response be cancelled/superseded or allowed to finish in the background?
  • Does the server guarantee in-order delivery of chunks, or do I need a sequence number to reorder?
  • What does the server send to mark completion vs. error, and does it echo back the messageId/streamId and any sequence numbers I send?
  • Is there a server-side idempotency/dedup mechanism I can rely on, or must the client be the sole defense against double-processing?
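
If the answer to the ordering question is that chunks may arrive out of order or be replayed, a small per-stream reorder buffer is one option. This sketch assumes a hypothetical contract in which every chunk carries a seq that starts at 0 and increases by 1; the SeqReorderer name is illustrative.

```typescript
// Assumes seq starts at 0 and increments by 1 per chunk (a hypothetical server contract).
class SeqReorderer {
  private nextSeq = 0;
  private held = new Map<number, string>();

  // Returns the deltas that are now safe to append, in order.
  accept(seq: number, delta: string): string[] {
    // Already applied or already held: a duplicate or replayed chunk.
    if (seq < this.nextSeq || this.held.has(seq)) return [];
    this.held.set(seq, delta);
    const ready: string[] = [];
    let next = this.held.get(this.nextSeq);
    while (next !== undefined) {
      ready.push(next);
      this.held.delete(this.nextSeq);
      this.nextSeq += 1;
      next = this.held.get(this.nextSeq);
    }
    return ready;
  }
}
```

One instance would be kept per stream as ephemeral state and discarded when the stream completes, fails, or is cancelled.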

What a Strong Answer Covers

  • Transport & contract: names a concrete transport and the per-chunk contract (stable IDs, completion/error markers, optional seq); routes chunks by messageId, never by array index.
  • Lifecycle of a message: optimistic assistant message on send → append deltas as chunks arrive → mark complete/error on the terminal event; explicit handling of cancel, timeout, and mid-stream disconnect.
  • Ephemeral vs durable state: correctly identifies stream-only state (e.g. AbortController, token buffer, expected seq, isStreaming/activeStreamIds) and explains when it is torn down.
  • Flicker-free rendering: functional setState/useReducer updates, appending deltas (not replacing from a stale snapshot), and buffering/flushing on requestAnimationFrame or an interval to avoid a render per token.
  • Concurrency correctness: enumerates the real failure modes (out-of-order chunks, slow old stream overwriting a newer one, wrong-message writes, cancelling the wrong request) and concrete mitigations (per-streamId map, ignore-inactive-stream guard, latest-request-id check, supersede-on-new-send when single-active).
  • useRef vs useState reasoning: a clear rule — render-affecting data in state/reducer, mutable operational handles (controllers, registries, buffers, timers, socket/EventSource) in refs to avoid re-renders and stale closures.
  • Reliability / idempotency: client-generated idempotency key, server dedup, and a client guard so a duplicated request never produces a duplicate assistant message; awareness of StrictMode double-effect as a source of accidental double-sends.
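
The per-streamId map and the ignore-inactive-stream guard can be sketched as a small registry held in a ref. The version below assumes the supersede-on-new-send policy with one active stream per conversation; the StreamRegistry name and its methods are hypothetical, and only the standard AbortController API is used.

```typescript
// Hypothetical registry; assumes at most one active stream per conversation.
class StreamRegistry {
  private active = new Map<string, { streamId: string; controller: AbortController }>();

  // Start a stream; any earlier stream in the same conversation is aborted.
  start(conversationId: string, streamId: string): AbortSignal {
    this.active.get(conversationId)?.controller.abort();
    const controller = new AbortController();
    this.active.set(conversationId, { streamId, controller });
    return controller.signal;
  }

  // Guard to call before applying any chunk to state.
  isActive(conversationId: string, streamId: string): boolean {
    return this.active.get(conversationId)?.streamId === streamId;
  }

  // Tear down on completion, error, or cancel. A stale stream cannot
  // remove or abort the entry of a newer one.
  finish(conversationId: string, streamId: string): void {
    const entry = this.active.get(conversationId);
    if (!entry || entry.streamId !== streamId) return;
    entry.controller.abort();
    this.active.delete(conversationId);
  }
}
```

The returned signal would be passed to fetch (or used to close the socket or EventSource), and chunk handlers would check isActive before dispatching, so a slow old stream cannot write after a newer one has started and a cancel cannot hit the wrong request.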

Follow-up Questions

  • In development, React 18 StrictMode runs each effect's setup, then its cleanup, then its setup again on mount. How does that interact with opening a stream in useEffect, and how do you make stream setup/teardown idempotent so you don't open two connections?
  • The connection drops at token 400 of an 800-token response. How do you recover — resume from an offset, restart, or surface a partial message — and what does the server contract need to support your choice?
  • Rendering every token re-renders a long message list and tanks performance. Walk through how you'd diagnose this and the specific techniques (buffering, memoization, virtualization, isolating the streaming message) you'd apply.
  • You decide to move the whole streaming state machine out of React component state. Where would it live (a store / external state container / custom hook), and what do you gain and lose versus keeping it in useReducer?
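
For the idempotent-setup follow-up, one possible client-side guard is to key every send by an idempotency key and share the in-flight promise between duplicate callers. This is a sketch under assumptions: RequestDeduper is a hypothetical name, the key is assumed to be generated once per user submit (not once per call), and the instance is assumed to live outside the effect, for example in a ref or a module-level store.

```typescript
// Hypothetical guard; the idempotency key must be created once per user action.
class RequestDeduper<T> {
  private inFlight = new Map<string, Promise<T>>();

  // Runs `send` at most once per key while a request with that key is in flight.
  run(idempotencyKey: string, send: () => Promise<T>): Promise<T> {
    const existing = this.inFlight.get(idempotencyKey);
    if (existing) return existing;
    const p = send().finally(() => {
      // Release the key once the request settles.
      this.inFlight.delete(idempotencyKey);
    });
    this.inFlight.set(idempotencyKey, p);
    return p;
  }
}
```

A double-submit or a second effect run with the same key then reuses the first request. The key is released once the request settles, so a retry after a failure reaches the server again, which is why the same key would still be sent with the request for server-side dedup.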