PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep
|Home/Coding & Algorithms/Microsoft

Extract stream prefix before stop token

Last updated: Mar 29, 2026

Quick Overview

This question evaluates streaming string processing, incremental substring detection across chunk boundaries, and memory-efficient algorithm design for low-latency consumption of potentially large text streams.

  • hard
  • Microsoft
  • Coding & Algorithms
  • Software Engineer

Extract stream prefix before stop token

Company: Microsoft

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: hard

Interview Round: Technical Screen

You are consuming a text stream delivered as an iterator/generator of **string chunks** (each chunk may be any length, including empty). You are also given a **stop token** string `token`. Concatenate the chunks conceptually into one long string `S`. Your task is to return the **prefix of `S` that appears strictly before the first occurrence of `token`**. Key requirements: - The stop token may be **split across chunk boundaries** (e.g., `"ab"` in one chunk and `"c"` in the next forms token `"abc"`). - You must be able to **return as soon as the first full token is detected**. You are **not allowed** to buffer the entire stream and then search. - Aim for low extra memory usage (e.g., proportional to `len(token)` rather than total stream size). If the token never appears, return the entire concatenated stream content (up to end-of-stream). ### Function signature (example) - Input: `chunks: Iterator[str]`, `token: str` - Output: `str` ### Examples 1) `chunks = ["hello ", "wor", "ld<END>", "ignored"]`, `token = "<END>"` → return `"hello world"` 2) `chunks = ["abc", "d", "ef"]`, `token = "cde"` (token crosses boundary) → return `"ab"` 3) `chunks = ["aaaaa"]`, `token = "b"` → return `"aaaaa"` ### Constraints (reasonable interview assumptions) - `1 <= len(token) <= 10^5` - Total streamed characters can be large (potentially millions), so the solution should be close to linear time in the number of consumed characters. Implement the function to satisfy the requirements above.

Quick Answer: This question evaluates streaming string processing, incremental substring detection across chunk boundaries, and memory-efficient algorithm design for low-latency consumption of potentially large text streams.

Related Interview Questions

  • Return Top K Open Businesses - Microsoft (hard)
  • Implement Memory Allocation and In-Memory Records - Microsoft (medium)
  • Implement K-Means and Detect Divisible Subarrays - Microsoft (medium)
  • Sort Three Categories In Place - Microsoft (medium)
  • Retain Top K Elements - Microsoft (medium)
Microsoft logo
Microsoft
Feb 1, 2026, 12:00 AM
Software Engineer
Technical Screen
Coding & Algorithms
3
0
Loading...

You are consuming a text stream delivered as an iterator/generator of string chunks (each chunk may be any length, including empty). You are also given a stop token string token.

Concatenate the chunks conceptually into one long string S. Your task is to return the prefix of S that appears strictly before the first occurrence of token.

Key requirements:

  • The stop token may be split across chunk boundaries (e.g., "ab" in one chunk and "c" in the next forms token "abc" ).
  • You must be able to return as soon as the first full token is detected . You are not allowed to buffer the entire stream and then search.
  • Aim for low extra memory usage (e.g., proportional to len(token) rather than total stream size).

If the token never appears, return the entire concatenated stream content (up to end-of-stream).

Function signature (example)

  • Input: chunks: Iterator[str] , token: str
  • Output: str

Examples

  1. chunks = ["hello ", "wor", "ld<END>", "ignored"] , token = "<END>" → return "hello world"
  2. chunks = ["abc", "d", "ef"] , token = "cde" (token crosses boundary) → return "ab"
  3. chunks = ["aaaaa"] , token = "b" → return "aaaaa"

Constraints (reasonable interview assumptions)

  • 1 <= len(token) <= 10^5
  • Total streamed characters can be large (potentially millions), so the solution should be close to linear time in the number of consumed characters.

Implement the function to satisfy the requirements above.

Submit Your Answer to Earn 20XP

Sign in to leave a comment

Loading comments...

Browse More Questions

More Coding & Algorithms•More Microsoft•More Software Engineer•Microsoft Software Engineer•Microsoft Coding & Algorithms•Software Engineer Coding & Algorithms
PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.