This question evaluates streaming string processing, incremental substring detection across chunk boundaries, and memory-efficient algorithm design for low-latency consumption of potentially large text streams.
You are consuming a text stream delivered as an iterator/generator of string chunks (each chunk may be any length, including empty). You are also given a stop token string token.
Concatenate the chunks conceptually into one long string S. Your task is to return the prefix of S that appears strictly before the first occurrence of token.
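Key requirements:
The token may be split across chunk boundaries (e.g., "ab" in one chunk and "c" in the next forms token "abc").
Working memory should stay bounded (proportional to len(token) rather than total stream size).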
If the token never appears, return the entire concatenated stream content (up to end-of-stream).
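One way to meet these requirements is to carry over only the last len(token) - 1 characters between chunks, since a match that ends in a later chunk cannot start any earlier than that. The sketch below illustrates the idea; the function name read_until_token is hypothetical, and the returned prefix itself still grows with the stream, so the len(token) bound applies only to the carried-over tail.

```python
from typing import Iterable, List


def read_until_token(chunks: Iterable[str], token: str) -> str:
    """Return the text that precedes the first occurrence of token.

    Hypothetical helper: assumes len(token) >= 1, as the constraints state.
    """
    keep = len(token) - 1      # max characters that could start a future match
    confirmed: List[str] = []  # text known to lie before any match
    tail = ""                  # unconfirmed suffix, at most `keep` characters

    for chunk in chunks:
        window = tail + chunk
        idx = window.find(token)
        if idx != -1:
            # Match found: stop here without pulling further chunks.
            confirmed.append(window[:idx])
            return "".join(confirmed)
        # Everything except the last `keep` characters is safe to confirm.
        cut = max(len(window) - keep, 0)
        confirmed.append(window[:cut])
        tail = window[cut:]

    # End of stream without a match: the tail is ordinary content.
    confirmed.append(tail)
    return "".join(confirmed)
```

Because each step rebuilds tail + chunk, a stream of very small chunks combined with a long token can cost roughly len(token) work per chunk. That is acceptable for a sketch, but it is a reason to consider a character-level matcher when chunk sizes are unpredictable.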
Input: chunks: Iterator[str], token: str
Output: str
chunks = ["hello ", "wor", "ld<END>", "ignored"], token = "<END>" → return "hello world"
chunks = ["abc", "d", "ef"], token = "cde" (token crosses boundary) → return "ab"
chunks = ["aaaaa"], token = "b" → return "aaaaa"
1 <= len(token) <= 10^5
Implement the function to satisfy the requirements above.
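As one possible approach (not necessarily the intended one), the match state can be tracked character by character with a Knuth-Morris-Pratt failure table. The only state carried across chunks is then a single integer plus a table of size len(token), and total time stays linear however the stream is chunked. The names build_failure and prefix_before_token are assumptions made for this sketch.

```python
from typing import Iterable, List


def build_failure(token: str) -> List[int]:
    """KMP failure table: fail[i] is the length of the longest proper
    prefix of token[:i + 1] that is also a suffix of it."""
    fail = [0] * len(token)
    k = 0
    for i in range(1, len(token)):
        while k and token[i] != token[k]:
            k = fail[k - 1]
        if token[i] == token[k]:
            k += 1
        fail[i] = k
    return fail


def prefix_before_token(chunks: Iterable[str], token: str) -> str:
    """Hypothetical solution sketch; assumes len(token) >= 1."""
    fail = build_failure(token)
    m = len(token)
    matched = 0            # length of the token prefix matched so far
    consumed = 0           # characters seen in fully processed chunks
    parts: List[str] = []  # chunks read so far (needed for the result)

    for chunk in chunks:
        for i, ch in enumerate(chunk):
            # Fall back through the failure table on a mismatch.
            while matched and ch != token[matched]:
                matched = fail[matched - 1]
            if ch == token[matched]:
                matched += 1
            if matched == m:
                # The token ends at global index consumed + i.
                start = consumed + i - m + 1
                parts.append(chunk[: i + 1])
                return "".join(parts)[:start]
        parts.append(chunk)
        consumed += len(chunk)

    # Token never appeared: return the whole stream.
    return "".join(parts)
```

The iterator is not advanced after the chunk that completes the token, so later chunks such as "ignored" in the first example are never read. The accumulated parts list holds the eventual return value; a caller that wants truly bounded memory could instead emit confirmed text incrementally, which this sketch does not attempt.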