You are consuming a text stream delivered as an iterator/generator of string chunks (each chunk may be any length, including empty). You are also given a stop token string token.
Concatenate the chunks conceptually into one long string S. Your task is to return the prefix of S that appears strictly before the first occurrence of token.
Key requirements:
-
The stop token may be
split across chunk boundaries
(e.g.,
"ab"
in one chunk and
"c"
in the next forms token
"abc"
).
-
You must be able to
return as soon as the first full token is detected
. You are
not allowed
to buffer the entire stream and then search.
-
Aim for low extra memory usage (e.g., proportional to
len(token)
rather than total stream size).
If the token never appears, return the entire concatenated stream content (up to end-of-stream).
Function signature (example)
-
Input:
chunks: Iterator[str]
,
token: str
-
Output:
str
Examples
-
chunks = ["hello ", "wor", "ld<END>", "ignored"]
,
token = "<END>"
→ return
"hello world"
-
chunks = ["abc", "d", "ef"]
,
token = "cde"
(token crosses boundary) → return
"ab"
-
chunks = ["aaaaa"]
,
token = "b"
→ return
"aaaaa"
Constraints (reasonable interview assumptions)
-
1 <= len(token) <= 10^5
-
Total streamed characters can be large (potentially millions), so the solution should be close to linear time in the number of consumed characters.
Implement the function to satisfy the requirements above.