This question evaluates streaming string processing and pattern-matching skills, specifically handling token detection across chunk boundaries, boundary conditions, and time/space efficiency. It is commonly asked in the Coding & Algorithms domain to assess practical application-level competence in stream-processing algorithms rather than purely theoretical understanding.
You are implementing a simplified streaming conversational AI output filter.
Text arrives in chunks (strings) in order. There is a special stop token (e.g., "<END>"). Your job is to output only the text that occurs before the first stop token, and ignore everything from the stop token onward.
Given:
chunks: a list of strings representing the stream chunks in arrival order
stopToken: a non-empty string (e.g., "<END>")
Return a list of strings representing what should be emitted to the user:
Emit all text that occurs before the first occurrence of stopToken.
Do not emit anything after the first stopToken, or stopToken itself.
If stopToken occurs in the middle of a chunk, emit only the prefix before it.
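The rules above fix the total text that should reach the user, regardless of how it is split into chunks. A minimal non-streaming sketch of that target can serve as a reference when testing a streaming solution: the concatenation of everything a correct filter emits should equal its result. The helper name `expected_text` is hypothetical and not part of the problem statement, and this version deliberately ignores the streaming constraint by joining all chunks first.

```python
def expected_text(chunks, stop_token):
    """Reference oracle (not streaming): all text before the first stop token."""
    full = "".join(chunks)
    # str.partition returns (before, separator, after); when the separator
    # is absent, `before` is the entire string.
    before, _sep, _after = full.partition(stop_token)
    return before


print(expected_text(["hello", "how are you<E", "ND>HAPPY"], "<END>"))
```

Because it buffers the whole stream, this is only a correctness check, not an answer to the question itself.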
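Example 1 (stopToken = "<END>", stop token inside one chunk):
Input: chunks = ["hello", "how are you<END>", "HAPPY"]
Output: ["hello", "how are you"]
Example 2 (stopToken = "<END>", stop token split across two chunks):
Input: chunks = ["hello", "how are you<E", "ND>HAPPY"]
Output: ["hello", "how are you"]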
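The second example shows why a chunk cannot always be emitted in full: its tail may be the start of a stop token that completes in the next chunk. One possible approach, sketched below, holds back the longest suffix of the unemitted text that is also a proper prefix of the stop token. The names `filter_stream` and `held` are hypothetical. The sketch also makes choices the problem statement leaves open: empty pieces are not emitted, held-back text that turns out to be harmless is prepended to the next emitted piece, and any held-back text left when the stream ends without a stop token is emitted as a final piece.

```python
def filter_stream(chunks, stop_token):
    """Emit text before the first stop_token, chunk by chunk (a sketch)."""
    emitted = []
    held = ""  # unemitted tail that might be the start of stop_token
    for chunk in chunks:
        text = held + chunk
        idx = text.find(stop_token)
        if idx != -1:
            # Stop token found: emit only the prefix before it, then stop.
            if idx > 0:
                emitted.append(text[:idx])
            return emitted
        # Find the longest suffix of `text` that is a proper prefix of
        # stop_token; at most len(stop_token) - 1 characters are held back.
        keep = 0
        for k in range(min(len(stop_token) - 1, len(text)), 0, -1):
            if stop_token.startswith(text[-k:]):
                keep = k
                break
        cut = len(text) - keep
        safe, held = text[:cut], text[cut:]
        if safe:  # assumption: skip empty pieces
            emitted.append(safe)
    if held:
        # Stream ended without a stop token: flush the held-back tail.
        emitted.append(held)
    return emitted


print(filter_stream(["hello", "how are you<E", "ND>HAPPY"], "<END>"))
# ['hello', 'how are you']
```

Between chunks, only `held` is carried over, so the state kept across chunk boundaries never exceeds `len(stop_token) - 1` characters.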
len(stopToken)
).
(You may assume ASCII/UTF-8 strings; exact character set is not important as long as matching is consistent.)