You are processing a stream of characters (arriving one-by-one). You are also given a list of target words.
Each time a new character arrives, it is appended to the end of the current stream buffer. After appending, if the buffer contains any target word as a contiguous substring, you must:
-
Return/report
a matched word (assume if multiple words match, you may return any one of them, but you must define and follow a consistent rule such as “return the first match found” or “return the longest match”).
-
Delete
that occurrence of the matched word from the buffer (remove exactly those characters), then continue processing future characters.
Clarifications to make explicit in your solution
-
Can matches overlap? If multiple matches exist simultaneously, which one do you remove?
-
Do you remove only one word per incoming character, or repeatedly remove until no word remains?
Input/Output (one reasonable interface)
-
Input:
words: List[str]
, and a sequence of calls
query(c: char)
.
-
Output per call: the matched word (or empty string / null if no match). The internal buffer should reflect deletions.
Constraints (typical interview-scale)
-
Number of words: up to ~10^4
-
Total length of all words: up to ~10^5
-
Stream length can be large (online processing required).