How do I practice coding and algorithm questions?

Use PracHub's coding console to write, test, and debug your solutions in Python or JavaScript. View hints, test against sample inputs, and compare with official solutions.

What difficulty level is this coding question?

This is a medium difficulty Coding & Algorithms question, commonly asked during Onsite rounds at Google.

What role is this question designed for?

This question is commonly asked for Software Engineer candidates at Google during technical interviews.

Detect and remove matched words in a char stream

Company: Google

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Onsite

You are processing a stream of characters (arriving one-by-one). You are also given a list of target words. Each time a new character arrives, it is appended to the end of the current stream buffer. After appending, if the buffer contains **any** target word as a **contiguous substring**, you must: 1. **Return/report** a matched word (assume if multiple words match, you may return any one of them, but you must define and follow a consistent rule such as “return the first match found” or “return the longest match”). 2. **Delete** that occurrence of the matched word from the buffer (remove exactly those characters), then continue processing future characters. ### Clarifications to make explicit in your solution - Can matches overlap? If multiple matches exist simultaneously, which one do you remove? - Do you remove only one word per incoming character, or repeatedly remove until no word remains? ### Input/Output (one reasonable interface) - Input: `words: List[str]`, and a sequence of calls `query(c: char)`. - Output per call: the matched word (or empty string / null if no match). The internal buffer should reflect deletions. ### Constraints (typical interview-scale) - Number of words: up to ~10^4 - Total length of all words: up to ~10^5 - Stream length can be large (online processing required).

Quick Answer: This question evaluates understanding of online stream processing, substring matching, and the use of efficient data structures and algorithms for real-time updates and deletions.

Implement a batch version of an online character-stream processor. You are given a list of target words and a stream string. Process the stream left to right, as if each character were passed to query(c). The buffer starts empty. For each incoming character, append it to the buffer. If one or more target words now appear as contiguous substrings, remove exactly one matched word using this rule: remove the longest target word that is a suffix of the current buffer; if there is still a tie, choose the one whose earliest occurrence in words comes first. Report the removed word for that character, or an empty string if nothing is removed. Return all reports in order, along with the final buffer. Overlaps are allowed. Only one deletion can happen per incoming character, because after every step the buffer is clean, so any newly created match must end at the newest character.

Constraints

0 <= len(words) <= 10^4
0 <= sum(len(word) for word in words) <= 10^5
0 <= len(stream) <= 10^5
Each word is non-empty and consists only of lowercase English letters
stream consists only of lowercase English letters

Examples

Input: (['ab', 'bc'], 'zabcbc')

Expected Output: (['', '', 'ab', '', '', 'bc'], 'zc')

Explanation: After the 3rd character, the buffer is 'zab', so 'ab' is removed. Later, after the 6th character, the buffer is 'zcbc', so 'bc' is removed, leaving 'zc'.

Input: (['ba', 'cba'], 'xcba')

Expected Output: (['', '', '', 'cba'], 'x')

Explanation: At the last step, both 'ba' and 'cba' are suffixes of 'xcba'. The longest suffix match is 'cba', so it is removed.

Input: (['a', 'aa'], '')

Expected Output: ([], '')

Explanation: An empty stream produces no reports, and the final buffer stays empty.

Input: ([], 'abc')

Expected Output: (['', '', ''], 'abc')

Explanation: With no target words, nothing can ever be removed.

Input: (['abc', 'bc'], 'abcbc')

Expected Output: (['', '', 'abc', '', 'bc'], '')

Explanation: At the 3rd character, 'abc' and 'bc' both match as suffixes, so the longer word 'abc' is removed. Later, 'bc' is removed at the end.

Hints

Because the buffer is fully cleaned after each character, any new match created by appending a character must end at that character. So you only need to detect matching suffixes of the current buffer.
Use a trie with failure links (Aho-Corasick). If you also keep a stack of automaton states for the current buffer, you can remove a matched suffix by popping characters and states.

Quick Overview

This question evaluates understanding of online stream processing, substring matching, and the use of efficient data structures and algorithms for real-time updates and deletions.