PracHub
QuestionsPremiumLearningGuidesInterview PrepNEWCoaches

Quick Overview

This question evaluates understanding of online stream processing, substring matching, and the use of efficient data structures and algorithms for real-time updates and deletions.

  • medium
  • Google
  • Coding & Algorithms
  • Software Engineer

Detect and remove matched words in a char stream

Company: Google

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Onsite

You are processing a stream of characters (arriving one-by-one). You are also given a list of target words. Each time a new character arrives, it is appended to the end of the current stream buffer. After appending, if the buffer contains **any** target word as a **contiguous substring**, you must: 1. **Return/report** a matched word (assume if multiple words match, you may return any one of them, but you must define and follow a consistent rule such as “return the first match found” or “return the longest match”). 2. **Delete** that occurrence of the matched word from the buffer (remove exactly those characters), then continue processing future characters. ### Clarifications to make explicit in your solution - Can matches overlap? If multiple matches exist simultaneously, which one do you remove? - Do you remove only one word per incoming character, or repeatedly remove until no word remains? ### Input/Output (one reasonable interface) - Input: `words: List[str]`, and a sequence of calls `query(c: char)`. - Output per call: the matched word (or empty string / null if no match). The internal buffer should reflect deletions. ### Constraints (typical interview-scale) - Number of words: up to ~10^4 - Total length of all words: up to ~10^5 - Stream length can be large (online processing required).

Quick Answer: This question evaluates understanding of online stream processing, substring matching, and the use of efficient data structures and algorithms for real-time updates and deletions.

Implement a batch version of an online character-stream processor. You are given a list of target words and a stream string. Process the stream left to right, as if each character were passed to query(c). The buffer starts empty. For each incoming character, append it to the buffer. If one or more target words now appear as contiguous substrings, remove exactly one matched word using this rule: remove the longest target word that is a suffix of the current buffer; if there is still a tie, choose the one whose earliest occurrence in words comes first. Report the removed word for that character, or an empty string if nothing is removed. Return all reports in order, along with the final buffer. Overlaps are allowed. Only one deletion can happen per incoming character, because after every step the buffer is clean, so any newly created match must end at the newest character.

Constraints

  • 0 <= len(words) <= 10^4
  • 0 <= sum(len(word) for word in words) <= 10^5
  • 0 <= len(stream) <= 10^5
  • Each word is non-empty and consists only of lowercase English letters
  • stream consists only of lowercase English letters

Examples

Input: (['ab', 'bc'], 'zabcbc')

Expected Output: (['', '', 'ab', '', '', 'bc'], 'zc')

Explanation: After the 3rd character, the buffer is 'zab', so 'ab' is removed. Later, after the 6th character, the buffer is 'zcbc', so 'bc' is removed, leaving 'zc'.

Input: (['ba', 'cba'], 'xcba')

Expected Output: (['', '', '', 'cba'], 'x')

Explanation: At the last step, both 'ba' and 'cba' are suffixes of 'xcba'. The longest suffix match is 'cba', so it is removed.

Input: (['a', 'aa'], '')

Expected Output: ([], '')

Explanation: An empty stream produces no reports, and the final buffer stays empty.

Input: ([], 'abc')

Expected Output: (['', '', ''], 'abc')

Explanation: With no target words, nothing can ever be removed.

Input: (['abc', 'bc'], 'abcbc')

Expected Output: (['', '', 'abc', '', 'bc'], '')

Explanation: At the 3rd character, 'abc' and 'bc' both match as suffixes, so the longer word 'abc' is removed. Later, 'bc' is removed at the end.

Hints

  1. Because the buffer is fully cleaned after each character, any new match created by appending a character must end at that character. So you only need to detect matching suffixes of the current buffer.
  2. Use a trie with failure links (Aho-Corasick). If you also keep a stack of automaton states for the current buffer, you can remove a matched suffix by popping characters and states.
Last updated: May 18, 2026

Loading coding console...

PracHub

Master your tech interviews with 7,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Solve Flower Placement and Directory Deletion - Google (medium)
  • Compute Turnstile Crossing Times - Google (hard)
  • Simulate In-Place Cellular State Updates - Google (hard)
  • Determine Whether a Word Exists in a Graph - Google (medium)
  • Implement Checksums and Feature Rollout Evaluation - Google (medium)