Implement a class that preprocesses a document and supports repeated keyword lookups.
Requirements:
-
The constructor receives a text string containing lowercase words separated by non-letter characters.
-
query(term)
returns the zero-based starting character offsets of every exact whole-word match of
term
, sorted in ascending order.
-
The data structure should be optimized for many queries against the same text.
Follow-up:
Add fuzzy_query(term) so it also matches whole words that are within one insertion or one deletion of term. For example, a query for cat should be able to match cats and ct, but not cut.
Discuss the data structures you would use and the trade-offs between preprocessing cost, query latency, and memory usage.