Given a filesystem path that may contain nested subdirectories and files, compute the top K most frequent words across all files. Describe an in-memory solution and its time/space complexity. Follow-up: when the corpus no longer fits in memory, propose scalable approaches (e.g., external sorting/partitioning, or MapReduce-style sharding with a final merge) and an approximate heavy-hitters approach (e.g., a Count–Min Sketch, or the Space-Saving algorithm), including the accuracy/latency/storage trade-offs.
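A minimal sketch of the in-memory solution, using a hypothetical `top_k_words` helper and assuming UTF-8 text files with a simple regex tokenizer (both are illustrative choices, not requirements of the problem). Counting is O(W) over the W total words, and selecting the top K from D distinct words with a heap is O(D log K), cheaper than a full O(D log D) sort when K is small:

```python
import os
import re
import heapq
from collections import Counter

def top_k_words(root: str, k: int) -> list[tuple[str, int]]:
    """Walk the directory tree rooted at `root`, count words across
    all readable files, and return the K most frequent (word, count)
    pairs in descending order of count."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Stream line by line so a single huge file does not
                # need to be loaded whole; only the counts stay resident.
                with open(path, encoding="utf-8", errors="ignore") as f:
                    for line in f:
                        counts.update(re.findall(r"[a-z']+", line.lower()))
            except OSError:
                continue  # skip unreadable files (permissions, races)
    # O(D log K) heap selection over D distinct words.
    return heapq.nlargest(k, counts.items(), key=lambda kv: kv[1])
```

Space is O(D) for the distinct-word counter, which is exactly the term that blows up on a large corpus and motivates the partitioned and approximate follow-ups.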
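For the approximate follow-up, one possible sketch of the heavy-hitters approach: a Count–Min Sketch gives frequency estimates in fixed memory (width × depth counters), with the guarantee that estimates only overcount, while a small candidate set tracks the current top K. The class and the `top_k_stream` helper below are illustrative names, and the pruning policy (keep 2K candidates) is an assumption, not part of the standard algorithm:

```python
import hashlib
import heapq

class CountMinSketch:
    """Fixed-memory approximate counter. estimate(x) >= true count;
    the overestimate is bounded by eps * total_count with probability
    1 - delta for width ~ e/eps and depth ~ ln(1/delta)."""

    def __init__(self, width: int = 2048, depth: int = 4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _indexes(self, item: str):
        # One independent hash per row, derived by salting blake2b.
        for row in range(self.depth):
            h = hashlib.blake2b(item.encode(), digest_size=8,
                                salt=row.to_bytes(2, "big")).digest()
            yield row, int.from_bytes(h, "big") % self.width

    def add(self, item: str, count: int = 1) -> None:
        for row, col in self._indexes(item):
            self.table[row][col] += count

    def estimate(self, item: str) -> int:
        # Minimum across rows limits the damage from hash collisions.
        return min(self.table[row][col] for row, col in self._indexes(item))

def top_k_stream(words, k: int, sketch: CountMinSketch):
    """Single pass over a word stream: update the sketch, and keep a
    bounded candidate dict of likely heavy hitters (pruned to the top
    K whenever it exceeds 2K entries)."""
    candidates: dict[str, int] = {}
    for w in words:
        sketch.add(w)
        candidates[w] = sketch.estimate(w)
        if len(candidates) > 2 * k:
            keep = heapq.nlargest(k, candidates.items(), key=lambda kv: kv[1])
            candidates = dict(keep)
    return heapq.nlargest(k, candidates.items(), key=lambda kv: kv[1])
```

The trade-off surface: a wider/deeper sketch lowers the overestimation error at the cost of memory, the candidate-set bound trades recall of borderline items for space, and everything runs in one streaming pass, so latency per word is O(depth) regardless of corpus size.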