PracHub

Compute Top-K word frequencies under a path

Last updated: Mar 29, 2026

Quick Overview

This question evaluates understanding of text parsing and frequency-counting algorithms, external-memory and distributed processing techniques, and streaming/approximate heavy-hitter methods with attention to accuracy/latency/storage trade-offs.

Company: Box

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: Medium

Interview Round: Onsite

Given a filesystem path that may contain nested subdirectories and files, compute the top K most frequent words across all files. Describe an in-memory solution and its complexity. Follow-up: when the corpus is too large to fit in memory, propose scalable approaches (e.g., external sorting/partitioning, MapReduce-style sharding and merge) and an approximate heavy-hitters approach (e.g., Count–Min Sketch with Space-Saving), including accuracy/latency/storage trade-offs.
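A minimal sketch of the in-memory approach, under stated assumptions: a word is a lowercase run of ASCII letters, digits, or apostrophes, files are read as UTF-8 with undecodable bytes ignored, and unreadable files are skipped. The names `iter_words` and `top_k_words` are illustrative and not part of the original question. Counting N tokens takes O(N) time and O(U) space for U distinct words; selecting the top K with a bounded heap adds O(U log K).

```python
import heapq
import os
import re
from collections import Counter

# Assumed word definition: runs of ASCII letters, digits, or apostrophes.
WORD_RE = re.compile(r"[a-z0-9']+")


def iter_words(root):
    """Yield lowercase words from every file under root, recursively."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", encoding="utf-8", errors="ignore") as fh:
                    for line in fh:  # stream line by line, never whole files
                        yield from WORD_RE.findall(line.lower())
            except OSError:
                continue  # assumption: skip files that cannot be opened


def top_k_words(root, k):
    """Return up to k (word, count) pairs, most frequent first.

    Ties are broken alphabetically so the output is deterministic.
    """
    counts = Counter(iter_words(root))
    return heapq.nsmallest(k, counts.items(), key=lambda kv: (-kv[1], kv[0]))
```

Calling `top_k_words("/some/path", 10)` would return a list of `(word, count)` tuples. Only the counter has to fit in memory, so this works as long as the number of distinct words is manageable, regardless of total corpus size.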

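For the approximate heavy-hitters follow-up, the Space-Saving part can be sketched as below. This is a simplified illustration, not a reference implementation: it finds the smallest counter with a linear scan instead of the linked stream-summary structure usually paired with the algorithm, and it does not include a Count-Min Sketch. The class and method names are hypothetical.

```python
class SpaceSaving:
    """Track approximate counts for at most `capacity` words."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.counts = {}  # word -> estimated count (never an underestimate)
        self.errors = {}  # word -> maximum possible overestimate

    def add(self, word):
        if word in self.counts:
            self.counts[word] += 1
        elif len(self.counts) < self.capacity:
            self.counts[word] = 1
            self.errors[word] = 0
        else:
            # Evict the word with the smallest counter; the newcomer inherits
            # that count as its error bound. O(capacity) scan for simplicity.
            victim = min(self.counts, key=self.counts.get)
            floor = self.counts.pop(victim)
            del self.errors[victim]
            self.counts[word] = floor + 1
            self.errors[word] = floor

    def top_k(self, k):
        """Return up to k (word, estimated_count, max_error) triples."""
        ranked = sorted(self.counts.items(), key=lambda kv: (-kv[1], kv[0]))
        return [(w, c, self.errors[w]) for w, c in ranked[:k]]
```

With m counters over a stream of N tokens, memory is O(m) regardless of vocabulary size, each reported count overestimates the true count by at most N/m, and any word occurring more than N/m times is guaranteed to be tracked. Raising m trades storage for accuracy; the result is available in a single pass, at the cost of exactness for low-frequency words.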


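For the case where even the distinct-word table does not fit in memory, one possible external partitioning scheme is sketched below. It assumes words arrive as an iterable of strings without embedded newlines, uses `zlib.crc32` as a stable hash, and spills to temporary files on local disk. The function name `top_k_partitioned` and the default of 16 partitions are illustrative choices, not something the question prescribes.

```python
import heapq
import os
import tempfile
import zlib
from collections import Counter


def top_k_partitioned(words, k, num_partitions=16):
    """Exact top-k using hash partitioning to bound per-step memory."""

    def rank(kv):
        return (-kv[1], kv[0])  # highest count first, then alphabetical

    with tempfile.TemporaryDirectory() as tmp:
        paths = [os.path.join(tmp, f"part-{i}.txt") for i in range(num_partitions)]
        handles = [open(p, "w", encoding="utf-8") for p in paths]
        try:
            # Pass 1: route each word to exactly one partition by hash, so
            # all occurrences of a given word end up in the same file.
            for word in words:
                idx = zlib.crc32(word.encode("utf-8")) % num_partitions
                handles[idx].write(word + "\n")
        finally:
            for fh in handles:
                fh.close()

        # Pass 2: count one partition at a time. Per-partition counts are
        # already global counts, so only each partition's top k can matter.
        candidates = []
        for p in paths:
            with open(p, "r", encoding="utf-8") as fh:
                counts = Counter(line.rstrip("\n") for line in fh)
            candidates.extend(heapq.nsmallest(k, counts.items(), key=rank))

        return heapq.nsmallest(k, candidates, key=rank)
```

The same shape carries over to a MapReduce-style job: the hash routing plays the role of the shuffle, each partition is a reducer, and the final merge sees at most `k * num_partitions` candidates. The trade-off against the in-memory version is an extra write and read of the whole token stream in exchange for bounded memory per partition.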

© 2026 PracHub. All rights reserved.