PracHub
QuestionsPremiumLearningGuidesCheatsheetNEWCoaches
|Home/Coding & Algorithms/Google

Build a next-word frequency predictor

Last updated: May 8, 2026

Quick Overview

This question evaluates understanding of data structures and algorithms for frequency counting and associative lookups, including preprocessing-versus-query trade-offs, tie-breaking, and handling edge cases in sequential data.

  • easy
  • Google
  • Coding & Algorithms
  • Software Engineer

Build a next-word frequency predictor

Company: Google

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: easy

Interview Round: Onsite

You are given a sequence of tokens (words) representing a corpus, e.g.: `tokens = [w0, w1, w2, ..., wK-1]` You need to build a **next-word predictor** that supports queries of the form: - `predict(word) -> next_word` Where `next_word` is the word that **most frequently follows** `word` in the corpus. More precisely: - For every adjacent pair `(tokens[i], tokens[i+1])`, count it as one occurrence of “`tokens[i+1]` follows `tokens[i]`”. - For a query `word = x`, return the `y` that maximizes `count(x, y)`. - If `x` never appears or never has a following word (e.g., it only appears as the last token), return a sentinel such as `null` / empty string. - If there is a tie for max frequency, you may return any tied word (or specify a deterministic tie-break, e.g., lexicographically smallest). ### Task Design the data structures and implement: 1. A one-time `build(tokens)` preprocessing step. 2. A `predict(word)` query. ### Constraints / expectations - The corpus is processed once (batch/offline build), then many queries may follow. - Aim for efficient preprocessing and fast queries.

Quick Answer: This question evaluates understanding of data structures and algorithms for frequency counting and associative lookups, including preprocessing-versus-query trade-offs, tie-breaking, and handling edge cases in sequential data.

Related Interview Questions

  • Compute Turnstile Crossing Times - Google (hard)
  • Simulate In-Place Cellular State Updates - Google (hard)
  • Determine Whether a Word Exists in a Graph - Google (medium)
  • Solve Shortest Paths and Rental Allocation - Google (medium)
  • Solve Two Array Optimization Problems - Google (medium)
Google logo
Google
Feb 12, 2026, 12:00 AM
Software Engineer
Onsite
Coding & Algorithms
27
0
Loading...

You are given a sequence of tokens (words) representing a corpus, e.g.:

tokens = [w0, w1, w2, ..., wK-1]

You need to build a next-word predictor that supports queries of the form:

  • predict(word) -> next_word

Where next_word is the word that most frequently follows word in the corpus.

More precisely:

  • For every adjacent pair (tokens[i], tokens[i+1]) , count it as one occurrence of “ tokens[i+1] follows tokens[i] ”.
  • For a query word = x , return the y that maximizes count(x, y) .
  • If x never appears or never has a following word (e.g., it only appears as the last token), return a sentinel such as null / empty string.
  • If there is a tie for max frequency, you may return any tied word (or specify a deterministic tie-break, e.g., lexicographically smallest).

Task

Design the data structures and implement:

  1. A one-time build(tokens) preprocessing step.
  2. A predict(word) query.

Constraints / expectations

  • The corpus is processed once (batch/offline build), then many queries may follow.
  • Aim for efficient preprocessing and fast queries.

Comments (0)

Sign in to leave a comment

Loading comments...

Browse More Questions

More Coding & Algorithms•More Google•More Software Engineer•Google Software Engineer•Google Coding & Algorithms•Software Engineer Coding & Algorithms
PracHub

Master your tech interviews with 7,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.