Implement greedy and beam decoding
Company: Cresta
Role: Machine Learning Engineer
Category: Machine Learning
Difficulty: medium
Interview Round: Technical Screen
Given a dictionary that maps each token to a dictionary of next-token probabilities (token -> {next_token: prob}) and a start token (e.g., "I" or "<START>"), implement two decoders for sequence generation:
1) Greedy search: at each step, pick the highest-probability next token and continue until you hit a terminal token '.' or a token with no outgoing options. Provide both recursive and iterative implementations and return the generated token sequence.
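One way to sketch both greedy variants, assuming the transition table is a plain dict of dicts as described above (the function and variable names here are illustrative, not prescribed by the question):

```python
def greedy_iterative(model, start):
    """Iteratively follow the highest-probability next token.

    `model` maps token -> {next_token: prob}. Stops at '.' or at a
    token with no outgoing options. Ties on probability are broken
    deterministically by picking the lexicographically smallest token.
    """
    seq = [start]
    while seq[-1] != '.' and model.get(seq[-1]):
        options = model[seq[-1]]
        # min over (-prob, token): highest prob first, then alphabetical.
        seq.append(min(options, key=lambda t: (-options[t], t)))
    return seq


def greedy_recursive(model, token):
    """Recursive variant: base case is '.' or a token with no continuations."""
    if token == '.' or not model.get(token):
        return [token]
    options = model[token]
    best = min(options, key=lambda t: (-options[t], t))
    return [token] + greedy_recursive(model, best)
```

Both run in O(L * b) time for a sequence of length L with branching factor b (each step scans one token's options); the iterative version uses O(L) space for the output, while the recursive one also uses O(L) stack.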
2) Beam search: using a BFS-style expansion with beam size k, from each partial sequence keep only the top-k continuations at each step (ranked by cumulative log-probability); stop when all beams end with '.' or no continuations exist. Return the highest-probability completed sequence, and optionally all completed sequences. Clearly specify function signatures, tie-breaking for equal probabilities, and analyze time and space complexity in terms of sequence length L, branching factor b, vocabulary size |V|, and beam size k.
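A minimal beam-search sketch under the same assumptions (dict-of-dicts transition table, illustrative names). It ranks partial sequences by cumulative log-probability, stored negated so that smaller is better, and breaks ties lexicographically on the sequence itself:

```python
import math


def beam_search(model, start, k=2):
    """BFS-style beam search over a next-token probability table.

    Keeps the top-k partial sequences by cumulative log-probability at
    each step; a sequence is completed when it ends in '.' or its last
    token has no continuations. Returns the best completed sequence.
    """
    # Each beam is (neg_cumulative_logprob, sequence); lower is better.
    beams = [(0.0, [start])]
    completed = []
    while beams:
        candidates = []
        for neg_lp, seq in beams:
            last = seq[-1]
            if last == '.' or not model.get(last):
                completed.append((neg_lp, seq))  # terminal: retire this beam
                continue
            for tok, p in model[last].items():
                candidates.append((neg_lp - math.log(p), seq + [tok]))
        # Keep only the top-k continuations; tie-break on the sequence
        # itself so equal-probability beams are ordered deterministically.
        candidates.sort(key=lambda c: (c[0], c[1]))
        beams = candidates[:k]
    completed.sort(key=lambda c: (c[0], c[1]))
    return completed[0][1] if completed else [start]
```

Per step this expands at most k beams into k*b candidates and sorts them, so the total cost is O(L * k * b * log(k*b)) time and O(k * L) space for the live beams (plus whatever completed sequences are retained); with a full softmax vocabulary, b becomes |V|.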
Quick Answer: This question evaluates understanding of sequence-decoding search algorithms (greedy and beam search), handling of probabilistic next-token tables, deterministic tie-breaking, and algorithmic analysis of time and space complexity.