Implement Greedy and Beam Search Decoders over Next-Token Probabilities
Context
You are given a directed token graph represented as a Python dictionary mapping each token to a dictionary of next-token probabilities:
- transitions: token -> { next_token: probability }
- A start token (e.g., "I" or "<START>")
- A terminal token '.' that ends the sequence, or a token with no outgoing options (missing key or empty dict)
Probabilities are non-negative and intended to be conditional next-token probabilities. You may assume they are normalized per node, but your implementation should be robust if they are not perfectly normalized.
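For concreteness, a minimal toy graph might look like the following; the token names and probability values are illustrative and not part of the original statement:

```python
# Hypothetical example graph; tokens and probabilities are made up for illustration.
transitions = {
    "<START>": {"I": 0.6, "You": 0.4},
    "I": {"like": 0.5, "love": 0.3, "am": 0.2},
    "You": {"like": 0.7, "are": 0.3},
    "like": {"cats": 0.55, "dogs": 0.45},
    "love": {"cats": 0.9, "dogs": 0.1},
    "am": {"happy": 1.0},
    "are": {"happy": 1.0},
    "cats": {".": 1.0},
    "dogs": {".": 1.0},
    "happy": {".": 1.0},
}
```

On this graph, greedy search from "<START>" would produce `<START> I like cats .`, since the locally best edge is taken at every step.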
Tasks
Greedy Search
- At each step, pick the highest-probability next token.
- Stop when you reach '.' or a token with no outgoing options.
- Provide both iterative and recursive implementations.
- Return the generated token sequence (including the start token); a minimal sketch of both variants follows this list.
Beam Search (BFS-style)
-
Use beam size k. From each partial sequence, expand to all next tokens, then keep only the top-k partial sequences ranked by cumulative log-probability.
-
Stop when all beams end with '.' or when no continuations exist.
-
Return the highest-probability completed sequence, and optionally all completed sequences.
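A BFS-style beam search sketch under the same assumptions. The `max_len` cap is an added safeguard against cyclic graphs and is not specified in the original statement:

```python
import math
from typing import Dict, List, Optional, Tuple

Transitions = Dict[str, Dict[str, float]]

def beam_search(
    transitions: Transitions,
    start: str,
    k: int,
    end: str = ".",
    max_len: int = 100,  # assumption: guards against cycles; not in the original spec
) -> Tuple[Optional[List[str]], List[Tuple[float, List[str]]]]:
    """Return (best completed sequence, all completed (log-prob, tokens) pairs)."""
    beams: List[Tuple[float, List[str]]] = [(0.0, [start])]  # (cumulative log-prob, tokens)
    completed: List[Tuple[float, List[str]]] = []
    while beams and max_len > 0:
        max_len -= 1
        candidates: List[Tuple[float, List[str]]] = []
        for score, seq in beams:
            options = transitions.get(seq[-1])
            if seq[-1] == end or not options:
                completed.append((score, seq))  # beam has terminated
                continue
            for tok, p in options.items():
                if p > 0:  # log() is undefined at 0, so zero-probability edges are pruned
                    candidates.append((score + math.log(p), seq + [tok]))
        # Keep the k best partial sequences; ties broken by the token sequence itself.
        candidates.sort(key=lambda sp: (-sp[0], sp[1]))
        beams = candidates[:k]
    completed.sort(key=lambda sp: (-sp[0], sp[1]))
    best = completed[0][1] if completed else None
    return best, completed
```

Usage would look like `best, finished = beam_search(transitions, "<START>", k=3)`, where `finished` holds every completed sequence with its cumulative log-probability. Summing log-probabilities instead of multiplying raw probabilities avoids floating-point underflow on long sequences.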
Requirements
- Provide clear function signatures.
- Specify deterministic tie-breaking for equal probabilities/scores.
- Include time and space complexity in terms of: sequence length L, branching factor b (max outgoing options per token), vocabulary size |V|, and beam size k.
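For reference, one consistent accounting under the sketches above: greedy search takes at most L steps and scans at most b options per step, so it runs in O(L·b) time and O(L) space (the recursive variant additionally uses O(L) call stack). Beam search expands at most k beams into k·b candidates per step and sorts them, giving O(L · k·b · log(k·b)) time (a heap would reduce the per-step selection to O(k·b · log k)) and O(k·L) space for the stored sequences; |V| enters only as an upper bound on b.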