Implement Autoregressive Decoding Strategies
Company: Cohere
Role: Machine Learning Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Onsite
Quick Answer: This question evaluates proficiency with autoregressive sequence generation, probabilistic sampling methods (greedy, top-k, top-p), softmax-based probability manipulation, and correct handling of end-of-sequence tokens.
Constraints
- 1 <= vocab_size <= 100
- EOS token ID is vocab_size - 1
- `logits_by_context` contains logits for every context reached before EOS
- `method` is one of 'greedy', 'topk', or 'topp'
- For top-k, `param` must be an integer k such that 1 <= k <= vocab_size
- For top-p, `param` must satisfy 0 < p <= 1
- Every value in `random_values` must satisfy 0 <= r < 1
Examples
Input: ({(2,): [0.1, 2.0, -1.0, 0.0, 1.5], (2, 1): [0.0, -0.5, 3.0, 1.0, 2.0], (2, 1, 2): [-1.0, 0.0, 0.5, 1.0, 2.5]}, [2], 'greedy', 0, [])
Expected Output: [1, 2]
Explanation: Greedy chooses token 1, then token 2, then EOS token 4. EOS is not included in the returned output.
Input: ({(): [0.0, 0.0, -0.5, 1.0]}, [], 'greedy', 0, [])
Expected Output: []
Explanation: The empty prompt immediately generates EOS token 3, so no generated tokens are returned.
Input: ({(0,): [2.0, 1.0, 0.0, -1.0, 0.5], (0, 1): [0.0, 3.0, 2.0, 1.0, -1.0], (0, 1, 1): [-1.0, 0.0, 0.5, 1.0, 2.0]}, [0], 'topk', 2, [0.8, 0.2, 0.1])
Expected Output: [1, 1]
Explanation: With k=2, the first sample chooses token 1, the second sample chooses token 1 again, and the third sample chooses EOS token 4.
Input: ({(1,): [2.0, 1.0, 0.0, -1.0], (1, 1): [0.0, 0.0, 0.0, 2.0]}, [1], 'topp', 0.7, [0.9, 0.4])
Expected Output: [1]
Explanation: For the first step, top-p keeps tokens 0 and 1; random value 0.9 selects token 1. The next context's nucleus contains only EOS token 3.
Input: ({(5,): [0.0, 0.0, 0.0]}, [5], 'topp', 1.0, [0.7])
Expected Output: []
Explanation: With p=1.0 all tied tokens are included in token-ID order. Random value 0.7 selects token 2, which is EOS.
Hints
- Keep two sequences: the mutable current model input and the generated output tokens to return.
- For top-k and top-p, sort token IDs by probability descending, using token ID ascending as the tie-breaker.