PracHub

Implement Autoregressive Decoding in PyTorch

Last updated: Jun 5, 2026

Quick Overview

This question evaluates a candidate's ability to implement autoregressive decoding and sampling strategies in PyTorch, testing competencies in sequence generation, probabilistic sampling methods (greedy, temperature, top-k, top-p), batching, and end-of-sequence handling.

Company: Inception

Role: Machine Learning Engineer

Category: Machine Learning

Difficulty: medium

Interview Round: Technical Screen

Jan 10, 2026, 12:00 AM

Implement an autoregressive text-generation function in PyTorch.

You are given a language model that, for an input tensor of token IDs, returns logits for the next-token distribution at every position.

Assume the model can be called as:

logits = model(input_ids)

where:

  • `input_ids` has shape `[batch_size, current_sequence_length]`.
  • `logits` has shape `[batch_size, current_sequence_length, vocab_size]`.
  • The logits for the next token should be taken from `logits[:, -1, :]`.
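To make the calling convention concrete, here is a minimal sketch of a model with this interface. `ToyLM`, its layer sizes, and the vocabulary size of 16 are hypothetical stand-ins chosen for illustration; any module that maps `[batch_size, current_sequence_length]` token IDs to per-position logits would do.

```python
import torch
import torch.nn as nn


class ToyLM(nn.Module):
    """Hypothetical stand-in model: an embedding followed by a linear head."""

    def __init__(self, vocab_size: int = 16, hidden: int = 8):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.head = nn.Linear(hidden, vocab_size)

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        # [batch, seq] -> [batch, seq, hidden] -> [batch, seq, vocab]
        return self.head(self.embed(input_ids))


torch.manual_seed(0)
model = ToyLM()
input_ids = torch.randint(0, 16, (2, 5))  # batch_size=2, current_sequence_length=5

with torch.no_grad():  # no gradients are needed for generation
    logits = model(input_ids)  # [2, 5, 16]

# Only the last position predicts the token that comes next.
next_token_logits = logits[:, -1, :]  # [2, 16]
```

Slicing with `-1` drops the sequence dimension, so every decoding strategy can work on a plain `[batch_size, vocab_size]` tensor.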

Write a generation function that supports common decoding strategies:

  1. Greedy decoding.
  2. Temperature sampling.
  3. Top-k sampling.
  4. Top-p / nucleus sampling.
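One possible way to organize the four strategies is as transformations on the `[batch_size, vocab_size]` next-token logits, followed by either an argmax or a multinomial draw. The helper names `filter_logits` and `pick_next_token` and the `do_sample` flag are assumptions of this sketch, not part of the problem statement.

```python
import torch


def filter_logits(logits, temperature=1.0, top_k=None, top_p=None):
    """Apply temperature, then top-k, then top-p to [batch, vocab] logits."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0; use greedy decoding instead")
    logits = logits / temperature  # new tensor; the caller's logits are untouched

    if top_k is not None:
        k = min(top_k, logits.size(-1))  # guard against top_k > vocab_size
        kth = torch.topk(logits, k, dim=-1).values[..., -1:]  # k-th largest per row
        # Ties with the k-th value are kept, so slightly more than k may survive.
        logits = logits.masked_fill(logits < kth, float("-inf"))

    if top_p is not None and top_p < 1.0:
        sorted_logits, sorted_idx = torch.sort(logits, dim=-1, descending=True)
        probs = torch.softmax(sorted_logits, dim=-1)
        mass_before = probs.cumsum(dim=-1) - probs
        # Drop a token once the mass before it already reaches top_p, so the
        # token that crosses the threshold is kept.
        remove = mass_before >= top_p
        remove[..., 0] = False  # always keep the most likely token
        sorted_logits = sorted_logits.masked_fill(remove, float("-inf"))
        # Undo the sort: write each value back to its original vocabulary slot.
        logits = torch.full_like(logits, float("-inf")).scatter(
            -1, sorted_idx, sorted_logits
        )
    return logits


def pick_next_token(logits, do_sample=True, **filter_kwargs):
    """Return [batch] token IDs: argmax if greedy, otherwise a sampled token."""
    if not do_sample:
        return logits.argmax(dim=-1)
    probs = torch.softmax(filter_logits(logits, **filter_kwargs), dim=-1)
    return torch.multinomial(probs, num_samples=1).squeeze(-1)
```

Masking with `-inf` rather than zeroing probabilities lets a single `softmax` renormalize whatever survives the filters. Applying temperature before top-p is a design choice of this sketch; it means the nucleus is measured on the tempered distribution.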

The function should repeatedly generate one token at a time until either:

  • `max_new_tokens` tokens have been generated, or
  • every sequence in the batch has produced `eos_token_id`, if provided.
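The two stopping conditions can be sketched as a loop that tracks a per-row `finished` mask. The names `select_fn`, `pad_token_id`, and `CountingModel` are hypothetical: `select_fn` stands for any function mapping `[batch, vocab]` logits to `[batch]` token IDs (greedy argmax by default), and reusing EOS as padding when no `pad_token_id` is given is an assumption of this sketch.

```python
import torch


@torch.no_grad()
def generate(model, input_ids, max_new_tokens, eos_token_id=None,
             pad_token_id=None, select_fn=None):
    """Append up to max_new_tokens tokens to each row of input_ids."""
    if select_fn is None:
        select_fn = lambda logits: logits.argmax(dim=-1)  # greedy default
    if pad_token_id is None:
        pad_token_id = eos_token_id  # assumption: reuse EOS as padding

    finished = torch.zeros(input_ids.size(0), dtype=torch.bool,
                           device=input_ids.device)
    for _ in range(max_new_tokens):
        next_logits = model(input_ids)[:, -1, :]  # [batch, vocab]
        next_token = select_fn(next_logits)       # [batch]
        if eos_token_id is not None:
            # Rows that already ended emit padding instead of new tokens.
            pad = torch.full_like(next_token, pad_token_id)
            next_token = torch.where(finished, pad, next_token)
            finished = finished | (next_token == eos_token_id)
        input_ids = torch.cat([input_ids, next_token.unsqueeze(-1)], dim=-1)
        if eos_token_id is not None and bool(finished.all()):
            break  # every sequence has produced EOS
    return input_ids


class CountingModel(torch.nn.Module):
    """Hypothetical toy model: always predicts (token + 1) mod vocab_size."""

    def __init__(self, vocab_size=10):
        super().__init__()
        self.vocab_size = vocab_size

    def forward(self, input_ids):
        nxt = (input_ids + 1) % self.vocab_size
        return torch.nn.functional.one_hot(nxt, self.vocab_size).float()


prompts = torch.tensor([[1, 2], [5, 6]])
out = generate(CountingModel(), prompts, max_new_tokens=5, eos_token_id=4)
```

In this toy run the first row reaches EOS after two steps and is padded from then on, while the second row never produces EOS and runs to `max_new_tokens`. The loop re-encodes the whole sequence on every step; a real model would typically use a key/value cache, which the interface given in the question does not expose.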

Discuss important implementation details and edge cases.
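Several edge cases come down to rejecting or special-casing argument values before the loop starts. The sketch below shows one way to do that; the function name `validate_decoding_args` and the exact policy (raising `ValueError` rather than silently clamping) are assumptions, and an interviewer may prefer different behavior, such as treating a temperature of 0 as greedy decoding.

```python
def validate_decoding_args(max_new_tokens, temperature=1.0, top_k=None, top_p=None):
    """Raise ValueError for argument values that would break sampling."""
    if max_new_tokens < 0:
        raise ValueError("max_new_tokens must be >= 0")
    if temperature <= 0:
        # Dividing logits by 0 is undefined; a negative value inverts the ranking.
        raise ValueError("temperature must be > 0")
    if top_k is not None and top_k < 1:
        # top_k = 0 would mask every token and leave nothing to sample.
        raise ValueError("top_k must be >= 1")
    if top_p is not None and not (0.0 < top_p <= 1.0):
        raise ValueError("top_p must be in (0, 1]")


validate_decoding_args(max_new_tokens=20, temperature=0.8, top_k=50, top_p=0.95)
```

Other points worth raising in discussion include disabling gradients and switching the model to evaluation mode, clamping `top_k` to the vocabulary size, always keeping at least one token after top-p filtering, and deciding what finished sequences emit while the rest of the batch continues.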
