PracHub

Implement Beam Search With Length Normalization

Last updated: Apr 22, 2026

Quick Overview

This question tests whether you can implement two sequence decoding algorithms, greedy decoding and beam search, and whether you understand how sequences are scored with cumulative log-probabilities and length normalization.



Company: Sealth

Role: Machine Learning Engineer

Category: Machine Learning

Difficulty: easy

Interview Round: Onsite




In a sequence generation model, you are given:

  • a start token <bos>
  • an end token <eos>
  • a maximum output length max_len
  • a beam size k
  • a function next_log_probs(prefix) that returns the log-probability of each possible next token given the current prefix

Write a decoder that:

  1. Performs greedy decoding when k = 1.
  2. Performs beam search when k > 1, keeping the top k candidate hypotheses at each decoding step.
  3. Uses cumulative log-probability to score sequences.
  4. Returns the best completed sequence, or the best partial sequence if no candidate reaches <eos> before max_len.
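
The four requirements above can be sketched in Python as follows. This is one possible reading of the task, not a reference solution. It assumes that next_log_probs(prefix) takes a list of tokens and returns a dict mapping each candidate token to its log-probability, that max_len counts generated tokens (excluding <bos>), and that a hypothesis ending in <eos> leaves the beam and is set aside as finished. The name beam_search and its return shape are illustrative choices.

```python
from typing import Callable, Dict, List, Tuple

Hypothesis = Tuple[List[str], float]  # (tokens, cumulative log-probability)


def beam_search(
    next_log_probs: Callable[[List[str]], Dict[str, float]],
    k: int,
    max_len: int,
    bos: str = "<bos>",
    eos: str = "<eos>",
) -> Hypothesis:
    """Decode with beam size k; k = 1 reduces to greedy decoding.

    Assumption: next_log_probs returns {token: log_prob} for the prefix.
    """
    beams: List[Hypothesis] = [([bos], 0.0)]  # live (unfinished) hypotheses
    finished: List[Hypothesis] = []           # hypotheses that ended in eos

    for _ in range(max_len):
        # Expand every live hypothesis by every possible next token.
        candidates: List[Hypothesis] = []
        for tokens, score in beams:
            for tok, lp in next_log_probs(tokens).items():
                candidates.append((tokens + [tok], score + lp))
        if not candidates:
            break

        # Keep the k best expansions by cumulative log-probability.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:k]:
            if tokens[-1] == eos:
                finished.append((tokens, score))
            else:
                beams.append((tokens, score))
        if not beams:  # every surviving hypothesis has finished
            break

    # Prefer completed sequences; fall back to the best partial one.
    pool = finished if finished else beams
    return max(pool, key=lambda c: c[1])
```

With k = 1 only the single best expansion survives each step, so the loop behaves as greedy decoding. Other designs are equally defensible, for example letting finished hypotheses keep occupying beam slots, so it is worth stating the chosen convention in an interview.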

Then explain why using raw cumulative log-probability tends to penalize longer sequences, and describe how to reduce this bias using a method such as average log-probability or length normalization.
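
As a starting point for the explanation: every token's log-probability is at most 0, so each extra token can only lower the cumulative sum, and a short hypothesis tends to outscore a longer one even when the longer one is more likely per token. A common remedy is to divide the cumulative score by the sequence length raised to a power alpha. The sketch below assumes hypotheses are (tokens, cumulative log-probability) pairs whose token lists start with <bos>, and that length means the number of generated tokens. The helper names normalized_score and pick_best are hypothetical.

```python
from typing import List, Tuple

Hypothesis = Tuple[List[str], float]  # (tokens incl. <bos>, cumulative log-prob)


def normalized_score(cum_log_prob: float, length: int, alpha: float = 1.0) -> float:
    """Length-normalized score: cum_log_prob / length**alpha.

    alpha = 0 gives the raw cumulative score; alpha = 1 gives the
    average log-probability per generated token.
    """
    return cum_log_prob / (max(length, 1) ** alpha)


def pick_best(hypotheses: List[Hypothesis], alpha: float = 1.0) -> Hypothesis:
    """Re-rank hypotheses by normalized score and return the best one."""
    return max(
        hypotheses,
        # len(tokens) - 1 excludes <bos> from the length (an assumption).
        key=lambda h: normalized_score(h[1], len(h[0]) - 1, alpha),
    )


# Illustrative numbers only: a short and a long hypothesis.
short = (["<bos>", "b", "<eos>"], -1.0)            # average -0.5 per token
long_ = (["<bos>", "a", "c", "d", "<eos>"], -1.2)  # average -0.3 per token

raw_winner = pick_best([short, long_], alpha=0.0)   # raw sum favors short
norm_winner = pick_best([short, long_], alpha=1.0)  # average favors long_
```

Here normalization is applied only when choosing the final answer. Applying it during pruning at every step is another option, and values of alpha between 0 and 1 give a softer correction than the plain average.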

