Debug a broken Transformer implementation

Last updated: Apr 15, 2026

Quick Overview

This question evaluates a candidate's ability to debug and validate a Transformer implementation, focusing on attention masking, parameter initialization, loss alignment, and other implementation-level correctness issues.

  • hard
  • OpenAI
  • Machine Learning
  • Machine Learning Engineer

Company: OpenAI

Role: Machine Learning Engineer

Category: Machine Learning

Difficulty: hard

Interview Round: Onsite

Jan 21, 2026, 12:00 AM

You are given a small Transformer model implementation (e.g., in PyTorch) plus a tiny training script. The code executes, but the model does not match a reference implementation: unit tests that check (1) the forward-pass output for a fixed input/seed and (2) the training loss for one step either fail or are inconsistent.
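
As a rough illustration of the two checks described above, the sketch below compares a candidate model against a reference built from the same seed. `TinyLM`, `build`, `forward_output`, and `one_step_loss` are hypothetical names, and the embedding-plus-linear model is only a stand-in for the interview code, which is not shown on this page.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyLM(nn.Module):
    """Hypothetical stand-in for the model under test (not the interview code)."""

    def __init__(self, vocab_size=11, d_model=8):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):
        return self.head(self.embed(tokens))  # (B, T, V)


def build(seed):
    # Seed BEFORE construction so parameter initialization is reproducible.
    torch.manual_seed(seed)
    return TinyLM()


def forward_output(model, tokens):
    # eval() disables dropout, so the output is deterministic for a fixed input.
    model.eval()
    with torch.no_grad():
        return model(tokens)


def one_step_loss(model, tokens):
    # Next-token loss: logits at position t are scored against token t + 1.
    model.train()
    logits = model(tokens[:, :-1])
    targets = tokens[:, 1:]
    return F.cross_entropy(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))


tokens = torch.tensor([[1, 2, 3, 4, 5], [5, 4, 3, 2, 1]])
reference, candidate = build(0), build(0)

# Check 1: forward-pass output for a fixed input and seed.
torch.testing.assert_close(forward_output(candidate, tokens), forward_output(reference, tokens))
# Check 2: training loss for a single step.
torch.testing.assert_close(one_step_loss(candidate, tokens), one_step_loss(reference, tokens))
```

In a real session the reference would be a separate, trusted implementation. When a check like this fails, comparing intermediate activations layer by layer usually narrows down which component diverges first.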

Task: Debug the model so that it runs end-to-end and matches the expected outputs/loss. The buggy code contains multiple independent issues, including:

  1. An error in the attention mask (wrong shape or broadcasting, or the causal/padding mask is applied incorrectly).
  2. Incorrect parameter initialization (some weights use the wrong distribution or scale, or are never initialized).
  3. A loss-computation bug caused by misaligned positions (e.g., logits and labels are shifted incorrectly for next-token prediction).
  4. One additional hidden bug of similar difficulty (e.g., softmax over the wrong dimension, missing attention scaling by 1/sqrt(d_k), wrong dtype/device handling, dropout/eval-mode misuse, or an off-by-one in sequence lengths).
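
For orientation, here is a minimal sketch of what correct versions of the attention and loss pieces listed above could look like. The function names `causal_attention` and `next_token_loss`, the `(B, H, T, d_k)` tensor layout, and the convention that `key_padding_mask` is `True` for real tokens are all assumptions made for illustration; the reference implementation may use different conventions.

```python
import math

import torch
import torch.nn.functional as F


def causal_attention(q, k, v, key_padding_mask=None):
    """Scaled dot-product attention with a causal mask.

    q, k, v: (B, H, T, d_k). key_padding_mask: optional (B, T) bool tensor,
    True for real tokens. Assumes right-padding, so every query position
    can see at least one real key (otherwise softmax would return NaN).
    """
    T, d_k = q.size(-2), q.size(-1)
    # Divide by sqrt(d_k) so score variance does not grow with head size.
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)  # (B, H, T, T)
    # Lower-triangular mask: query i may attend to keys j <= i.
    allowed = torch.tril(torch.ones(T, T, dtype=torch.bool, device=q.device))
    if key_padding_mask is not None:
        # (T, T) & (B, 1, 1, T) broadcasts to (B, 1, T, T), masking padded keys.
        allowed = allowed & key_padding_mask[:, None, None, :]
    scores = scores.masked_fill(~allowed, float("-inf"))
    weights = F.softmax(scores, dim=-1)  # normalize over the key dimension
    return weights @ v  # (B, H, T, d_k)


def next_token_loss(logits, tokens, ignore_index=-100):
    """logits: (B, T, V) computed from tokens (B, T); position t predicts token t + 1."""
    shift_logits = logits[:, :-1, :]  # drop the last position (nothing to predict)
    shift_labels = tokens[:, 1:]      # drop the first token (nothing predicts it)
    return F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
        ignore_index=ignore_index,
    )


torch.manual_seed(0)
q, k, v = (torch.randn(2, 3, 5, 4) for _ in range(3))
out = causal_attention(q, k, v)
print(out.shape)  # (B, H, T, d_k), same shape as v
```

Two quick properties are worth checking on any attention implementation: the first position can only attend to itself, so its output should equal its own value vector, and changing the last position's values should leave all earlier outputs unchanged.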

Explain how you would systematically find and fix these issues, and what the correct implementations should look like.
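
One way to make the search systematic is to run a few cheap invariant checks before reading the code line by line. The sketch below shows three such checks: weight statistics after initialization, the initial loss compared with ln(vocab size), and a perturbation test for future-token leakage. `ToyLM`, `init_weights`, `weight_stats`, `initial_loss`, and `leaks_future` are hypothetical helpers, the toy model uses simple averaging instead of attention, and the N(0, 0.02^2) initialization is an assumed GPT-2-style scheme, not necessarily what the reference uses.

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


def init_weights(module, std=0.02):
    # Assumed scheme: N(0, std^2) weights and zero biases.
    # The real target is whatever the reference implementation does.
    if isinstance(module, (nn.Linear, nn.Embedding)):
        nn.init.normal_(module.weight, mean=0.0, std=std)
    if isinstance(module, nn.Linear) and module.bias is not None:
        nn.init.zeros_(module.bias)


class ToyLM(nn.Module):
    """Hypothetical toy model: averages embeddings over time, then projects to vocab."""

    def __init__(self, vocab, d_model=16, causal=True):
        super().__init__()
        self.embed = nn.Embedding(vocab, d_model)
        self.head = nn.Linear(d_model, vocab)
        self.causal = causal
        self.apply(init_weights)

    def forward(self, tokens):
        x = self.embed(tokens)  # (B, T, d_model)
        T = x.size(1)
        mix = torch.ones(T, T, device=x.device)
        if self.causal:
            mix = torch.tril(mix)  # position i averages positions j <= i only
        mix = mix / mix.sum(dim=-1, keepdim=True)
        return self.head(mix @ x)  # (B, T, vocab)


def weight_stats(model):
    # Mean and std of each weight matrix, to compare against the intended init.
    return {n: (p.mean().item(), p.std().item())
            for n, p in model.named_parameters() if p.dim() >= 2}


def initial_loss(model, tokens):
    # With small initial weights the logits are near zero, so the
    # next-token loss should be close to ln(vocab).
    model.eval()
    with torch.no_grad():
        logits = model(tokens)
    return F.cross_entropy(logits[:, :-1].reshape(-1, logits.size(-1)),
                           tokens[:, 1:].reshape(-1)).item()


def leaks_future(model, tokens, t, vocab):
    # Change every token after position t; outputs at positions <= t must not move.
    model.eval()
    perturbed = tokens.clone()
    perturbed[:, t + 1:] = (tokens[:, t + 1:] + 1) % vocab
    with torch.no_grad():
        a, b = model(tokens), model(perturbed)
    return not torch.allclose(a[:, :t + 1], b[:, :t + 1], atol=1e-6)


torch.manual_seed(0)
VOCAB = 13
tokens = torch.randint(0, VOCAB, (2, 6))
good, bad = ToyLM(VOCAB, causal=True), ToyLM(VOCAB, causal=False)
print(weight_stats(good))
print(initial_loss(good, tokens), math.log(VOCAB))
print(leaks_future(good, tokens, t=2, vocab=VOCAB), leaks_future(bad, tokens, t=2, vocab=VOCAB))
```

The causal toy model should report no leakage and the non-causal one should report leakage. Each check targets one of the listed bug classes, so a failure points at a specific component. Fixing one bug at a time and rerunning the unit tests after each fix keeps the independent issues from masking one another.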
