Debug MNIST denoiser training
Company: Luma AI
Role: Machine Learning Engineer
Category: ML System Design
Difficulty: medium
Interview Round: Technical Screen
How would you debug and fix a Colab notebook that trains a denoising neural network on MNIST so that (a) the training loss steadily decreases and (b) the evaluation loss is close to the training loss? Specifically:
(1) Data: detect and correct a train/test distribution mismatch so the test set covers all digits 0–9, and ensure identical normalization statistics are applied to training and test data;
(2) Model: remove or replace an inappropriate final ReLU so the output range supports negative values typical of the denoised signal;
(3) Optimization: add the missing optimizer.zero_grad() and put backward() and optimizer.step() in the correct order.
Describe the sanity checks, assertions, metrics, and minimal code changes you would use to validate each fix, and show the key code snippets.
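For the data fix in item (1), one possible set of checks is sketched below. It uses synthetic tensors in place of the notebook's real MNIST tensors, and the helper names (`assert_covers_all_digits`, `stratified_split`, `fit_stats`, `normalize`) are hypothetical, not taken from the notebook. The idea is to count labels per split, fail loudly when a digit is missing, re-split so every digit appears in both splits, and compute normalization statistics on the training split only before applying them to both splits.

```python
import torch

NUM_CLASSES = 10

def assert_covers_all_digits(labels, split_name):
    """Return per-digit counts; fail if any digit 0-9 is absent from the split."""
    counts = torch.bincount(labels, minlength=NUM_CLASSES)
    missing = torch.nonzero(counts == 0).flatten().tolist()
    assert not missing, f"{split_name} split is missing digits {missing}"
    return counts

def stratified_split(labels, test_fraction=0.2):
    """Split indices digit by digit so both splits contain every class present."""
    train_idx, test_idx = [], []
    for digit in range(NUM_CLASSES):
        idx = torch.nonzero(labels == digit).flatten()
        idx = idx[torch.randperm(len(idx))]  # shuffle within the class
        n_test = max(1, round(len(idx) * test_fraction))
        test_idx.append(idx[:n_test])
        train_idx.append(idx[n_test:])
    return torch.cat(train_idx), torch.cat(test_idx)

def fit_stats(train_images):
    """Normalization statistics come from the training split only."""
    return train_images.mean().item(), train_images.std().item()

def normalize(images, mean, std):
    return (images - mean) / std

# Synthetic stand-in for MNIST (assumption): 400 images, 40 per digit.
torch.manual_seed(0)
images = torch.rand(400, 1, 28, 28)
labels = torch.arange(NUM_CLASSES).repeat(40)

# Reproduce the suspected bug: a test split that only holds digits 0-4.
buggy_test_labels = labels[labels < 5]
try:
    assert_covers_all_digits(buggy_test_labels, "test")
    bug_detected = False
except AssertionError:
    bug_detected = True

# Fix: re-split per digit, then verify coverage on both splits.
train_idx, test_idx = stratified_split(labels)
train_counts = assert_covers_all_digits(labels[train_idx], "train")
test_counts = assert_covers_all_digits(labels[test_idx], "test")

# Same (mean, std) for both splits, fitted on train only.
mean, std = fit_stats(images[train_idx])
train_x = normalize(images[train_idx], mean, std)
test_x = normalize(images[test_idx], mean, std)

# After normalization the train mean should be ~0 and the test mean close to it.
assert abs(train_x.mean().item()) < 1e-3
assert abs(test_x.mean().item()) < 0.05
```

In a real notebook the same assertions would run on the actual label tensors right after the split is built, and the per-digit counts can be printed side by side as a quick metric for how far the two splits diverge. If the notebook builds its own split, switching to the official MNIST train/test split is another way to get full digit coverage.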
Quick Answer: This question tests whether you can debug an ML training pipeline systematically across data, model, and optimization. A strong answer states assumptions, handles edge cases, explains trade-offs, and shows how to validate each fix with concrete checks.
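For the model and optimization fixes in items (2) and (3), a minimal sketch could look like the following. It is not the notebook's code: `Denoiser`, `make_split`, `train`, and `evaluate` are hypothetical names, the data is synthetic zero-mean noise standing in for normalized MNIST, and the learning rate and tolerance values are assumptions chosen for illustration. The sketch ends the network with a linear layer instead of a ReLU, runs each step as `zero_grad()`, forward, `backward()`, `step()`, and then checks both goals from the question.

```python
import torch
from torch import nn

torch.manual_seed(0)

class Denoiser(nn.Module):
    """Tiny conv denoiser whose last layer is linear (no final ReLU)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(8, 1, kernel_size=3, padding=1),  # no activation after this layer
        )

    def forward(self, x):
        return self.net(x)

def make_split(n, noise_std=0.5):
    # Synthetic stand-in for normalized MNIST (assumption): targets include negatives.
    clean = torch.randn(n, 1, 28, 28)
    noisy = clean + noise_std * torch.randn_like(clean)
    return noisy, clean

@torch.no_grad()
def evaluate(model, noisy, clean, loss_fn):
    model.eval()
    return loss_fn(model(noisy), clean).item()

def train(model, noisy, clean, loss_fn, epochs=10, batch_size=32, lr=1e-2):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    epoch_losses = []
    for _ in range(epochs):
        model.train()
        perm = torch.randperm(noisy.shape[0])
        running = 0.0
        for start in range(0, len(perm), batch_size):
            idx = perm[start:start + batch_size]
            optimizer.zero_grad()                          # 1. clear stale gradients
            loss = loss_fn(model(noisy[idx]), clean[idx])  # 2. forward pass and loss
            loss.backward()                                # 3. compute fresh gradients
            optimizer.step()                               # 4. update the weights
            running += loss.item() * len(idx)
        epoch_losses.append(running / len(perm))
    return epoch_losses

train_noisy, train_clean = make_split(256)
test_noisy, test_clean = make_split(64)
model = Denoiser()
loss_fn = nn.MSELoss()

epoch_losses = train(model, train_noisy, train_clean, loss_fn)
train_loss = evaluate(model, train_noisy, train_clean, loss_fn)
test_loss = evaluate(model, test_noisy, test_clean, loss_fn)
with torch.no_grad():
    out_min = model(test_noisy).min().item()

assert out_min < 0                          # output can go negative without a final ReLU
assert epoch_losses[-1] < epoch_losses[0]   # goal (a): training loss goes down
assert abs(test_loss - train_loss) < 0.05   # goal (b): small train/eval gap (assumed tolerance)
```

Both losses are measured in eval mode after training so the comparison is like for like. On the real notebook, two further cheap checks are to confirm that the model can overfit a single small batch and to log the per-epoch loss curve, since a curve that stalls or oscillates often points back to the gradient-handling bug.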