Debug MNIST denoiser training

Last updated: Jun 27, 2026

Quick Overview

This question tests how you debug a neural network training notebook across its data pipeline, model architecture, and optimization loop. A strong answer states assumptions, handles edge cases, explains trade-offs, and shows clearly how to validate each fix.

Company: Luma AI

Role: Machine Learning Engineer

Category: ML System Design

Difficulty: medium

Interview Round: Technical Screen

How would you debug and fix a Colab notebook that trains a denoising neural network on MNIST so that (a) the training loss steadily decreases and (b) the evaluation loss is close to the training loss? Specifically: (1) Data: detect and correct a train/test distribution mismatch so the test set covers all digits 0–9, and ensure identical normalization statistics are applied to training and test data; (2) Model: remove or replace an inappropriate final ReLU so the output range supports negative values typical of the denoised signal; (3) Optimization: add the missing optimizer.zero_grad() and put backward() and optimizer.step() in the correct order. Describe the sanity checks, assertions, metrics, and minimal code changes you would use to validate each fix, and show the key code snippets.

Jul 17, 2025, 12:00 AM
Debugging a Colab Denoising Network on MNIST

Goal: Make a Colab notebook that trains a denoising neural network on MNIST such that:

  • (a) the training loss steadily decreases, and
  • (b) the evaluation loss is close to the training loss.

You should identify and fix issues in three areas and describe how you validate each fix:

1) Data

  • Detect and correct a train/test distribution mismatch so the test set covers all digits 0–9.
  • Ensure the same normalization statistics (mean and std) computed from the training set are applied to both training and test data.
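
The two data checks could be sketched as follows. This assumes the notebook uses PyTorch, that labels are available as a 1-D integer tensor, and that images are float tensors scaled to [0, 1]. The helper names `assert_covers_all_digits`, `fit_normalizer`, and `normalize` are hypothetical, and the random tensors only stand in for the real notebook data.

```python
import torch

def assert_covers_all_digits(labels, name):
    """Fail loudly if any digit 0-9 is absent from a split; return per-class counts."""
    counts = torch.bincount(labels.long(), minlength=10)
    missing = (counts == 0).nonzero().flatten().tolist()
    assert not missing, f"{name} split is missing digits: {missing}"
    return counts

def fit_normalizer(train_images):
    """Compute mean and std from the TRAINING images only."""
    return train_images.mean().item(), train_images.std().item()

def normalize(images, mean, std):
    """Apply the given statistics; call with the same values for every split."""
    return (images - mean) / std

# Stand-in data (assumption: the real notebook loads MNIST instead).
torch.manual_seed(0)
train_labels = torch.arange(10).repeat(100)    # every digit present
bad_test_labels = torch.arange(5).repeat(20)   # digits 5-9 missing: the mismatch
train_images = torch.rand(1000, 1, 28, 28)
test_images = torch.rand(100, 1, 28, 28)

assert_covers_all_digits(train_labels, "train")
try:
    assert_covers_all_digits(bad_test_labels, "test")
except AssertionError as err:
    print(err)  # reports which digits the test split lacks

# Fit once on train, then reuse the same statistics for test.
mean, std = fit_normalizer(train_images)
train_x = normalize(train_images, mean, std)
test_x = normalize(test_images, mean, std)

# After normalization the training split should be roughly zero-mean, unit-std.
assert abs(train_x.mean().item()) < 1e-3
assert abs(train_x.std().item() - 1.0) < 1e-3
```

Comparing the per-class counts returned for the train and test splits is also a cheap metric to log, since a large difference in class proportions alone can explain a train/eval loss gap.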

2) Model

  • Remove or replace an inappropriate final ReLU so the output range supports negative values typical of the denoised signal.
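
A minimal sketch of the model fix, assuming a small fully connected denoiser in PyTorch. The class name `Denoiser` and its `final_relu` flag are hypothetical and exist only to contrast the buggy and fixed variants. With zero-mean normalized targets, many target pixels are negative, and a final ReLU can never produce them.

```python
import torch
from torch import nn

class Denoiser(nn.Module):
    """Tiny MLP denoiser; `final_relu=True` reproduces the bug for comparison."""

    def __init__(self, final_relu, hidden=64):
        super().__init__()
        layers = [
            nn.Flatten(),
            nn.Linear(28 * 28, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 28 * 28),  # linear output: unrestricted range
        ]
        if final_relu:
            layers.append(nn.ReLU())     # the bug: clamps every output to >= 0
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x).view(-1, 1, 28, 28)

torch.manual_seed(0)
x = torch.randn(32, 1, 28, 28)  # stand-in for normalized noisy inputs

buggy = Denoiser(final_relu=True)
fixed = Denoiser(final_relu=False)

with torch.no_grad():
    buggy_min = buggy(x).min().item()
    fixed_min = fixed(x).min().item()

# Sanity checks: the buggy head cannot go negative, the fixed head can.
assert buggy_min >= 0.0
assert fixed_min < 0.0
assert not isinstance(list(fixed.net.children())[-1], nn.ReLU)
```

A matching check on the data side is to assert that the normalized clean targets actually contain negative values, which confirms that the output range, and not something else, is what the model cannot reach.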

3) Optimization

  • Add the missing optimizer.zero_grad() and put backward() and optimizer.step() in the correct order.
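
The corrected step order can be sketched as below, assuming a PyTorch training loop with an MSE reconstruction loss. The function name `train_step`, the linear stand-in model, and the synthetic batch are illustrative assumptions. Overfitting one fixed batch is a common sanity check: if the loss does not fall there, the loop itself is suspect.

```python
import torch
from torch import nn

def train_step(model, optimizer, loss_fn, noisy, clean):
    """One optimization step in the correct order."""
    model.train()
    optimizer.zero_grad()                 # 1. clear gradients from the previous step
    loss = loss_fn(model(noisy), clean)   # 2. forward pass and loss
    loss.backward()                       # 3. compute fresh gradients
    optimizer.step()                      # 4. update parameters using those gradients
    return loss.item()

torch.manual_seed(0)
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 28 * 28))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# One fixed synthetic batch (assumption: stands in for a real MNIST batch).
clean = torch.randn(16, 28 * 28)
noisy = (clean + 0.3 * torch.randn_like(clean)).view(16, 1, 28, 28)

# Sanity check: repeated steps on the same batch should drive the loss down.
losses = [train_step(model, optimizer, loss_fn, noisy, clean) for _ in range(50)]
assert losses[-1] < losses[0]
```

Without `zero_grad()`, gradients accumulate across iterations, and calling `step()` before `backward()` applies stale or missing gradients; either mistake typically shows up as a training loss that stalls or diverges on this single-batch test.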

For each of the three sections, describe:

  • Sanity checks and assertions you would add.
  • Minimal code changes and key code snippets.
  • Metrics you would track to validate the fix.
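
For the metrics, one option is to log the training and evaluation loss side by side with the same loss function and to watch their relative gap. The sketch below assumes PyTorch and uses a small synthetic denoising task; the helper names `evaluate` and `make_split`, the 32-dimensional data, and the 25% gap threshold are all illustrative assumptions, not values from the prompt.

```python
import torch
from torch import nn

@torch.no_grad()
def evaluate(model, loss_fn, noisy, clean):
    """Held-out loss in eval mode, with no gradient tracking and no updates."""
    was_training = model.training
    model.eval()
    loss = loss_fn(model(noisy), clean).item()
    model.train(was_training)  # restore the previous mode
    return loss

def make_split(n, generator):
    """Synthetic (noisy, clean) pairs drawn from one shared distribution."""
    clean = torch.randn(n, 32, generator=generator)
    noisy = clean + 0.3 * torch.randn(n, 32, generator=generator)
    return noisy, clean

g = torch.Generator().manual_seed(0)
torch.manual_seed(0)  # fixes the model initialization
train_noisy, train_clean = make_split(2048, g)
test_noisy, test_clean = make_split(512, g)

model = nn.Linear(32, 32)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

history = []  # (train_loss, eval_loss) per full-batch epoch
for epoch in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(train_noisy), train_clean)
    loss.backward()
    optimizer.step()
    history.append((loss.item(), evaluate(model, loss_fn, test_noisy, test_clean)))

train_loss, eval_loss = history[-1]
gap = abs(eval_loss - train_loss) / train_loss

assert history[-1][0] < history[0][0]  # goal (a): training loss went down
assert gap < 0.25                      # goal (b): eval loss close to train loss
```

Plotting both columns of `history` against the epoch index gives the two curves the goal refers to; a gap that persists after the data and model fixes points back at a remaining train/test mismatch.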

Constraints & Assumptions

  • Preserve the scope, facts, inputs, and requested outputs from the prompt above.
  • If the prompt leaves a detail unspecified, state a reasonable assumption before relying on it.
  • Keep the answer interview-ready: concise enough to present, but concrete enough to implement or evaluate.

Clarifying Questions to Ask

  • Clarify users, core use cases, read/write patterns, scale, latency, availability, and data retention.
  • State explicit assumptions before making sizing or architecture decisions.
  • Prioritize the functional path first, then address reliability, security, observability, and rollout.

What a Strong Answer Covers

  • A scoped requirements summary with concrete non-goals and success metrics.
  • ML-specific data, model, evaluation, serving, and monitoring choices.
  • Reasoned trade-offs among simple and scalable designs, including bottlenecks and failure modes.
  • A validation, monitoring, migration, and launch plan appropriate for the risk level.

Follow-up Questions

  • What breaks first at 10x traffic or data volume?
  • How would you degrade gracefully during dependency failures?
  • What metrics and alerts would prove the design is healthy after launch?
