PracHub

Explain vanishing gradients and activations

Last updated: Mar 29, 2026

Quick Overview

This question tests your understanding of the vanishing gradient problem: how backpropagation propagates gradients at a high level, how the activation function affects gradient magnitude, and how these considerations shape neural network optimization and architecture choices.

  • easy
  • Amazon
  • Machine Learning
  • Machine Learning Engineer


Company: Amazon

Role: Machine Learning Engineer

Category: Machine Learning

Difficulty: easy

Interview Round: Technical Screen



Dec 8, 2025, 8:00 PM

Explain the vanishing gradient problem in deep neural networks.

In your answer:

  • Describe how backpropagation works at a high level and why gradients can vanish in deep networks.
  • Show how the choice of activation function (e.g., sigmoid, tanh, ReLU) affects gradient magnitude.
  • Discuss common techniques (including activation choices) to mitigate vanishing gradients.
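
One way to make the activation-function point concrete is a small numerical sketch. During backpropagation, the gradient reaching an early layer is a product of per-layer terms, and each term includes the local derivative of the activation. The snippet below is only an illustration under simplifying assumptions that are not part of the question: a scalar chain with unit weights and the same pre-activation value `z` at every layer. The function names (`sigmoid_grad`, `tanh_grad`, `relu_grad`, `chain_factor`) are hypothetical helpers chosen for this example.

```python
import math


def sigmoid_grad(z):
    """Derivative of the sigmoid; its maximum is 0.25, reached at z = 0."""
    s = 1.0 / (1.0 + math.exp(-z))
    return s * (1.0 - s)


def tanh_grad(z):
    """Derivative of tanh; its maximum is 1.0, reached at z = 0."""
    return 1.0 - math.tanh(z) ** 2


def relu_grad(z):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if z > 0 else 0.0


def chain_factor(grad_fn, z, depth):
    """Activation part of the chain-rule product for `depth` stacked layers.

    Assumption (illustrative only): a scalar chain with unit weights and
    the same pre-activation z at every layer, so the product reduces to
    the local derivative raised to the power `depth`.
    """
    return grad_fn(z) ** depth


if __name__ == "__main__":
    z, depth = 1.0, 10
    for name, fn in [("sigmoid", sigmoid_grad),
                     ("tanh", tanh_grad),
                     ("relu", relu_grad)]:
        print(f"{name:8s} local grad = {fn(z):.4f}, "
              f"factor after {depth} layers = {chain_factor(fn, z, depth):.3e}")
```

Under these assumptions the sigmoid factor shrinks fastest with depth, tanh shrinks more slowly, and ReLU leaves the factor at 1 for positive inputs. This is the usual argument for preferring ReLU-style activations in deep networks. In a real network the weight matrices also enter the product, so initialization, normalization, and residual connections matter as well.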