ML Fundamentals
Answer the following conceptual questions:
- **Learning rate vs. training stability**: Why can training metrics (loss/accuracy) fluctuate or oscillate when the learning rate is too large? What happens when it is too small?
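As a warm-up for this question, the effect of step size can be seen on a toy problem. The sketch below (an assumption for illustration, not part of the question) runs plain gradient descent on the one-dimensional quadratic loss f(w) = w², where the update is w ← w − lr·2w:

```python
def descend(lr, steps=10, w0=1.0):
    """Run gradient descent on f(w) = w**2 and return the loss at each step."""
    w = w0
    trace = []
    for _ in range(steps):
        w -= lr * 2 * w          # gradient of w**2 is 2*w
        trace.append(w * w)
    return trace

# Too large: each update overshoots the minimum, the iterate flips sign
# and grows in magnitude, so the loss oscillates upward (diverges).
print(descend(lr=1.1)[:3])

# Too small: stable, but progress toward the minimum is very slow.
print(descend(lr=0.01)[:3])

# Moderate: rapid, monotone decrease toward zero.
print(descend(lr=0.4)[:3])
```

On this quadratic the update is w ← (1 − 2·lr)·w, so any lr above 0.5 makes the iterate alternate sign, and above 1.0 it diverges outright; the same overshooting mechanism underlies oscillating loss curves in real training.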
- **Vanishing gradients in fully connected networks**: In a deep fully connected network trained with backpropagation, are vanishing gradients more likely to affect layers closer to the **input** or closer to the **output**? Explain why, and name common mitigations.
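The mechanism behind this question can be made concrete with a minimal sketch, assuming a depth-10 chain of scalar sigmoid units with unit weights (a deliberate simplification of a fully connected network): during backpropagation, each layer multiplies the incoming gradient by the sigmoid's derivative σ'(z) = σ(z)(1 − σ(z)) ≤ 0.25, so the gradient reaching the earliest layers is a product of many small factors.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def backprop_gradient_norms(depth, x=0.5):
    """Forward x through `depth` sigmoid units, then return the gradient
    magnitude reaching each layer during the backward pass (chain rule).
    The last entry corresponds to the layer nearest the input."""
    acts = [x]
    for _ in range(depth):
        acts.append(sigmoid(acts[-1]))
    grad = 1.0                        # dL/d(output), taken as 1
    norms = []
    for a_prev in reversed(acts[:-1]):
        s = sigmoid(a_prev)
        grad *= s * (1.0 - s)         # multiply by sigma'(z) <= 0.25
        norms.append(abs(grad))
    return norms

norms = backprop_gradient_norms(depth=10)
print(f"gradient at layer nearest the output: {norms[0]:.3e}")
print(f"gradient at layer nearest the input:  {norms[-1]:.3e}")
```

Because each backward step multiplies by a factor of at most 0.25, the gradient nearest the input is orders of magnitude smaller than the one nearest the output, which is the asymmetry the question asks about.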