This question evaluates a candidate's understanding of weight initialization in deep neural networks, assessing competencies in training dynamics such as symmetry breaking and the mitigation of vanishing or exploding activations and gradients.
Explain why weight initialization matters in deep neural networks.
Then describe common initialization methods, such as random normal/uniform, Xavier/Glorot, and He initialization.
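To ground the discussion, here is a minimal NumPy sketch (not part of the original question) contrasting a naive variance-1/fan_in Gaussian initialization with He initialization in a deep ReLU network. The function names, layer width, and depth are illustrative choices; the point is that the naive scheme lets the activation scale shrink layer by layer, while He's extra factor of 2 compensates for ReLU zeroing roughly half the units.

```python
import numpy as np

rng = np.random.default_rng(0)

def naive_normal(fan_in, fan_out):
    # Gaussian with variance 1/fan_in: keeps pre-activation variance
    # constant for linear layers, but ignores the ReLU nonlinearity.
    return rng.normal(0.0, np.sqrt(1.0 / fan_in), size=(fan_in, fan_out))

def xavier_uniform(fan_in, fan_out):
    # Xavier/Glorot uniform: limit = sqrt(6 / (fan_in + fan_out)),
    # balances forward and backward variance for tanh/sigmoid layers.
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

def he_normal(fan_in, fan_out):
    # He normal: std = sqrt(2 / fan_in); the factor 2 compensates for
    # ReLU discarding (on average) half of the signal's variance.
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))

def deep_relu_std(init_fn, width=256, depth=20, batch=64):
    # Push a standard-normal batch through `depth` ReLU layers and
    # report the final activation standard deviation.
    x = rng.standard_normal((batch, width))
    for _ in range(depth):
        x = np.maximum(0.0, x @ init_fn(width, width))
    return x.std()

naive_std = deep_relu_std(naive_normal)  # shrinks ~2x per layer
he_std = deep_relu_std(he_normal)        # stays on the order of 1
print(f"naive: {naive_std:.2e}, He: {he_std:.2e}")
```

With 20 layers, the naive scheme's activations have shrunk by roughly 2^10 in standard deviation, a concrete instance of vanishing activations, whereas He initialization keeps them at a usable scale.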