This question evaluates understanding of neural network building blocks (layers and activation functions), comparative properties of activation/gating functions, principled selection of loss functions for different problem settings, and the internal mechanics and hyperparameters of the Adam optimizer.
Answer the following ML fundamentals questions:
For each of the core building blocks (layers and activation functions), explain what it does and why it is needed:
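As a concrete reference point for the building-blocks discussion, here is a minimal sketch of a dense (fully connected) layer's forward pass in plain Python; the shapes and values are illustrative assumptions, not part of the question.

```python
def dense(x, W, b):
    """Fully connected layer: y_j = sum_i x_i * W[i][j] + b[j]."""
    return [sum(xi * wij for xi, wij in zip(x, col)) + bj
            for col, bj in zip(zip(*W), b)]

x = [1.0, 2.0]
W = [[0.5, -1.0],   # 2 inputs -> 2 outputs (illustrative weights)
     [0.25, 0.0]]
b = [0.0, 1.0]
print(dense(x, W, b))  # → [1.0, 0.0]
```

An activation function would then be applied elementwise to this output; without one, stacked dense layers collapse into a single linear map.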
Activations / gating: compare the properties of common activation and gating functions (range, saturation, gradient behavior), and explain when you would prefer one over another.
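To make the comparison concrete, the sketch below evaluates a few standard activations at sample points, which highlights their ranges and saturation behavior (sigmoid/tanh flatten for large |x|, ReLU is zero for negatives, GELU is a smooth gate). The GELU here uses the common tanh approximation; the sample points are illustrative.

```python
import math

def sigmoid(x):
    # range (0, 1); saturates for large |x|
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    # range [0, inf); exactly zero for x < 0 ("dead" units possible)
    return max(0.0, x)

def gelu(x):
    # smooth gating of x by an approximate Gaussian CDF (tanh approximation)
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi)
                                      * (x + 0.044715 * x ** 3)))

for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(f"x={x:+.1f}  sigmoid={sigmoid(x):+.3f}  "
          f"tanh={math.tanh(x):+.3f}  relu={relu(x):+.3f}  gelu={gelu(x):+.3f}")
```

Printing a small grid like this is a quick way to see why zero-centered outputs (tanh) and non-saturating positive regions (ReLU, GELU) matter for gradient flow.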
Loss functions: given different problem settings (e.g., regression vs. classification), which loss would you choose and why?
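A minimal numeric sketch of the two most common choices can anchor this discussion: mean squared error for regression targets, and binary cross-entropy for probabilistic classification. The inputs below are illustrative assumptions.

```python
import math

def mse(y_true, y_pred):
    """Mean squared error: standard regression loss."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Binary cross-entropy on predicted probabilities.
    Probabilities are clamped to avoid log(0)."""
    return -sum(t * math.log(max(p, eps)) + (1 - t) * math.log(max(1 - p, eps))
                for t, p in zip(y_true, p_pred)) / len(y_true)

print(mse([1.0, 2.0], [1.5, 1.5]))               # → 0.25
print(binary_cross_entropy([1, 0], [0.9, 0.2]))  # small: both predictions confident and correct
```

The key contrast to draw out: cross-entropy penalizes confident wrong probabilities much more steeply than MSE would, which is why it pairs with sigmoid/softmax outputs in classification.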
Explain how Adam works: its internal mechanics (first- and second-moment estimates, bias correction, the adaptive update) and its key hyperparameters (learning rate, beta1, beta2, epsilon).
Be explicit about trade-offs, common failure modes, and practical defaults.