Answer the following ML fundamentals questions:
1) Neural network building blocks
- What is a "layer" in a neural network, and what does it compute?
- Why do we need activation functions?
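For reference when answering: a dense layer computes an affine transform followed by an elementwise nonlinearity. A minimal NumPy sketch (illustrative only, not tied to any framework):

```python
import numpy as np

rng = np.random.default_rng(0)

# A dense ("fully connected") layer computes y = activation(W @ x + b):
# a learned affine map followed by an elementwise nonlinearity.
W = rng.normal(size=(3, 4))   # weights: 4 inputs -> 3 outputs
b = np.zeros(3)               # biases
x = rng.normal(size=4)        # input vector

pre_activation = W @ x + b          # the affine part
y = np.maximum(pre_activation, 0)   # ReLU nonlinearity

# Without the nonlinearity, stacked layers collapse to one affine map:
# W2 @ (W1 @ x + b1) + b2 is still just W' @ x + b'.
```

That last comment is the core of the "why activation functions" answer: without them, depth adds no expressive power.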
2) Activation functions deep dive
For each of the following, explain:
- its mathematical form (high level is fine)
- typical use cases
- pros/cons and pitfalls (e.g., saturation, dead neurons, gradient flow)

Activations / gating:
- Sigmoid
- ReLU
- SiLU (a.k.a. Swish)
- SwiGLU (or GLU-style gated activations)
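For reference, the mathematical forms of the four activations above can be sketched directly in NumPy. The `swiglu` parameters `W, V, b, c` are this sketch's own illustrative names, not a fixed API:

```python
import numpy as np

def sigmoid(x):
    # 1 / (1 + e^-x): squashes to (0, 1); saturates for large |x|,
    # which is where vanishing gradients come from.
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # max(0, x): cheap and non-saturating for x > 0, but a unit whose
    # pre-activation is always negative outputs 0 and gets zero
    # gradient ("dead neuron").
    return np.maximum(x, 0.0)

def silu(x):
    # SiLU / Swish: x * sigmoid(x); smooth, non-monotonic near zero.
    return x * sigmoid(x)

def swiglu(x, W, V, b, c):
    # GLU-style gating: SiLU(xW + b) elementwise-gates a second linear
    # projection xV + c (two projections instead of one).
    return silu(x @ W + b) * (x @ V + c)
```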
3) Loss functions
Given different problem settings, which loss would you choose and why?
- Binary classification (possibly imbalanced)
- Multi-class classification
- Regression with outliers
- Learning to rank / retrieval (optional)
- Probabilistic forecasting (optional)
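For reference, the usual answers to the first three settings (weighted binary cross-entropy, softmax cross-entropy, Huber loss) can be sketched as:

```python
import numpy as np

def bce(p, y, w_pos=1.0):
    # Binary cross-entropy on predicted probabilities p for labels y.
    # w_pos up-weights the positive class, one common lever for
    # imbalanced data (focal loss is another).
    return -np.mean(w_pos * y * np.log(p) + (1 - y) * np.log(1 - p))

def cross_entropy(logits, target):
    # Multi-class: log-softmax + negative log-likelihood of true class.
    z = logits - logits.max()               # numerical stability
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[target]

def huber(pred, y, delta=1.0):
    # Quadratic near zero, linear in the tails: large residuals get a
    # bounded gradient, so outliers dominate less than with MSE.
    err = np.abs(pred - y)
    return np.mean(np.where(err <= delta,
                            0.5 * err**2,
                            delta * (err - 0.5 * delta)))
```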
4) Optimizer
Explain how Adam works:
- the moving averages it keeps
- bias correction
- how the parameter update is computed
- what the key hyperparameters do (learning rate, β1, β2, ϵ, weight decay)
Be explicit about trade-offs, common failure modes, and practical defaults.
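For reference, one Adam step with the standard defaults (lr=1e-3, β1=0.9, β2=0.999, ϵ=1e-8) can be sketched as follows; weight decay is applied in the decoupled, AdamW-style form here, which is an assumption of this sketch:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3,
              beta1=0.9, beta2=0.999, eps=1e-8, weight_decay=0.0):
    """One Adam update at step t (1-indexed), decoupled weight decay."""
    # Exponential moving averages of the gradient and its square.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad**2
    # Bias correction: m and v start at zero, so early averages are
    # biased toward zero; dividing by (1 - beta^t) undoes that.
    m_hat = m / (1 - beta1**t)
    v_hat = v / (1 - beta2**t)
    # Per-parameter step, scaled by the RMS of recent gradients;
    # eps guards against division by zero for tiny v_hat.
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    if weight_decay:
        theta = theta - lr * weight_decay * theta
    return theta, m, v

# Toy usage: minimize f(x) = x^2 starting from x = 5.
theta = np.array([5.0]); m = np.zeros(1); v = np.zeros(1)
for t in range(1, 2001):
    grad = 2 * theta
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.1)
```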