Answer the following (a minimal illustrative sketch for each part appears after the list):
(a) Gradient-boosted decision trees: How does maximum tree depth affect bias/variance, overfitting risk, and training/inference cost? How would you choose it in practice?
(b) Neural networks: Compare L1 vs L2 regularization and weight decay — how do they modify the objective, gradients, and learned parameters?
(c) Dropout: After applying dropout during training, what should happen at inference time, and why?
(d) Training vs inference: Define and contrast these phases for ML models, including data flows, randomness, and performance considerations.
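For (a), a sketch of sweeping maximum tree depth and comparing train vs validation accuracy. It assumes scikit-learn's GradientBoostingClassifier; the synthetic dataset and hyperparameter values are illustrative, not part of the question:

```python
# Sketch for (a): sweep max_depth and watch train/validation scores diverge.
# Deeper trees lower bias but raise variance and per-tree train/inference cost.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.25, random_state=0)

for depth in (1, 2, 4, 8):
    gbm = GradientBoostingClassifier(max_depth=depth, n_estimators=200,
                                     learning_rate=0.1, random_state=0)
    gbm.fit(X_tr, y_tr)
    print(f"depth={depth}  train={gbm.score(X_tr, y_tr):.3f}  "
          f"valid={gbm.score(X_va, y_va):.3f}")
```

In practice the depth is chosen exactly this way: by validation (or cross-validation) over a small grid, picking the smallest depth that captures the needed feature interactions.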
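For (b), a sketch of how each penalty changes the objective and the gradient step, in plain NumPy. `data_grad` stands in for the gradient of an unregularized loss; `lam` and `lr` are placeholder values:

```python
# Sketch for (b): one gradient step under L1 vs L2 penalties (NumPy only).
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=5)
data_grad = rng.normal(size=5)   # stand-in for dL/dw of the data loss
lam, lr = 0.01, 0.1

# L2: objective L + (lam/2)*||w||^2 adds lam*w to the gradient, shrinking
# every weight proportionally. Applying the shrinkage directly in the update
# (rather than through the loss) is "weight decay"; for plain SGD the two
# coincide, but they differ under adaptive optimizers such as Adam.
w_l2 = w - lr * (data_grad + lam * w)

# L1: objective L + lam*||w||_1 adds lam*sign(w), a constant-magnitude pull
# toward zero that tends to drive small weights exactly to zero (sparsity).
w_l1 = w - lr * (data_grad + lam * np.sign(w))

print("L2 step:", w_l2)
print("L1 step:", w_l1)
```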
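For (c), a sketch of inverted dropout, the common formulation in which training-time scaling by 1/(1-p) keeps the expected activation unchanged, so inference needs no mask and no rescaling. The drop probability `p` is illustrative:

```python
# Sketch for (c): inverted dropout. Training masks units and scales the
# survivors by 1/(1-p); inference is the identity because expectations match.
import numpy as np

def dropout(x, p, training, rng):
    if not training:
        return x                       # inference: all units, no scaling
    mask = rng.random(x.shape) >= p    # keep each unit with prob. 1-p
    return x * mask / (1.0 - p)        # rescale so E[output] == x

rng = np.random.default_rng(0)
x = np.ones(8)
print(dropout(x, p=0.5, training=True, rng=rng))   # masked, survivors doubled
print(dropout(x, p=0.5, training=False, rng=rng))  # unchanged
```

In the older non-inverted formulation, inference instead multiplies activations by (1-p) to match the training-time expectation.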
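For (d), a sketch contrasting the two phases on a linear model in NumPy; the data, learning rate, and step count are illustrative:

```python
# Sketch for (d): training consumes labeled (X, y) pairs, uses randomness
# (here: random initialization) and repeated gradient updates; inference is
# a single deterministic forward pass on new inputs with frozen parameters.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=256)

w = rng.normal(size=3)                 # training-time randomness: init
for _ in range(500):                   # training loop over labeled data
    grad = 2 * X.T @ (X @ w - y) / len(y)
    w -= 0.1 * grad                    # parameters change

x_new = np.array([0.3, -1.2, 0.7])
print(x_new @ w)                       # inference: fixed w, forward only
```

Performance pressures differ accordingly: training is throughput-bound over many passes of the dataset, while inference is typically latency- and memory-bound per request.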