You are given a binary classification dataset {(x_i, y_i)}_{i=1}^m with labels y_i ∈ {0, 1}. The model uses the sigmoid function σ(z) = 1/(1 + e^{-z}) and a linear score z_i = w^T x_i + b.
Answer the following:
(a) Write the exact maximum-likelihood optimization objective (state clearly whether it is a maximization or a minimization) both without and with L2 regularization of strength λ ≥ 0.
(b) Write the explicit negative log-likelihood (cross-entropy) L(w, b).
(c) Derive the gradients ∂L/∂w and ∂L/∂b.
(d) Clarify the distinction among "objective", "loss", and "regularized objective" in this context.
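As a sanity check for parts (b) and (c): the cross-entropy L(w, b) and its gradients have well-known closed forms, and a derived gradient can always be verified against finite differences. The sketch below (NumPy; the names `nll` and `grads`, and the 1/m averaging plus λ/2 scaling convention, are our own choices, not fixed by the problem statement) implements the regularized objective and the gradients ∂L/∂w = (1/m)·Xᵀ(σ(z) − y) + λw and ∂L/∂b = mean(σ(z) − y), then checks them numerically.

```python
import numpy as np

def nll(w, b, X, y, lam=0.0):
    """Negative log-likelihood (cross-entropy) with optional L2 penalty.

    L = -(1/m) * sum_i [ y_i log σ(z_i) + (1 - y_i) log(1 - σ(z_i)) ]
        + (λ/2) ||w||²
    (1/m averaging and λ/2 scaling are one common convention, assumed here;
    the bias b is left unregularized, also a convention.)
    """
    z = X @ w + b
    # log σ(z) = -log(1 + e^{-z}); logaddexp avoids overflow for large |z|
    log_sig = -np.logaddexp(0.0, -z)    # log σ(z_i)
    log_1m = -np.logaddexp(0.0, z)      # log(1 - σ(z_i))
    m = len(y)
    return -(y @ log_sig + (1 - y) @ log_1m) / m + 0.5 * lam * np.dot(w, w)

def grads(w, b, X, y, lam=0.0):
    """Closed-form gradients of the objective above:
    ∂L/∂w = (1/m) Xᵀ (σ(z) - y) + λ w,   ∂L/∂b = mean(σ(z) - y)."""
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # σ(z_i)
    r = p - y                                # residuals σ(z_i) - y_i
    m = len(y)
    return X.T @ r / m + lam * w, r.mean()

# Finite-difference check on random data
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = (rng.random(20) < 0.5).astype(float)
w, b, lam = rng.normal(size=3), 0.1, 0.5
gw, gb = grads(w, b, X, y, lam)
eps = 1e-6
for j in range(3):
    e = np.zeros(3); e[j] = eps
    num = (nll(w + e, b, X, y, lam) - nll(w - e, b, X, y, lam)) / (2 * eps)
    assert abs(num - gw[j]) < 1e-5
num_b = (nll(w, b + eps, X, y, lam) - nll(w, b - eps, X, y, lam)) / (2 * eps)
assert abs(num_b - gb) < 1e-5
```

The agreement with central differences confirms the standard result that the gradient residual is simply σ(z_i) − y_i, the same form that underlies the answer to (c).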