Tiny Neural Network (From First Principles): Binary Classification
Context
You will implement and analyze a minimal neural network for binary classification with one hidden layer. Assume a dataset with features X ∈ R^{N×D} and labels y ∈ {0,1}^N. The network has:
- Hidden layer: H units with an activation (ReLU or tanh).
- Output layer: 1 unit with a sigmoid for P(y=1|x).
Use vectorized NumPy (or similar) without autograd.
Tasks
- Forward pass (see the sketch just below)
  - Define shapes: W1 ∈ R^{D×H}, b1 ∈ R^{H}, W2 ∈ R^{H×1}, b2 ∈ R^{1}.
  - Compute z1 = X W1 + b1, a1 = f(z1), z2 = a1 W2 + b2, p = σ(z2).
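A minimal forward-pass sketch in vectorized NumPy, assuming a ReLU hidden layer; the function and variable names are illustrative choices for this write-up, not requirements of the assignment. The sigmoid is split into two branches so exp never receives a large positive argument.

```python
import numpy as np

def stable_sigmoid(z):
    """Sigmoid without overflow: only exponentiate non-positive arguments."""
    out = np.empty_like(z, dtype=float)
    pos = z >= 0
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
    ez = np.exp(z[~pos])
    out[~pos] = ez / (1.0 + ez)
    return out

def forward(X, W1, b1, W2, b2):
    """Forward pass: X (N, D) -> probabilities p (N, 1), plus a cache for backprop."""
    z1 = X @ W1 + b1            # (N, H) pre-activations
    a1 = np.maximum(0.0, z1)    # ReLU; substitute np.tanh(z1) for a tanh hidden layer
    z2 = a1 @ W2 + b2           # (N, 1) logits
    p = stable_sigmoid(z2)      # (N, 1) estimates of P(y=1|x)
    return p, (X, z1, a1, z2)
```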
- Loss (numerically stable; see the sketch below)
  - Implement binary cross-entropy, computed from the logits z2 rather than from p. Use a stable formulation (e.g., softplus log(1+exp(x)) via log1p, or log-sum-exp) to avoid overflow/underflow.
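One possible stable formulation, written directly in terms of the logits using the identity BCE(z, y) = softplus(z) − y·z, with softplus itself computed via log1p so large |z| cannot overflow. The names follow the forward-pass sketch above and are assumptions of this write-up.

```python
import numpy as np

def softplus(z):
    """log(1 + exp(z)) computed as max(z, 0) + log1p(exp(-|z|)) to avoid overflow."""
    return np.maximum(z, 0.0) + np.log1p(np.exp(-np.abs(z)))

def bce_from_logits(z2, y):
    """Mean binary cross-entropy from logits z2 (N, 1) and labels y (N, 1) in {0, 1}.
    Equivalent to -mean(y*log(p) + (1-y)*log(1-p)) with p = sigmoid(z2)."""
    return float(np.mean(softplus(z2) - y * z2))
```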
- Backward pass (analytic gradients; no autograd; see the sketch below)
  - Derive and implement gradients of the loss with respect to W1, b1, W2, b2.
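A sketch of the analytic gradients under the same assumptions (ReLU hidden layer, mean BCE over the batch). The useful simplification is that sigmoid followed by cross-entropy gives ∂L/∂z2 = (p − y)/N, after which each remaining gradient is one matrix product.

```python
import numpy as np

def backward(p, y, cache, W2):
    """Gradients of the mean BCE loss w.r.t. W1, b1, W2, b2.
    p, y: (N, 1); cache = (X, z1, a1, z2) from forward(); ReLU hidden layer assumed."""
    X, z1, a1, z2 = cache
    N = X.shape[0]

    dz2 = (p - y) / N              # (N, 1): combined sigmoid + BCE gradient
    dW2 = a1.T @ dz2               # (H, 1)
    db2 = dz2.sum(axis=0)          # (1,)

    da1 = dz2 @ W2.T               # (N, H)
    dz1 = da1 * (z1 > 0)           # ReLU derivative; use (1 - a1**2) for tanh
    dW1 = X.T @ dz1                # (D, H)
    db1 = dz1.sum(axis=0)          # (H,)

    return {"W1": dW1, "b1": db1, "W2": dW2, "b2": db2}
```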
- Optimization (see the sketch below)
  - Implement gradient descent updates for all parameters.
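A plain full-batch gradient-descent loop, reusing the helpers sketched above (forward, bce_from_logits, backward, all of which are naming assumptions of this write-up); the learning rate and step count are placeholders.

```python
def train(X, y, params, lr=0.1, steps=1000):
    """Full-batch gradient descent on a params dict with keys W1, b1, W2, b2."""
    for step in range(steps):
        p, cache = forward(X, params["W1"], params["b1"], params["W2"], params["b2"])
        loss = bce_from_logits(cache[3], y)        # cache[3] holds the logits z2
        grads = backward(p, y, cache, params["W2"])
        for name in params:                        # theta <- theta - lr * dL/dtheta
            params[name] -= lr * grads[name]
        if step % 100 == 0:
            print(f"step {step:4d}  loss {loss:.4f}")
    return params
```

Mini-batch SGD only changes which rows of X and y enter each step, which is where the batch-size and learning-rate discussion below comes in.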
- Gradient checking (see the sketch below)
  - Verify gradients by central finite differences, perturbing one parameter entry at a time: g_num ≈ (L(θ+ε) − L(θ−ε)) / (2ε). Report relative errors against the analytic gradients.
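A central-difference checker sketch. It perturbs each parameter entry in place, recomputes the loss through a zero-argument closure, and reports the usual relative error ||g_num − g_ana|| / (||g_num|| + ||g_ana||); with float64 and ε ≈ 1e-5, errors around 1e-7 or smaller are commonly taken as a pass, though that threshold is a rule of thumb rather than part of the assignment.

```python
import numpy as np

def grad_check(loss_fn, params, grads, eps=1e-5):
    """Compare analytic grads (dict) to central finite differences, parameter by parameter.
    loss_fn() must recompute the scalar loss from the current (mutated) params."""
    for name, theta in params.items():
        num = np.zeros_like(theta)
        it = np.nditer(theta, flags=["multi_index"])
        for _ in it:
            idx = it.multi_index
            old = theta[idx]
            theta[idx] = old + eps
            loss_plus = loss_fn()
            theta[idx] = old - eps
            loss_minus = loss_fn()
            theta[idx] = old                       # restore the entry
            num[idx] = (loss_plus - loss_minus) / (2 * eps)
        rel = np.linalg.norm(num - grads[name]) / (
            np.linalg.norm(num) + np.linalg.norm(grads[name]) + 1e-12)
        print(f"{name}: relative error {rel:.3e}")

# Example closure (hypothetical wiring; keep N, D, H small so the loop stays cheap):
# loss_fn = lambda: bce_from_logits(
#     forward(X, params["W1"], params["b1"], params["W2"], params["b2"])[1][3], y)
```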
- Discussion
  - Numerical stability (sigmoid/logistic loss, softplus/log-sum-exp, log1p, expm1, clipping).
  - Initialization (He vs. Xavier; biases; see the sketch after this list).
  - Activation choices (ReLU, tanh, sigmoid; pros/cons).
  - Batch size and gradient variance; learning-rate scaling.
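For the initialization point in the discussion, a sketch of the two schemes named above; the scale factors (variance 2/fan_in for He, 1/fan_in in the simpler Xavier/Glorot form) are the standard ones, and zero biases are a common default rather than a requirement.

```python
import numpy as np

def init_params(D, H, scheme="he", seed=0):
    """He initialization pairs naturally with ReLU, Xavier/Glorot with tanh."""
    rng = np.random.default_rng(seed)
    if scheme == "he":
        std1 = np.sqrt(2.0 / D)        # Var[W1] = 2 / fan_in
    else:                              # "xavier"
        std1 = np.sqrt(1.0 / D)        # Var[W1] = 1 / fan_in (Glorot's simpler form)
    return {
        "W1": rng.normal(0.0, std1, size=(D, H)),
        "b1": np.zeros(H),
        "W2": rng.normal(0.0, np.sqrt(1.0 / H), size=(H, 1)),
        "b2": np.zeros(1),
    }
```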
Deliverables
- Clean, vectorized code for the forward pass, loss, backward pass, training loop, and gradient check.
- Short written derivations and notes on the discussion topics above.