Prompt
You are given a dataset of $n$ 1D samples $\{(x_i, y_i)\}_{i=1}^{n}$, where $x_i$ and $y_i$ are real numbers.
We want to fit a linear model
$$\hat{y} = ax + b$$
by minimizing the mean squared error (MSE).
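For concreteness, the standard MSE over the dataset (one common convention; some authors add a factor of $\frac{1}{2}$ so the 2 cancels in the gradients) is

$$L(a, b) = \frac{1}{n} \sum_{i=1}^{n} \bigl(\hat{y}_i - y_i\bigr)^2 = \frac{1}{n} \sum_{i=1}^{n} \bigl(a x_i + b - y_i\bigr)^2.$$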
Tasks
- Define the loss function for this problem (e.g., MSE over the dataset).
- Using the chain rule / backprop-style reasoning, derive the gradients $\partial L / \partial a$ and $\partial L / \partial b$ (a reference derivation follows this list).
- Describe (and optionally write pseudocode for) how to train $a$ and $b$ using SGD (or mini-batch SGD), covering (see the sketch after this list):
  - parameter initialization
  - per-step gradient computation
  - update rule
  - learning rate choice / scheduling
  - stopping criteria
- Discuss common pitfalls and edge cases (e.g., scaling, divergence, choosing the batch size).
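For reference, under the $\frac{1}{n}$ MSE convention above, writing $r_i = a x_i + b - y_i$ for the $i$-th residual, the chain rule gives

$$\frac{\partial L}{\partial a} = \frac{2}{n} \sum_{i=1}^{n} r_i \, x_i, \qquad \frac{\partial L}{\partial b} = \frac{2}{n} \sum_{i=1}^{n} r_i.$$

(If the loss carries the $\frac{1}{2}$ factor, the 2s cancel.)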
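Below is a minimal mini-batch SGD sketch in Python/NumPy implementing the gradients above. It is one possible concrete answer, not the prescribed one; the helper name `fit_sgd` and the default learning rate, batch size, epoch budget, and stopping tolerance are illustrative assumptions.

```python
import numpy as np

def fit_sgd(x, y, lr=0.01, batch_size=32, max_epochs=1000, tol=1e-8, seed=0):
    """Fit y ~ a*x + b by mini-batch SGD on the MSE loss.

    x, y: 1D NumPy arrays of equal length. Hyperparameter defaults
    are illustrative choices, not prescribed by the prompt.
    """
    rng = np.random.default_rng(seed)
    a, b = 0.0, 0.0                          # simple initialization; zeros suffice in 1D
    n = len(x)
    prev_loss = np.inf
    for epoch in range(max_epochs):
        perm = rng.permutation(n)            # reshuffle samples each epoch
        for start in range(0, n, batch_size):
            idx = perm[start:start + batch_size]
            xb, yb = x[idx], y[idx]
            r = a * xb + b - yb              # residuals on the mini-batch
            grad_a = 2.0 * np.mean(r * xb)   # mini-batch estimate of dL/da
            grad_b = 2.0 * np.mean(r)        # mini-batch estimate of dL/db
            a -= lr * grad_a                 # plain SGD update rule
            b -= lr * grad_b
        loss = np.mean((a * x + b - y) ** 2) # full-data loss for the stopping check
        if abs(prev_loss - loss) < tol:      # stop when the loss plateaus
            break
        prev_loss = loss
    return a, b
```

A fixed learning rate is used here for brevity; decaying it (e.g., dividing by the epoch count) is a common refinement when the loss oscillates near the minimum.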
Output / Expected Result
After training, return the learned parameters $a$ and $b$ that approximately minimize the chosen loss on the provided data.
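A quick usage sketch, reusing the `fit_sgd` helper above on synthetic data (the true values $a = 2$, $b = -1$ and the noise scale are made up for illustration):

```python
rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=200)
y = 2.0 * x - 1.0 + 0.1 * rng.normal(size=200)  # noisy samples of y = 2x - 1

a, b = fit_sgd(x, y)
print(f"a = {a:.3f}, b = {b:.3f}")  # expect values near 2 and -1
```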