Implement PyTorch training loop | Amazon Coding Question

Company: Amazon

Role: Machine Learning Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Onsite

### Implement a basic PyTorch training loop

You are given a PyTorch neural network model, a DataLoader that yields `(inputs, targets)` batches, an optimizer, and a loss function. Write a function `train(model, train_loader, optimizer, loss_fn, device, num_epochs)` that:

- Moves the model and input batches to the specified device (CPU or GPU).
- Runs for `num_epochs` epochs.
- For each batch, performs a forward pass, computes the loss, runs backpropagation, and updates the model parameters.
- Properly zeros gradients at the right time.
- Optionally prints or returns the average training loss per epoch.

Clearly show the order of operations inside the training loop (zeroing gradients, forward pass, loss computation, backward pass, optimizer step).
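The order of operations described above can be sketched in real PyTorch as follows. This is a minimal sketch rather than an official solution. It assumes that `loss_fn` returns a scalar mean loss per batch, that every batch is an `(inputs, targets)` pair of tensors, and that returning a list of per-epoch average losses is an acceptable form of the optional reporting. The call to `model.train()` is an addition the prompt does not ask for.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def train(model, train_loader, optimizer, loss_fn, device, num_epochs):
    """Sketch of a basic training loop; returns average loss per epoch."""
    model.to(device)   # move parameters to the target device
    model.train()      # assumption: enable training-mode behavior (dropout, batchnorm)
    epoch_losses = []
    for _ in range(num_epochs):
        running_loss = 0.0
        num_batches = 0
        for inputs, targets in train_loader:
            # Move the batch to the same device as the model.
            inputs = inputs.to(device)
            targets = targets.to(device)

            optimizer.zero_grad()              # 1. zero gradients
            outputs = model(inputs)            # 2. forward pass
            loss = loss_fn(outputs, targets)   # 3. loss computation
            loss.backward()                    # 4. backward pass
            optimizer.step()                   # 5. optimizer step

            running_loss += loss.item()
            num_batches += 1
        # Assumption: report 0.0 for an epoch with no batches.
        epoch_losses.append(running_loss / num_batches if num_batches else 0.0)
    return epoch_losses
```

Calling `optimizer.zero_grad()` at the top of each batch iteration keeps gradients from the previous batch out of the current update, because `backward()` adds to any existing `.grad` values instead of overwriting them.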

Quick Answer: This question evaluates practical implementation skills in PyTorch, focusing on model and device management, batch-wise training mechanics, gradient handling, and optimizer interaction.

Implement a simplified, judge-friendly version of a PyTorch training loop. Because the judge cannot serialize real PyTorch objects, use these simplified inputs:

- `model`: a dictionary `{'weights': [...], 'bias': ...}` representing a single linear layer.
- `train_loader`: a list of batches. Each batch is `(inputs, targets)`, where `inputs` is a list of samples, each sample is a list of feature values, and `targets` is a list of scalar labels.
- `optimizer`: a dictionary `{'lr': ...}`.
- `loss_fn`: always the string `'mse'`.
- `device`: either `'cpu'` or `'cuda'`. Device movement is simulated; include the device string in the returned result.
- `num_epochs`: the number of epochs to run.

For each epoch, iterate through the batches in order and perform the standard training-loop steps:

1. Zero gradients.
2. Forward pass.
3. Compute mean squared error loss.
4. Backward pass to compute gradients.
5. Optimizer step using gradient descent.

The model prediction for one sample is:

`prediction = dot(weights, sample) + bias`

The loss for one batch is the mean squared error:

`MSE = mean((prediction - target)^2)`

Return a dictionary with the final weights (`'weights'`), the final bias (`'bias'`), the average training loss for each epoch (`'losses'`), and the device string (`'device'`), as in the examples below. The epoch loss is the average of that epoch's batch losses. If `train_loader` is empty, the average loss for that epoch is `0.0`. Round all returned floating-point values to 6 decimal places.
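One possible pure-Python solution to the simplified version is sketched below. It is not the official solution. The function name `train`, the argument order, and the result keys are inferred from the example inputs and outputs, and the choice to round only at the end (not after every step) is an assumption.

```python
def train(model, train_loader, optimizer, loss_fn, device, num_epochs):
    """Sketch: gradient descent on a single linear layer with MSE loss."""
    weights = [float(w) for w in model["weights"]]
    bias = float(model["bias"])
    lr = optimizer["lr"]
    losses = []

    for _ in range(num_epochs):
        batch_losses = []
        for inputs, targets in train_loader:
            n = len(inputs)

            # 1. Zero gradients.
            grad_w = [0.0] * len(weights)
            grad_b = 0.0

            # 2. Forward pass: prediction = dot(weights, sample) + bias.
            preds = [
                sum(w * x for w, x in zip(weights, sample)) + bias
                for sample in inputs
            ]

            # 3. Mean squared error for this batch.
            errors = [p - t for p, t in zip(preds, targets)]
            batch_losses.append(sum(e * e for e in errors) / n)

            # 4. Backward pass: dMSE/dpred = 2 * (pred - target) / n.
            for sample, err in zip(inputs, errors):
                d_pred = 2.0 * err / n
                for j, x in enumerate(sample):
                    grad_w[j] += d_pred * x
                grad_b += d_pred

            # 5. Optimizer step (plain gradient descent).
            weights = [w - lr * g for w, g in zip(weights, grad_w)]
            bias -= lr * grad_b

        # An epoch with no batches reports 0.0 by definition.
        losses.append(sum(batch_losses) / len(batch_losses) if batch_losses else 0.0)

    return {
        "weights": [round(w, 6) for w in weights],
        "bias": round(bias, 6),
        "losses": [round(loss, 6) for loss in losses],
        "device": device,  # device movement is only simulated
    }


result = train({"weights": [0.0], "bias": 0.0},
               [([[1.0], [2.0]], [2.0, 4.0])],
               {"lr": 0.1}, "mse", "cpu", 1)
print(result)  # {'weights': [1.0], 'bias': 0.6, 'losses': [10.0], 'device': 'cpu'}
```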

Constraints

0 <= num_epochs <= 100
1 <= len(model['weights']) <= 20
Each sample contains exactly len(model['weights']) features
0 <= total number of samples across all batches <= 10^4
Each batch is non-empty, except that `train_loader` itself may be empty
loss_fn is always 'mse'

Examples

Input: ({'weights': [0.0], 'bias': 0.0}, [([[1.0], [2.0]], [2.0, 4.0])], {'lr': 0.1}, 'mse', 'cpu', 1)

Expected Output: {'weights': [1.0], 'bias': 0.6, 'losses': [10.0], 'device': 'cpu'}

Explanation: Starting from zero, the batch predictions are [0, 0], so the batch MSE is 10.0. One gradient descent step updates the weight to 1.0 and the bias to 0.6.

Input: ({'weights': [0.0], 'bias': 0.0}, [([[1.0]], [1.0]), ([[2.0]], [2.0])], {'lr': 0.1}, 'mse', 'cuda', 2)

Expected Output: {'weights': [0.7696], 'bias': 0.4608, 'losses': [1.48, 0.039168], 'device': 'cuda'}

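Explanation: This case has two epochs and two batches, so the loop must repeat zero-grad, forward, backward, and step for every batch, carrying the updated parameters from one batch to the next. In epoch 1 the batch losses are 1.0 and 1.96 (average 1.48), and the parameters after the epoch are weight 0.76 and bias 0.48. Each returned loss is the average of that epoch's batch losses.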

Input: ({'weights': [1.0, -1.0], 'bias': 0.5}, [], {'lr': 0.01}, 'mse', 'cpu', 2)

Expected Output: {'weights': [1.0, -1.0], 'bias': 0.5, 'losses': [0.0, 0.0], 'device': 'cpu'}

Explanation: With no batches, no parameter updates occur. By definition in this problem, each epoch's average loss is 0.0.

Input: ({'weights': [0.0, 0.0], 'bias': 0.0}, [([[1.0, 2.0], [3.0, 4.0]], [5.0, 11.0])], {'lr': 0.01}, 'mse', 'cpu', 1)

Expected Output: {'weights': [0.38, 0.54], 'bias': 0.16, 'losses': [73.0], 'device': 'cpu'}

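Explanation: This verifies that gradients are computed separately for each weight in a multi-feature linear model. Here the gradients are -38 and -54 for the two weights and -16 for the bias, so with a learning rate of 0.01 the parameters move from zero to 0.38, 0.54, and 0.16.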

Hints

Keep the training loop order strict: zero gradients, forward pass, loss computation, backward pass, then optimizer step.
For MSE on a batch of size n, the derivative with respect to each prediction is `2 * (pred - target) / n`.
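The derivative in the second hint can be checked numerically. The sketch below applies the chain rule to get per-weight and bias gradients, then compares one of them against a central finite difference. The helper names `mse`, `analytic_grads`, and `numeric_grad_w` are hypothetical and exist only for this illustration.

```python
def mse(weights, bias, inputs, targets):
    """Mean squared error of a single linear layer on one batch."""
    preds = [sum(w * x for w, x in zip(weights, s)) + bias for s in inputs]
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(inputs)


def analytic_grads(weights, bias, inputs, targets):
    """Chain rule: dL/dw_j = sum_i d_pred_i * x_ij and dL/db = sum_i d_pred_i."""
    n = len(inputs)
    grad_w = [0.0] * len(weights)
    grad_b = 0.0
    for sample, target in zip(inputs, targets):
        pred = sum(w * x for w, x in zip(weights, sample)) + bias
        d_pred = 2.0 * (pred - target) / n  # derivative from the hint
        for j, x in enumerate(sample):
            grad_w[j] += d_pred * x
        grad_b += d_pred
    return grad_w, grad_b


def numeric_grad_w(weights, bias, inputs, targets, j, eps=1e-5):
    """Central finite difference of the loss with respect to weights[j]."""
    up = list(weights)
    down = list(weights)
    up[j] += eps
    down[j] -= eps
    return (mse(up, bias, inputs, targets) - mse(down, bias, inputs, targets)) / (2 * eps)


inputs, targets = [[1.0, 2.0], [3.0, 4.0]], [5.0, 11.0]
grad_w, grad_b = analytic_grads([0.0, 0.0], 0.0, inputs, targets)
print(grad_w, grad_b)  # [-38.0, -54.0] -16.0
print(abs(numeric_grad_w([0.0, 0.0], 0.0, inputs, targets, 0) - grad_w[0]) < 1e-4)  # True
```

A check like this is only a debugging aid; the submitted solution needs just the analytic gradients.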
