How do I practice coding and algorithm questions?

Use PracHub's coding console to write, test, and debug your solutions in Python or JavaScript. View hints, test against sample inputs, and compare with official solutions.

What difficulty level is this coding question?

This is a easy difficulty Coding & Algorithms question, commonly asked during Technical Screen rounds at Tesla.

What role is this question designed for?

This question is commonly asked for Machine Learning Engineer candidates at Tesla during technical interviews.

Implement 2D convolution forward pass

Company: Tesla

Role: Machine Learning Engineer

Category: Coding & Algorithms

Difficulty: easy

Interview Round: Technical Screen

## Problem Implement the **forward pass of a 2D convolution (conv2d)** from scratch (no deep learning libraries). You are given: - Input tensor `x` with shape **(N, C_in, H, W)** (NCHW layout) - Filter weights `w` with shape **(C_out, C_in, K_h, K_w)** - Optional bias `b` with shape **(C_out,)** (may be `None`) - Integer stride `s_h, s_w` - Integer padding `p_h, p_w` (zero-padding applied to height/width) Compute the output tensor `y` with shape **(N, C_out, H_out, W_out)** where: \[ H_{out} = \left\lfloor \frac{H + 2p_h - K_h}{s_h} \right\rfloor + 1, \quad W_{out} = \left\lfloor \frac{W + 2p_w - K_w}{s_w} \right\rfloor + 1 \] For each output element: \[ y[n, c_{out}, i, j] = \sum_{c_{in}=0}^{C_{in}-1} \sum_{u=0}^{K_h-1} \sum_{v=0}^{K_w-1} \; x_{pad}[n, c_{in}, i\cdot s_h + u, j\cdot s_w + v] \cdot w[c_{out}, c_{in}, u, v] + b[c_{out}] \] where `x_pad` is `x` padded with zeros by `(p_h, p_w)`. ## Requirements - Return `y` as a dense numeric array/tensor. - Do not use existing convolution operators. - Handle edge cases such as `b is None`, non-square kernels, and different strides/padding. ## Constraints (typical for an interview unit test) - Sizes are small enough that an \(O(N\cdot C_{out}\cdot C_{in}\cdot H_{out}\cdot W_{out}\cdot K_h\cdot K_w)\) implementation passes. - Inputs are floating point numbers. ## Example (shape check) If `x` is `(1, 3, 32, 32)`, `w` is `(8, 3, 3, 3)`, stride `(1,1)`, padding `(1,1)`, then output shape is `(1, 8, 32, 32)`.

Quick Answer: This question evaluates understanding of 2D convolution mechanics and practical low-level tensor manipulation, including kernel shape handling, padding and stride effects, multi-channel input/output interactions, and optional bias integration.

Implement the forward pass of a 2D convolution (conv2d) from scratch, without any deep-learning library or built-in convolution operator. You are given (all tensors are plain nested Python lists of floats, NCHW layout): - `x`: input tensor of shape (N, C_in, H, W). - `w`: filter weights of shape (C_out, C_in, K_h, K_w). - `b`: bias. Either a list of length C_out, or an empty list / None meaning no bias. - `s_h`, `s_w`: integer stride along height and width. - `p_h`, `p_w`: integer zero-padding along height and width. Compute and return the output tensor `y` of shape (N, C_out, H_out, W_out) where: H_out = (H + 2*p_h - K_h) // s_h + 1 W_out = (W + 2*p_w - K_w) // s_w + 1 and for every output element: y[n][co][i][j] = sum over c_in, u, v of x_pad[n][c_in][i*s_h + u][j*s_w + v] * w[co][c_in][u][v] ( + b[co] if bias is present ) where x_pad is x zero-padded by (p_h, p_w). Do not use existing convolution operators. Handle non-square kernels, different strides/padding per axis, and the no-bias case. Example shape check: x of shape (1,3,32,32), w of shape (8,3,3,3), stride (1,1), padding (1,1) -> output shape (1,8,32,32).

Constraints

1 <= N, C_in, C_out and all spatial dimensions are small (interview-scale unit tests)
Kernel may be non-square (K_h != K_w)
Stride and padding may differ between height and width (s_h != s_w, p_h != p_w)
Bias may be absent (empty list / None) or a list of length C_out
Inputs are floating-point numbers; output must be a dense (N, C_out, H_out, W_out) tensor
Padding is zero-padding; you must not use a built-in convolution operator

Examples

Input: ([[[[1.0, 2.0], [3.0, 4.0]]]], [[[[1.0, 0.0], [0.0, 1.0]]]], None, 1, 1, 0, 0)

Expected Output: [[[[5.0]]]]

Explanation: 1x1x2x2 input, identity-diagonal 2x2 kernel, no padding/stride 1 -> single output 1*1 + 4*1 = 5.

Input: ([[[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]]], [[[[1.0, 1.0], [1.0, 1.0]]]], [0.0], 1, 1, 0, 0)

Expected Output: [[[[12.0, 16.0], [24.0, 28.0]]]]

Explanation: 3x3 input, all-ones 2x2 kernel (box sum), bias 0 -> each output is the sum of a 2x2 window (e.g. top-left 1+2+4+5 = 12).

Input: ([[[[1.0, 1.0], [1.0, 1.0]]]], [[[[1.0]]]], [5.0], 1, 1, 1, 1)

Expected Output: [[[[5.0, 5.0, 5.0, 5.0], [5.0, 6.0, 6.0, 5.0], [5.0, 6.0, 6.0, 5.0], [5.0, 5.0, 5.0, 5.0]]]]

Explanation: 1x1 kernel with padding (1,1) grows a 2x2 input to 4x4; bias 5 appears everywhere, and the inner 2x2 (the real pixels) add an extra 1 to make 6.

Input: ([[[[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0], [9.0, 10.0, 11.0, 12.0], [13.0, 14.0, 15.0, 16.0]]]], [[[[1.0, 0.0], [0.0, 1.0]]]], None, 2, 2, 0, 0)

Expected Output: [[[[7.0, 11.0], [23.0, 27.0]]]]

Explanation: Stride (2,2) with a diagonal 2x2 kernel on a 4x4 input -> 2x2 output; top-left window picks 1 and 6 -> 7.

Input: ([[[[1.0, 2.0], [3.0, 4.0]], [[10.0, 20.0], [30.0, 40.0]]]], [[[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]], [1.0], 1, 1, 0, 0)

Expected Output: [[[[211.0]]]]

Explanation: Two input channels summed: channel0 box-sum (1+2+3+4)=10, channel1 weighted by 2 (10+20+30+40)*2=200, plus bias 1 = 211.

Input: ([[[[2.0, -1.0, 3.0], [0.0, 5.0, -2.0], [1.0, 1.0, 1.0]]]], [[[[1.0, -1.0, 1.0]]]], None, 1, 1, 0, 1)

Expected Output: [[[[-3.0, 6.0, -4.0], [5.0, -7.0, 7.0], [0.0, 1.0, 0.0]]]]

Explanation: Asymmetric padding p_h=0, p_w=1 with a 1x3 kernel [1,-1,1] and no bias; preserves a 3x3 output, exercising horizontal zero-padding and negative values.

Hints

First compute the output dimensions: H_out = (H + 2*p_h - K_h) // s_h + 1 and W_out = (W + 2*p_w - K_w) // s_w + 1.
You don't have to physically build a padded input. Map each output position (i, j) back to the source index row = i*s_h + u - p_h and col = j*s_w + v - p_w, and simply skip (treat as zero) any index that falls outside [0, H) or [0, W).
Loop order n -> c_out -> i -> j, then accumulate over c_in -> u -> v. Initialize the accumulator with b[c_out] when bias is present, otherwise 0.
Watch the edge cases the unit test targets: no bias, non-square kernels, and asymmetric stride/padding.

Quick Overview

This question evaluates understanding of 2D convolution mechanics and practical low-level tensor manipulation, including kernel shape handling, padding and stride effects, multi-channel input/output interactions, and optional bias integration.