This question evaluates understanding of 2D convolution mechanics and practical low-level tensor manipulation, including kernel shape handling, padding and stride effects, multi-channel input/output interactions, and optional bias integration.
Implement the forward pass of a 2D convolution (conv2d) from scratch (no deep learning libraries).
You are given:
- Input x with shape (N, C_in, H, W) (NCHW layout)
- Weights w with shape (C_out, C_in, K_h, K_w)
- Bias b with shape (C_out,) (may be None)
- Stride s_h, s_w
- Padding p_h, p_w (zero-padding applied to height/width)
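Zero-padding by (p_h, p_w) can be read as adding p_h rows of zeros above and below each feature map and p_w columns of zeros to its left and right. A minimal sketch, assuming NumPy is allowed as the array library and that the padding is symmetric; the helper name zero_pad_nchw is hypothetical:

```python
import numpy as np

def zero_pad_nchw(x, p_h, p_w):
    """Zero-pad the H and W axes of an NCHW array symmetrically."""
    # No padding on the batch and channel axes;
    # p_h rows on top and bottom, p_w columns on left and right.
    pad_width = ((0, 0), (0, 0), (p_h, p_h), (p_w, p_w))
    return np.pad(x, pad_width, mode="constant", constant_values=0)

x = np.ones((1, 3, 4, 5))
x_pad = zero_pad_nchw(x, p_h=1, p_w=2)
print(x_pad.shape)  # (1, 3, 6, 9)
```

Only the two spatial axes grow: H becomes H + 2*p_h and W becomes W + 2*p_w, while N and C_in are unchanged.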
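Compute the output tensor y with shape (N, C_out, H_out, W_out) where:

$$H_{\text{out}} = \left\lfloor \frac{H + 2 p_h - K_h}{s_h} \right\rfloor + 1, \qquad W_{\text{out}} = \left\lfloor \frac{W + 2 p_w - K_w}{s_w} \right\rfloor + 1$$

For each output element:

$$y[n, c_{\text{out}}, i, j] = b[c_{\text{out}}] + \sum_{c=0}^{C_{\text{in}}-1} \sum_{u=0}^{K_h-1} \sum_{v=0}^{K_w-1} x_{\text{pad}}[n, c, i \cdot s_h + u, j \cdot s_w + v] \cdot w[c_{\text{out}}, c, u, v]$$

where x_pad is x padded with zeros by (p_h, p_w) on each side of the height and width axes, and the b term is dropped when b is None.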
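One way to approach the forward pass is sketched below. It is a reference sketch rather than a definitive solution, and it rests on several assumptions: NumPy is permitted as the array library, padding is symmetric, the operation is cross-correlation (no kernel flip), and the output size follows the usual floor-division formula. The name conv2d_forward and its argument layout are hypothetical.

```python
import numpy as np

def conv2d_forward(x, w, b=None, stride=(1, 1), padding=(0, 0)):
    """Naive conv2d forward pass for NCHW input (cross-correlation)."""
    n, c_in, h, w_in = x.shape
    c_out, c_in_w, k_h, k_w = w.shape
    if c_in != c_in_w:
        raise ValueError("x and w disagree on C_in")
    s_h, s_w = stride
    p_h, p_w = padding

    # Assumed output size: floor((size + 2*pad - kernel) / stride) + 1.
    h_out = (h + 2 * p_h - k_h) // s_h + 1
    w_out = (w_in + 2 * p_w - k_w) // s_w + 1
    if h_out <= 0 or w_out <= 0:
        raise ValueError("kernel does not fit inside the padded input")

    # Zero-pad only the two spatial axes.
    x_pad = np.pad(x, ((0, 0), (0, 0), (p_h, p_h), (p_w, p_w)), mode="constant")
    y = np.zeros((n, c_out, h_out, w_out), dtype=np.result_type(x.dtype, w.dtype))

    for i in range(h_out):
        for j in range(w_out):
            # Receptive field for output position (i, j): (N, C_in, K_h, K_w).
            patch = x_pad[:, :, i * s_h:i * s_h + k_h, j * s_w:j * s_w + k_w]
            # Sum over C_in, K_h, K_w for every (sample, output channel) pair.
            y[:, :, i, j] = np.tensordot(patch, w, axes=([1, 2, 3], [1, 2, 3]))

    if b is not None:
        # Broadcast one bias value per output channel.
        y = y + np.asarray(b).reshape(1, c_out, 1, 1)
    return y

y = conv2d_forward(np.ones((1, 3, 32, 32)), np.ones((8, 3, 3, 3)),
                   b=None, stride=(1, 1), padding=(1, 1))
print(y.shape)  # (1, 8, 32, 32)
```

The loop runs only over output positions and leaves the channel and kernel sums to np.tensordot. Fully explicit loops over n, c_out, c_in, u, and v would compute the same values more slowly.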
Return y as a dense numeric array/tensor.
Handle b is None, non-square kernels, and different strides/padding.
Example: if x is (1, 3, 32, 32), w is (8, 3, 3, 3), stride is (1, 1), and padding is (1, 1), then the output shape is (1, 8, 32, 32), since floor((32 + 2*1 - 3) / 1) + 1 = 32 along each spatial axis.
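The shape arithmetic in this example can be checked without touching any data. The sketch below assumes the usual floor-division rule, out = floor((size + 2*pad - kernel) / stride) + 1; the helper name conv2d_output_shape is hypothetical.

```python
def conv2d_output_shape(x_shape, w_shape, stride, padding):
    """Return (N, C_out, H_out, W_out) for NCHW input and OIHW weights."""
    n, c_in, h, w = x_shape
    c_out, c_in_w, k_h, k_w = w_shape
    if c_in != c_in_w:
        raise ValueError("x and w disagree on C_in")
    s_h, s_w = stride
    p_h, p_w = padding
    # Integer floor division implements the floor in the formula.
    h_out = (h + 2 * p_h - k_h) // s_h + 1
    w_out = (w + 2 * p_w - k_w) // s_w + 1
    return (n, c_out, h_out, w_out)

# The example above: 3x3 kernel, stride 1, padding 1 preserves 32x32.
print(conv2d_output_shape((1, 3, 32, 32), (8, 3, 3, 3), (1, 1), (1, 1)))  # (1, 8, 32, 32)
# A non-square 3x5 kernel with stride 2 and padding (1, 2).
print(conv2d_output_shape((1, 3, 32, 32), (8, 3, 3, 5), (2, 2), (1, 2)))  # (1, 8, 16, 16)
```

Computing the shape first is a cheap way to validate stride, padding, and kernel combinations before allocating the output array.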