PracHub
QuestionsCoachesLearningGuidesInterview Prep

Quick Overview

This question evaluates understanding of 2D convolution mechanics and practical low-level tensor manipulation, including kernel shape handling, padding and stride effects, multi-channel input/output interactions, and optional bias integration.

  • easy
  • Tesla
  • Coding & Algorithms
  • Machine Learning Engineer

Implement 2D convolution forward pass

Company: Tesla

Role: Machine Learning Engineer

Category: Coding & Algorithms

Difficulty: easy

Interview Round: Technical Screen

## Problem Implement the **forward pass of a 2D convolution (conv2d)** from scratch (no deep learning libraries). You are given: - Input tensor `x` with shape **(N, C_in, H, W)** (NCHW layout) - Filter weights `w` with shape **(C_out, C_in, K_h, K_w)** - Optional bias `b` with shape **(C_out,)** (may be `None`) - Integer stride `s_h, s_w` - Integer padding `p_h, p_w` (zero-padding applied to height/width) Compute the output tensor `y` with shape **(N, C_out, H_out, W_out)** where: \[ H_{out} = \left\lfloor \frac{H + 2p_h - K_h}{s_h} \right\rfloor + 1, \quad W_{out} = \left\lfloor \frac{W + 2p_w - K_w}{s_w} \right\rfloor + 1 \] For each output element: \[ y[n, c_{out}, i, j] = \sum_{c_{in}=0}^{C_{in}-1} \sum_{u=0}^{K_h-1} \sum_{v=0}^{K_w-1} \; x_{pad}[n, c_{in}, i\cdot s_h + u, j\cdot s_w + v] \cdot w[c_{out}, c_{in}, u, v] + b[c_{out}] \] where `x_pad` is `x` padded with zeros by `(p_h, p_w)`. ## Requirements - Return `y` as a dense numeric array/tensor. - Do not use existing convolution operators. - Handle edge cases such as `b is None`, non-square kernels, and different strides/padding. ## Constraints (typical for an interview unit test) - Sizes are small enough that an \(O(N\cdot C_{out}\cdot C_{in}\cdot H_{out}\cdot W_{out}\cdot K_h\cdot K_w)\) implementation passes. - Inputs are floating point numbers. ## Example (shape check) If `x` is `(1, 3, 32, 32)`, `w` is `(8, 3, 3, 3)`, stride `(1,1)`, padding `(1,1)`, then output shape is `(1, 8, 32, 32)`.

Quick Answer: This question evaluates understanding of 2D convolution mechanics and practical low-level tensor manipulation, including kernel shape handling, padding and stride effects, multi-channel input/output interactions, and optional bias integration.

Implement the forward pass of a 2D convolution (conv2d) from scratch, without any deep-learning library or built-in convolution operator. You are given (all tensors are plain nested Python lists of floats, NCHW layout): - `x`: input tensor of shape (N, C_in, H, W). - `w`: filter weights of shape (C_out, C_in, K_h, K_w). - `b`: bias. Either a list of length C_out, or an empty list / None meaning no bias. - `s_h`, `s_w`: integer stride along height and width. - `p_h`, `p_w`: integer zero-padding along height and width. Compute and return the output tensor `y` of shape (N, C_out, H_out, W_out) where: H_out = (H + 2*p_h - K_h) // s_h + 1 W_out = (W + 2*p_w - K_w) // s_w + 1 and for every output element: y[n][co][i][j] = sum over c_in, u, v of x_pad[n][c_in][i*s_h + u][j*s_w + v] * w[co][c_in][u][v] ( + b[co] if bias is present ) where x_pad is x zero-padded by (p_h, p_w). Do not use existing convolution operators. Handle non-square kernels, different strides/padding per axis, and the no-bias case. Example shape check: x of shape (1,3,32,32), w of shape (8,3,3,3), stride (1,1), padding (1,1) -> output shape (1,8,32,32).

Constraints

  • 1 <= N, C_in, C_out and all spatial dimensions are small (interview-scale unit tests)
  • Kernel may be non-square (K_h != K_w)
  • Stride and padding may differ between height and width (s_h != s_w, p_h != p_w)
  • Bias may be absent (empty list / None) or a list of length C_out
  • Inputs are floating-point numbers; output must be a dense (N, C_out, H_out, W_out) tensor
  • Padding is zero-padding; you must not use a built-in convolution operator

Examples

Input: ([[[[1.0, 2.0], [3.0, 4.0]]]], [[[[1.0, 0.0], [0.0, 1.0]]]], None, 1, 1, 0, 0)

Expected Output: [[[[5.0]]]]

Explanation: 1x1x2x2 input, identity-diagonal 2x2 kernel, no padding/stride 1 -> single output 1*1 + 4*1 = 5.

Input: ([[[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]]], [[[[1.0, 1.0], [1.0, 1.0]]]], [0.0], 1, 1, 0, 0)

Expected Output: [[[[12.0, 16.0], [24.0, 28.0]]]]

Explanation: 3x3 input, all-ones 2x2 kernel (box sum), bias 0 -> each output is the sum of a 2x2 window (e.g. top-left 1+2+4+5 = 12).

Input: ([[[[1.0, 1.0], [1.0, 1.0]]]], [[[[1.0]]]], [5.0], 1, 1, 1, 1)

Expected Output: [[[[5.0, 5.0, 5.0, 5.0], [5.0, 6.0, 6.0, 5.0], [5.0, 6.0, 6.0, 5.0], [5.0, 5.0, 5.0, 5.0]]]]

Explanation: 1x1 kernel with padding (1,1) grows a 2x2 input to 4x4; bias 5 appears everywhere, and the inner 2x2 (the real pixels) add an extra 1 to make 6.

Input: ([[[[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0], [9.0, 10.0, 11.0, 12.0], [13.0, 14.0, 15.0, 16.0]]]], [[[[1.0, 0.0], [0.0, 1.0]]]], None, 2, 2, 0, 0)

Expected Output: [[[[7.0, 11.0], [23.0, 27.0]]]]

Explanation: Stride (2,2) with a diagonal 2x2 kernel on a 4x4 input -> 2x2 output; top-left window picks 1 and 6 -> 7.

Input: ([[[[1.0, 2.0], [3.0, 4.0]], [[10.0, 20.0], [30.0, 40.0]]]], [[[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]], [1.0], 1, 1, 0, 0)

Expected Output: [[[[211.0]]]]

Explanation: Two input channels summed: channel0 box-sum (1+2+3+4)=10, channel1 weighted by 2 (10+20+30+40)*2=200, plus bias 1 = 211.

Input: ([[[[2.0, -1.0, 3.0], [0.0, 5.0, -2.0], [1.0, 1.0, 1.0]]]], [[[[1.0, -1.0, 1.0]]]], None, 1, 1, 0, 1)

Expected Output: [[[[-3.0, 6.0, -4.0], [5.0, -7.0, 7.0], [0.0, 1.0, 0.0]]]]

Explanation: Asymmetric padding p_h=0, p_w=1 with a 1x3 kernel [1,-1,1] and no bias; preserves a 3x3 output, exercising horizontal zero-padding and negative values.

Hints

  1. First compute the output dimensions: H_out = (H + 2*p_h - K_h) // s_h + 1 and W_out = (W + 2*p_w - K_w) // s_w + 1.
  2. You don't have to physically build a padded input. Map each output position (i, j) back to the source index row = i*s_h + u - p_h and col = j*s_w + v - p_w, and simply skip (treat as zero) any index that falls outside [0, H) or [0, W).
  3. Loop order n -> c_out -> i -> j, then accumulate over c_in -> u -> v. Initialize the accumulator with b[c_out] when bias is present, otherwise 0.
  4. Watch the edge cases the unit test targets: no bias, non-square kernels, and asymmetric stride/padding.
Last updated: Jun 26, 2026

Loading coding console...

PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • AI Coding Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Generate Per-Position Guess Feedback - Tesla (easy)
  • Write SQL Data Transformation Queries - Tesla (medium)
  • Implement a Rollback Key-Value Store - Tesla (hard)
  • Compute suffix sums over waypoints - Tesla (hard)
  • Compute time to burn tree - Tesla (medium)