Solve Seven Python Image and Data Tasks
Company: Luma
Role: Machine Learning Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Technical Screen
Quick Answer: This multi-part Python assessment evaluates image and data processing competencies including array and matrix manipulation (center crop, rotation), numerical stability and probabilistic normalization (softmax), Gaussian kernel construction for blurring, and basic algorithmic/geometry checks (triangle inequality).
Part 1: Center Crop
Constraints
- 1 <= H, W
- 1 <= crop_height <= H
- 1 <= crop_width <= W
- H - crop_height is even
- W - crop_width is even
- Do not use NumPy
Examples
Input: ([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]], 2, 2)
Expected Output: [[6, 7], [10, 11]]
Explanation: Removing one row from the top and bottom and one column from the left and right leaves the center 2 x 2 block.
Input: ([[1, 2, 3], [4, 5, 6]], 2, 3)
Expected Output: [[1, 2, 3], [4, 5, 6]]
Explanation: The requested crop is the full image.
Input: ([[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12], [13, 14, 15, 16, 17, 18], [19, 20, 21, 22, 23, 24], [25, 26, 27, 28, 29, 30]], 3, 2)
Expected Output: [[9, 10], [15, 16], [21, 22]]
Explanation: The crop starts at row 1 and column 2.
Input: ([[42]], 1, 1)
Expected Output: [[42]]
Explanation: Single-pixel image edge case.
Hints
- Compute the first row and first column of the crop using integer division.
- List slicing can copy each selected row segment into the returned image.
Part 2: Rotate Image 90 Degrees Clockwise
Constraints
- 1 <= H, W
- All rows have the same length
- The input image should not be mutated
- Direct rotation helpers such as numpy.rot90 are not needed
Examples
Input: ([[1, 2, 3], [4, 5, 6]],)
Expected Output: [[4, 1], [5, 2], [6, 3]]
Explanation: The 2 x 3 image becomes a 3 x 2 image.
Input: ([[1, 2, 3], [4, 5, 6], [7, 8, 9]],)
Expected Output: [[7, 4, 1], [8, 5, 2], [9, 6, 3]]
Explanation: Standard square matrix clockwise rotation.
Input: ([[1, 2, 3, 4]],)
Expected Output: [[1], [2], [3], [4]]
Explanation: A single row becomes a single column.
Input: ([[1], [2], [3]],)
Expected Output: [[3, 2, 1]]
Explanation: A single column becomes a single row.
Hints
- In the rotated image, each new row corresponds to one original column.
- For a clockwise rotation, read each original column from bottom to top.
Part 3: Softmax
Constraints
- 0 <= len(z) <= 100000
- Each z[i] is a finite float
- Do not use NumPy, PyTorch, or similar libraries
- Use a numerically stable implementation
Examples
Input: ([1.0, 2.0, 3.0],)
Expected Output: [0.09003057317038046, 0.24472847105479764, 0.6652409557748218]
Explanation: The largest logit receives the highest probability.
Input: ([1000.0, 1000.0],)
Expected Output: [0.5, 0.5]
Explanation: A stable implementation handles large equal logits without overflow.
Input: ([-1.0],)
Expected Output: [1.0]
Explanation: A single element always has probability 1.
Input: ([],)
Expected Output: []
Explanation: Empty input edge case.
Input: ([0.0, 0.0, 0.0],)
Expected Output: [0.3333333333333333, 0.3333333333333333, 0.3333333333333333]
Explanation: Equal logits produce a uniform distribution.
Hints
- Subtracting the maximum logit from every logit does not change the final probabilities.
- After shifting, exponentials are at most exp(0), which avoids overflow.
Part 4: Gaussian Blur Kernel
Constraints
- 1 <= k <= 101
- k is odd
- sigma > 0
- Return floating-point values
- The center cell has offset 0, 0
Examples
Input: (1, 1.0)
Expected Output: [[1.0]]
Explanation: A 1 x 1 kernel normalizes to a single value of 1.
Input: (3, 1.0)
Expected Output: [[0.07511360795411151, 0.12384140315297396, 0.07511360795411151], [0.12384140315297396, 0.2041799555716581, 0.12384140315297396], [0.07511360795411151, 0.12384140315297396, 0.07511360795411151]]
Explanation: A symmetric 3 x 3 Gaussian kernel with sigma 1.
Input: (3, 0.5)
Expected Output: [[0.011343736558495071, 0.08381950580221058, 0.011343736558495071], [0.08381950580221058, 0.6193470305571772, 0.08381950580221058], [0.011343736558495071, 0.08381950580221058, 0.011343736558495071]]
Explanation: A smaller sigma concentrates more weight in the center.
Hints
- Let center = k // 2. For row r and column c, offsets are y = r - center and x = c - center.
- The constant factor in the Gaussian formula is allowed, but normalization will make the final sum equal to 1 either way.
Part 5: Triangle From Adjacent Elements
Constraints
- 3 <= len(arr) <= 1000
- 1 <= arr[i] <= 1000000000
- A triangle requires a + b > c, a + c > b, and b + c > a
Examples
Input: ([1, 2, 2, 4],)
Expected Output: [1, 0]
Explanation: 1, 2, 2 forms a triangle, but 2, 2, 4 is degenerate.
Input: ([2, 10, 2, 10, 2],)
Expected Output: [0, 1, 0]
Explanation: Only the middle triple 10, 2, 10 can form a triangle.
Input: ([1000000000, 1000000000, 1000000000, 1000000000],)
Expected Output: [1, 1]
Explanation: Large equal side lengths form valid triangles.
Input: ([1, 1, 2],)
Expected Output: [0]
Explanation: Minimum length array with a degenerate triple.
Input: ([3, 4, 5],)
Expected Output: [1]
Explanation: A classic valid triangle.
Hints
- For positive side lengths, sorting the three values lets you check only whether the two smaller sides sum to more than the largest.
- Be careful with degenerate triples such as 1, 1, 2; equality is not enough.
Part 6: Minimum Euclidean Distance
Constraints
- 2 <= len(p) <= 20000
- Each point has exactly two integer coordinates
- abs(coordinate) <= 10000000
- Points may have duplicate coordinates
Examples
Input: ([[0, 11], [-7, 1], [-5, -3]],)
Expected Output: 4.47213595499958
Explanation: The closest pair is [-7, 1] and [-5, -3], with distance sqrt(20).
Input: ([[1, 1], [1, 1], [5, 5]],)
Expected Output: 0.0
Explanation: Duplicate points have distance 0 even though their indices differ.
Input: ([[0, 0], [3, 4]],)
Expected Output: 5.0
Explanation: Minimum-size input with exactly two points.
Input: ([[0, 0], [0, 2], [2, 0], [2, 2]],)
Expected Output: 2.0
Explanation: Several neighboring pairs are tied at distance 2.
Input: ([[-10000000, -10000000], [10000000, 10000000], [-9999999, -10000000]],)
Expected Output: 1.0
Explanation: Boundary coordinate values with a closest horizontal distance of 1.
Hints
- Sort points by x-coordinate and use a divide-and-conquer closest-pair algorithm.
- When merging two halves, only points close to the dividing line can improve the answer.
Part 7: Impute Missing Values and Normalize
Constraints
- 2 <= number of rows <= 50
- 1 <= number of columns <= 50
- All rows have the same number of columns
- Values are positive floats, except 0 marks missing values
- Each column has at least two distinct non-zero values
- Use population standard deviation
Examples
Input: ([[1.0, 2.0, 0.0], [0.0, 1.0, 1.0], [5.0, 6.0, 5.0]],)
Expected Output: [[-1.224744871391589, -0.4629100498862757, 0.0], [0.0, -0.9258200997725514, -1.224744871391589], [1.224744871391589, 1.3887301496588271, 1.224744871391589]]
Explanation: Zeros are replaced by column means, then each column is standardized.
Input: ([[1.0], [3.0]],)
Expected Output: [[-1.0], [1.0]]
Explanation: Minimum rows and one-column edge case with no missing values.
Input: ([[0.0, 2.0], [4.0, 0.0], [6.0, 8.0], [0.0, 10.0]],)
Expected Output: [[0.0, -1.585187847], [-1.414213562373095, 0.0], [1.414213562373095, 0.452910814], [0.0, 1.132277034]]
Explanation: Tests imputation in both columns, including multiple missing values in one column.
Input: ([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]],)
Expected Output: [[-1.224744871391589, -1.224744871391589], [0.0, 0.0], [1.224744871391589, 1.224744871391589]]
Explanation: No missing values; only column normalization is performed.
Hints
- Compute imputation means from the original non-zero values before replacing any zeros.
- After imputation, compute each column mean and standard deviation independently.