Implement K-Means and Detect Divisible Subarrays
Company: Microsoft
Role: Machine Learning Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Onsite
Quick Answer: This question evaluates a candidate's ability to implement an iterative clustering algorithm (K-means) and to reason about subarray sums modulo k, testing competencies in numerical computation, centroid management, edge-case handling, and efficient algorithm design.
Part 1: Implement K-Means Clustering
Constraints
- 1 <= n <= 500
- 1 <= d <= 20
- 1 <= k <= n
- 0 <= max_iterations <= 100
- 0.0 <= tolerance
- All points have the same dimension
Examples
Input: ([[0, 0], [0, 2], [10, 10], [10, 12]], 2, 10, 0.0)
Expected Output: ([[0.0, 1.0], [10.0, 11.0]], [0, 0, 1, 1])
Explanation: After a few iterations, the first two points form one cluster and the last two form the other.
Input: ([[0], [2], [1]], 2, 10, 0.0)
Expected Output: ([[0.5], [2.0]], [0, 1, 0])
Explanation: Point [1] is initially tied between the two centroids and must choose the smaller index.
Input: ([[1], [1], [10]], 2, 10, 0.0)
Expected Output: ([[10.0], [1.0]], [1, 1, 0])
Explanation: Because the initial centroids are duplicates, one cluster becomes empty in the first iteration and its centroid stays unchanged.
Input: ([[5, 5]], 1, 5, 0.0)
Expected Output: ([[5.0, 5.0]], [0])
Explanation: Single-point edge case.
Hints
- Write one helper to assign points to centroids, and another step to average points inside each cluster.
- When a cluster gets no points, do not move that centroid for that iteration.
Part 2: Detect a Divisible-Sum Subarray
Constraints
- 1 <= len(nums) <= 100000
- 0 <= nums[i] <= 1000000000
- 1 <= k <= 1000000000
Examples
Input: ([23, 2, 4, 6, 7], 6)
Expected Output: True
Explanation: The subarray [2, 4] sums to 6, which is divisible by 6.
Input: ([23, 2, 6, 4, 7], 13)
Expected Output: False
Explanation: No contiguous subarray of length at least 2 has sum divisible by 13.
Input: ([0, 0], 5)
Expected Output: True
Explanation: The subarray [0, 0] has sum 0, and 0 is divisible by any positive k.
Input: ([5], 5)
Expected Output: False
Explanation: A single-element subarray is not allowed because the required length is at least 2.
Hints
- If two prefix sums have the same remainder modulo k, the sum between them is divisible by k.
- Store the earliest index where each remainder appears so you can enforce subarray length at least 2.