This question evaluates competency in clustering algorithms, numerical computation, and practical algorithm implementation, focusing on core K-means concepts like distance-based grouping and centroid estimation. It is commonly asked because it exposes practical implementation ability, convergence reasoning, numerical edge-case handling (e.g.
Write a function to perform K-means clustering on a set of points.
X with shape (n_samples, n_features) (e.g., a NumPy array or PyTorch tensor).
k = number of clusters.
max_iters (e.g., 100)
tol for convergence (e.g., 1e-4)
X)
Return:
centroids: shape (k, n_features)
labels: length n_samples, where each label is in [0, k-1]
tol or max_iters reached).
k > n_samples.
(In the interview, correctness and reasoning matter more than making it fully runnable.)
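A minimal sketch of one way to meet the spec above, assuming NumPy input and Lloyd's algorithm with squared Euclidean distance. Several choices here are assumptions rather than part of the question: the `seed` parameter, random initialization from rows of X, measuring convergence as the norm of the centroid shift compared against tol, leaving an empty cluster's centroid where it was, and raising `ValueError` when k > n_samples.

```python
import numpy as np


def kmeans(X, k, max_iters=100, tol=1e-4, seed=0):
    """Lloyd's algorithm. Returns (centroids, labels)."""
    X = np.asarray(X, dtype=float)
    if X.ndim != 2:
        raise ValueError("X must have shape (n_samples, n_features)")
    n_samples = X.shape[0]
    # Assumed policy: reject k > n_samples instead of silently shrinking k.
    if not 1 <= k <= n_samples:
        raise ValueError("k must satisfy 1 <= k <= n_samples")

    rng = np.random.default_rng(seed)
    # Assumed initialization: k distinct rows of X chosen at random.
    centroids = X[rng.choice(n_samples, size=k, replace=False)].copy()

    def assign(points, centers):
        # Squared Euclidean distances via broadcasting, shape (n_samples, k).
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        return dists.argmin(axis=1)

    for _ in range(max_iters):
        labels = assign(X, centroids)

        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            # Assumed policy: an empty cluster keeps its previous centroid.
            if len(members) > 0:
                new_centroids[j] = members.mean(axis=0)

        # Convergence measure (assumption): Frobenius norm of the shift.
        shift = np.linalg.norm(new_centroids - centroids)
        centroids = new_centroids
        if shift < tol:
            break

    # Recompute labels so they match the centroids that are returned.
    labels = assign(X, centroids)
    return centroids, labels
```

The broadcasted distance computation builds an (n_samples, k, n_features) intermediate, which is simple to reason about but memory-hungry for large inputs. In an interview it is worth mentioning alternatives such as expanding the squared norm into dot products, and re-seeding empty clusters from the farthest points instead of leaving them in place.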