PracHub

Implement K-means clustering from scratch

Last updated: Mar 29, 2026

Quick Overview

This question tests your grasp of clustering algorithms, numerical computation, and practical algorithm implementation, with a focus on the core K-means ideas of distance-based grouping and centroid estimation. It is commonly asked because it exposes practical implementation ability, convergence reasoning, and handling of numerical edge cases.


Company: Microsoft

Role: Machine Learning Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Technical Screen



Feb 9, 2026, 12:00 AM

Task: Implement K-means clustering (from scratch)

Write a function to perform K-means clustering on a set of points.

Input

  • A dataset X with shape (n_samples, n_features) (e.g., a NumPy array or PyTorch tensor).
  • An integer k = number of clusters.
  • Optional parameters:
    • max_iters (e.g., 100)
    • tol for convergence (e.g., 1e-4)
    • Initialization method (e.g., random points from X)
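
One way to handle the inputs above is sketched here, assuming NumPy and the "random points from X" initialization. The helper name `init_centroids` and the `seed` parameter are illustrative assumptions, not part of the question. Rejecting `k > n_samples` up front is one reasonable policy; other policies (such as clamping `k`) could also be defended in an interview.

```python
import numpy as np

def init_centroids(X, k, seed=None):
    """Pick k distinct rows of X as starting centroids (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    if X.ndim != 2:
        raise ValueError("X must have shape (n_samples, n_features)")
    n_samples = X.shape[0]
    # Assumption: treat k > n_samples (or k < 1) as invalid input.
    if not 1 <= k <= n_samples:
        raise ValueError(f"k must be in [1, {n_samples}], got {k}")
    rng = np.random.default_rng(seed)
    # Sample row indices without replacement so no row is picked twice.
    idx = rng.choice(n_samples, size=k, replace=False)
    return X[idx].copy()
```

Sampling without replacement avoids picking the same row twice, although duplicate rows in `X` can still yield identical centroids.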

Output

Return:

  • centroids: shape (k, n_features)
  • labels: length n_samples, where each label is in [0, k-1]

Requirements / Discussion Points

  • Describe and implement the two core steps iteratively:
    1. Assignment step: assign each point to the nearest centroid (e.g., Euclidean distance).
    2. Update step: recompute each centroid as the mean of assigned points.
  • Define a stopping condition (e.g., centroid shift < tol or max_iters reached).
  • Handle edge cases (at least verbally), e.g.:
    • A cluster gets no points in an iteration (empty cluster).
    • k > n_samples.
  • You may use Python with NumPy and/or PyTorch, but do not call a library K-means implementation.
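
The requirements above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not a reference solution: the function names, the `seed` parameter, re-seeding an empty cluster with the point farthest from its current centroid, and measuring convergence by the largest single-centroid shift are all choices made here for the example.

```python
import numpy as np

def assign_labels(X, centroids):
    # Assignment step: squared Euclidean distances, shape (n_samples, k).
    diffs = X[:, None, :] - centroids[None, :, :]
    sq_dists = np.einsum("nkd,nkd->nk", diffs, diffs)
    return sq_dists.argmin(axis=1), sq_dists

def update_centroids(X, labels, sq_dists, centroids):
    # Update step: each centroid becomes the mean of its assigned points.
    new_centroids = centroids.copy()
    # Cost of each point to its own centroid, used to re-seed empty clusters.
    point_cost = sq_dists[np.arange(len(X)), labels].copy()
    for j in range(len(centroids)):
        members = X[labels == j]
        if len(members) > 0:
            new_centroids[j] = members.mean(axis=0)
        else:
            # Assumption: move an empty cluster to the worst-fit point.
            far = point_cost.argmax()
            new_centroids[j] = X[far]
            point_cost[far] = -1.0  # do not reuse this point for another empty cluster
    return new_centroids

def kmeans(X, k, max_iters=100, tol=1e-4, seed=None):
    X = np.asarray(X, dtype=float)
    n_samples = X.shape[0]
    # Assumption: k > n_samples is rejected rather than clamped.
    if not 1 <= k <= n_samples:
        raise ValueError(f"k must be in [1, {n_samples}], got {k}")
    rng = np.random.default_rng(seed)
    # Initialization: k distinct random rows of X.
    centroids = X[rng.choice(n_samples, size=k, replace=False)].copy()
    for _ in range(max_iters):
        labels, sq_dists = assign_labels(X, centroids)
        new_centroids = update_centroids(X, labels, sq_dists, centroids)
        # Stopping condition: largest centroid movement below tol.
        shift = np.linalg.norm(new_centroids - centroids, axis=1).max()
        centroids = new_centroids
        if shift < tol:
            break
    # Recompute labels so they match the returned centroids.
    labels, _ = assign_labels(X, centroids)
    return centroids, labels
```

The broadcasted distance computation builds an `(n_samples, k, n_features)` array, which is simple to explain but memory-hungry for large inputs. Each assignment and update step cannot increase the within-cluster sum of squared distances, which is the usual argument for why the loop settles, though possibly at a local optimum that depends on initialization.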

(In the interview, correctness and reasoning matter more than making it fully runnable.)

© 2026 PracHub. All rights reserved.