PracHub
QuestionsPremiumLearningGuidesCheatsheetNEWCoaches
|Home/Coding & Algorithms/TikTok

Implement K-means and run two iterations

Last updated: Mar 29, 2026

Quick Overview

This question evaluates understanding of k-means clustering concepts (k-means++ initialization, Lloyd iterations, and SSE) and proficiency in numerical implementation and vectorized array operations.

  • Medium
  • TikTok
  • Coding & Algorithms
  • Data Scientist

Implement K-means and run two iterations

Company: TikTok

Role: Data Scientist

Category: Coding & Algorithms

Difficulty: Medium

Interview Round: Technical Screen

Given points P={(0,0),(0,2),(2,0),(2,2),(8,8),(8,10),(10,8),(10,10)} and k=2, (1) initialize centroids with k-means++ using seed=42 and Euclidean distance; show the sampled centers. (2) Perform exactly two Lloyd iterations (assign → update) by hand, breaking ties by choosing the lower-index centroid; list cluster memberships and centroids after each iteration. (3) Compute SSE after the second iteration. (4) Describe how you would implement this efficiently in vectorized NumPy (no loops over points), including the time complexity per iteration, and explain how you would handle empty clusters robustly.

Quick Answer: This question evaluates understanding of k-means clustering concepts (k-means++ initialization, Lloyd iterations, and SSE) and proficiency in numerical implementation and vectorized array operations.

Related Interview Questions

  • Parse a nested list from a string - TikTok (medium)
  • Implement stacks, streaming median, and upward path sum - TikTok (easy)
  • Maximize sum with no adjacent elements - TikTok (medium)
  • Implement stack variants and path-sum check - TikTok (medium)
  • Find the longest palindromic substring - TikTok (easy)
TikTok logo
TikTok
Oct 13, 2025, 9:49 PM
Data Scientist
Technical Screen
Coding & Algorithms
2
0

Given points P={(0,0),(0,2),(2,0),(2,2),(8,8),(8,10),(10,8),(10,10)} and k=2, (1) initialize centroids with k-means++ using seed=42 and Euclidean distance; show the sampled centers. (2) Perform exactly two Lloyd iterations (assign → update) by hand, breaking ties by choosing the lower-index centroid; list cluster memberships and centroids after each iteration. (3) Compute SSE after the second iteration. (4) Describe how you would implement this efficiently in vectorized NumPy (no loops over points), including the time complexity per iteration, and explain how you would handle empty clusters robustly.

Comments (0)

Sign in to leave a comment

Loading comments...

Browse More Questions

More Coding & Algorithms•More TikTok•More Data Scientist•TikTok Data Scientist•TikTok Coding & Algorithms•Data Scientist Coding & Algorithms
PracHub

Master your tech interviews with 7,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.