PracHub

Which clustering algorithm would you use and why

Last updated: Jun 15, 2026

Quick Overview

A Meta Data Scientist machine learning screen on choosing a clustering algorithm for social-product users. It contrasts traditional feature-vector clustering (k-means, GMM, hierarchical, DBSCAN/HDBSCAN) with social-graph community detection (Louvain/Leiden, spectral, SBM, node embeddings), and covers preprocessing, choosing the number of clusters, evaluation, directed/weighted graphs, and scaling to millions of users.
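To make the "choosing the number of clusters" part of the overview concrete, here is a minimal sketch that scores several values of k with the mean silhouette coefficient and keeps the best one. It is an illustration under stated assumptions, not the reference answer: the helper names (`farthest_point_init`, `kmeans`, `silhouette`) and the toy 2-D points are hypothetical, and a deterministic farthest-point initialisation is swapped in for the usual random or k-means++ seeding so the run is reproducible. On real user features you would standardise or otherwise scale the columns first.

```python
import math

def farthest_point_init(points, k):
    """Deterministic seeding (an assumption for reproducibility, not k-means++):
    start from the first point, then repeatedly add the point farthest from
    all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers

def kmeans(points, k, iters=100):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    centers = farthest_point_init(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient with Euclidean distance (O(n^2), toy scale only)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two non-empty clusters")
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:  # convention: singleton clusters score 0
            scores.append(0.0)
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Toy data: three tight, well-separated groups of five points each.
offsets = [(-0.5, -0.5), (-0.5, 0.5), (0.5, -0.5), (0.5, 0.5), (0.0, 0.0)]
blob_centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [(cx + dx, cy + dy) for cx, cy in blob_centers for dx, dy in offsets]

scores = {k: silhouette(points, kmeans(points, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
print(best_k)
```

On this toy data the silhouette peaks at the true number of groups. On real user data the curve is usually flatter, so it is worth pairing with a stability check (re-clustering on resamples) and a downstream business metric.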



Company: Meta

Role: Data Scientist

Category: Machine Learning

Difficulty: medium

Interview Round: Technical Screen



Nov 2, 2025, 12:00 AM
Question

You need to cluster users of a social product (e.g. Meta) to discover meaningful groups such as communities, interest groups, or usage segments. You may have either or both of the following:

  • A user feature table — dense numeric/categorical features per user (age bucket, country, activity rate, topics engaged, embeddings, etc.).
  • A social network graph — nodes = users, edges = friendships / follows / messages / interactions, possibly weighted and directed.

Answer the following:

  1. Traditional (feature-vector) clustering. Which clustering algorithms would you consider (e.g. k-means, GMM, hierarchical, DBSCAN/HDBSCAN) and how would you choose among them? Describe preprocessing, distance/similarity choices, how you would pick the number of clusters, and how you would evaluate cluster quality.
  2. Social network / graph clustering. If the core data is a social graph instead, what algorithms would you use for community detection, and how does this differ fundamentally from clustering a feature matrix?
  3. Directed and weighted graphs. How do you handle direction and edge weights in graph clustering?
  4. Hybrid. How would you combine graph structure and user features when both are available?
  5. Choosing the number of clusters and evaluating quality. What metrics and validation strategy would you use for both the feature-vector and the graph case?
  6. Scale and operations. What practical issues arise at millions of users (compute, dynamic graphs, cold-start, drift) and how would you handle them?
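
For the directed/weighted and evaluation parts of the question (items 3 and 5), one common quality score is modularity generalised to directed, weighted graphs: Q = Σ_c [ W_c / m − (out_c · in_c) / m² ], where m is the total edge weight, W_c is the weight of edges inside community c, and out_c / in_c are the summed out- and in-strengths of its nodes. The sketch below only scores a given partition; it is not a community-detection algorithm (methods such as Louvain or Leiden search for partitions that make a score like this large). The function name `directed_modularity` and the toy graph are hypothetical.

```python
from collections import defaultdict

def directed_modularity(edges, community):
    """Directed, weighted modularity of a partition.

    edges: list of (src, dst, weight) tuples (a list, because it is read twice).
    community: dict mapping every node to its community label.
    """
    m = sum(w for _, _, w in edges)      # total edge weight
    internal = defaultdict(float)        # weight of edges inside each community
    out_strength = defaultdict(float)    # summed out-weights per community
    in_strength = defaultdict(float)     # summed in-weights per community
    for src, dst, w in edges:
        out_strength[community[src]] += w
        in_strength[community[dst]] += w
        if community[src] == community[dst]:
            internal[community[src]] += w
    return sum(internal[c] / m - out_strength[c] * in_strength[c] / m ** 2
               for c in set(community.values()))

# Toy graph (hypothetical): two directed triangles with heavy edges,
# joined by a single light edge c -> d.
edges = [
    ("a", "b", 3.0), ("b", "c", 3.0), ("c", "a", 3.0),
    ("d", "e", 3.0), ("e", "f", 3.0), ("f", "d", 3.0),
    ("c", "d", 1.0),
]

good = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}   # the two triangles
one = {n: 0 for n in "abcdef"}                             # everything together
mixed = {"a": 0, "d": 0, "b": 1, "e": 1, "c": 2, "f": 2}   # cuts across triangles

q_good = directed_modularity(edges, good)
q_one = directed_modularity(edges, one)
q_mixed = directed_modularity(edges, mixed)
print(q_good, q_one, q_mixed)
```

The partition that follows the two triangles scores highest, the single-community partition scores zero, and the partition that cuts across the triangles scores below zero. Edge direction enters through the separate out- and in-strength terms, and edge weight enters everywhere a plain edge count would otherwise be used.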
