Scenario
You need to cluster users to discover meaningful groups (e.g., communities, interest groups, or usage segments). You may have:
- Traditional tabular features per user (usage frequency, demographics, embeddings, etc.), and/or
- A social network graph (nodes = users, edges = friendships/follows/messages).
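
To make the two input types concrete, here is a minimal sketch of how each might be represented in memory. The user IDs, feature columns, and values are invented for illustration, and `build_adjacency` is a hypothetical helper, not part of any library. The sketch also assumes undirected edges (friendships); follows or messages would usually call for a directed representation.

```python
from collections import defaultdict

# Hypothetical toy data: IDs, columns, and values are invented for illustration.

# Input 1: tabular features, one fixed-length vector per user
# (assumed columns: sessions per week, age).
features = {
    "u1": [12.0, 34.0],
    "u2": [11.0, 29.0],
    "u3": [1.0, 52.0],
}

# Input 2: social graph as an undirected edge list (e.g., friendships).
edges = [("u1", "u2"), ("u2", "u3")]


def build_adjacency(edge_list):
    """Turn an undirected edge list into a node -> set-of-neighbors map."""
    adjacency = defaultdict(set)
    for a, b in edge_list:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return dict(adjacency)


adjacency = build_adjacency(edges)
```

In the first form each user is described independently by a vector, while in the second a user is described only by who they connect to.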
Questions
- What clustering algorithm(s) would you consider, and why?
- What are the key differences between traditional clustering (feature-vector based) and social network / graph clustering?
- How would you evaluate cluster quality and choose the number of clusters?
- What practical issues arise at scale (millions of users), and how would you handle them?