Choose clustering vs regression; explain KNN

This is a Machine Learning interview question from Thumbtack for Data Scientist roles.

Question

When would you use clustering vs. regression on a business problem with partially labeled outcomes? Specify the decision criteria (label availability, objective, evaluation metrics, cost of errors).

Enumerate at least four clustering algorithms (K-Means, Hierarchical/Agglomerative, DBSCAN/HDBSCAN, Gaussian Mixture Models) and compare their assumptions, key hyperparameters, scalability, distance metrics, and failure modes (e.g., non-spherical clusters, varying density, high-dimensional sparsity, mixed data types). Give concrete scenarios for selecting DBSCAN over K-Means and vice versa.

Finally, explain K-Nearest Neighbors to a non-technical stakeholder with a real-world analogy, then go deeper: choosing k, weighting by distance, the effects of feature scaling, the curse of dimensionality, and how to deploy KNN efficiently (KD-tree/ball-tree, approximate nearest neighbors).
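
As a rough illustration of the KNN follow-ups (distance weighting and feature scaling), here is a minimal pure-Python sketch. It is not taken from the original solution: the function names, the two features (annual spend in dollars, average rating) and the six labeled customers are all hypothetical and invented for the example. It uses brute-force search, which is only reasonable for tiny data; a KD-tree, ball-tree, or approximate index would replace the sort in a real deployment.

```python
import math
from collections import defaultdict

def standardize(rows):
    """Z-score each column; return scaled rows plus the means/stds used."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    scaled = [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]
    return scaled, means, stds

def knn_predict(train_X, train_y, query, k=3, weighted=True):
    """Brute-force KNN vote; closer neighbors count more when weighted=True."""
    nearest = sorted((math.dist(x, query), y) for x, y in zip(train_X, train_y))[:k]
    votes = defaultdict(float)
    for d, label in nearest:
        votes[label] += 1.0 / (d + 1e-9) if weighted else 1.0
    return max(votes, key=votes.get)

# Hypothetical data: (annual spend in dollars, average rating on a 1-5 scale).
raw_X = [[30000, 4.8], [34000, 4.5], [80000, 4.9],
         [52000, 1.2], [50000, 1.0], [48000, 1.5]]
y = ["repeat", "repeat", "repeat", "churn", "churn", "churn"]
raw_query = [51000, 4.7]

# Without scaling, the dollar column dominates the Euclidean distance.
unscaled_pred = knn_predict(raw_X, y, raw_query, k=3)

# With z-scoring, the rating column gets a comparable say in the distance.
scaled_X, means, stds = standardize(raw_X)
scaled_query = [(v - m) / s for v, m, s in zip(raw_query, means, stds)]
scaled_pred = knn_predict(scaled_X, y, scaled_query, k=3)

print(unscaled_pred, scaled_pred)
```

On this toy data the unscaled model matches the query to the three customers with similar spend, while the standardized model matches it to the three customers with similar ratings, so the predicted label flips. That is the practical reason to scale features before any distance-based method, and the same caveat applies to K-Means and DBSCAN.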
