Explain KNN and how to tune it
Company: Microsoft
Role: Data Scientist
Category: Machine Learning
Difficulty: easy
Interview Round: Technical Screen
## K-Nearest Neighbors (KNN) fundamentals
You are interviewing for a Data Scientist role. Address the following:
1. **Explain how the KNN algorithm works** for both classification and regression.
2. What are the key **hyperparameters** and design choices?
- Choice of **K**
- **Distance metric** (e.g., Euclidean, Manhattan, cosine)
- **Weighting** (uniform vs distance-weighted neighbors)
3. What **data preprocessing** is important for KNN and why? (e.g., feature scaling, handling missing values, categorical encoding)
4. Discuss the main **strengths, weaknesses, and failure modes** of KNN.
- Consider **class imbalance**, **high dimensionality**, and **large datasets**.
5. How would you **select K** and evaluate the model? Include at least one approach for avoiding overfitting.
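For items 1–3, a minimal from-scratch sketch can make the mechanics concrete. This assumes NumPy; the `knn_predict` name, signature, and defaults are illustrative, not part of the question:

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3, metric="euclidean",
                weighted=False, task="classification"):
    """Predict for a single query point x from its K nearest training points."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    if metric == "euclidean":
        dists = np.sqrt(((X_train - x) ** 2).sum(axis=1))
    elif metric == "manhattan":
        dists = np.abs(X_train - x).sum(axis=1)
    else:
        raise ValueError(f"unknown metric: {metric}")
    idx = np.argsort(dists)[:k]               # indices of the k closest points
    if task == "regression":
        if weighted:
            w = 1.0 / (dists[idx] + 1e-12)    # inverse-distance weights
            return float(np.average(y_train[idx], weights=w))
        return float(y_train[idx].mean())     # plain mean of neighbor targets
    # classification: (optionally distance-weighted) majority vote
    votes = {}
    for i in idx:
        w = 1.0 / (dists[i] + 1e-12) if weighted else 1.0
        votes[y_train[i]] = votes.get(y_train[i], 0.0) + w
    return max(votes, key=votes.get)

# Toy data: two well-separated clusters (values made up for illustration)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [5., 5.], [5., 6.], [6., 5.]])
y = np.array([0, 0, 0, 1, 1, 1])
print(knn_predict(X, y, np.array([0.5, 0.5]), k=3))  # → 0
```

Note that KNN has no real training phase: prediction cost scales with the size of the training set, which is one of the weaknesses item 4 asks about.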
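For the preprocessing point, a tiny numeric demo (the feature names and values below are made up) shows why Euclidean distance is dominated by the largest-scale feature until you standardize:

```python
import numpy as np

# Two features on very different scales: income (~1e4) and age (~1e1).
X = np.array([[30000., 25.], [31000., 60.], [90000., 26.]])
query = np.array([30500., 26.])

d_raw = np.linalg.norm(X - query, axis=1)
# Unscaled: income swamps age, so a 35-year age gap barely moves the distance.

mu, sigma = X.mean(axis=0), X.std(axis=0)
Xs, qs = (X - mu) / sigma, (query - mu) / sigma
d_scaled = np.linalg.norm(Xs - qs, axis=1)
# After z-scoring, the age difference contributes on the same scale as income.
```

The same reasoning motivates encoding categoricals carefully: distances over arbitrary integer codes are meaningless, so one-hot encoding (or a purpose-built metric) is the usual fix.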
Optionally: Explain how **dimensionality reduction (e.g., PCA)** could help KNN and when it might hurt.
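A sketch of the PCA idea, assuming NumPy (the `pca_transform` helper is illustrative): projecting onto the top singular directions of the centered data keeps high-variance structure and drops many low-variance noise dimensions, which can make distances more meaningful for KNN. The flip side is that PCA is unsupervised, so if the class-discriminative signal lives in a low-variance direction, PCA can discard it and hurt.

```python
import numpy as np

def pca_transform(X, n_components):
    """Project centered X onto its top principal components via SVD."""
    Xc = X - X.mean(axis=0)                       # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T               # rows of Vt = principal axes

rng = np.random.default_rng(0)
signal = rng.normal(size=(100, 1))                # one informative direction
noise = 0.01 * rng.normal(size=(100, 20))         # many near-constant noise dims
X = np.hstack([signal, noise])
Z = pca_transform(X, n_components=2)              # 21 dims -> 2 dims
```

Here the first component carries nearly all the variance, so neighbor distances in `Z` are governed by the informative direction rather than 20 noise coordinates.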
Quick Answer: This question tests understanding of the K-Nearest Neighbors algorithm: how predictions are formed for classification and regression, how to choose the key hyperparameters (K, distance metric, neighbor weighting), why preprocessing such as feature scaling matters, where KNN fails (high-dimensional, imbalanced, or very large datasets), and how to select K and evaluate the model without overfitting.
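One standard evaluation strategy is to pick K by cross-validated accuracy over a grid of candidates; a NumPy-only sketch (the `select_k` helper and its defaults are illustrative, and labels are assumed to be non-negative ints so `np.bincount` applies):

```python
import numpy as np

def select_k(X, y, candidate_ks, n_folds=5, seed=0):
    """Pick K by k-fold cross-validated accuracy (uniform-weight, Euclidean KNN)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    folds = np.array_split(idx, n_folds)
    best_k, best_acc = None, -1.0
    for k in candidate_ks:
        accs = []
        for f in range(n_folds):
            test_idx = folds[f]
            train_idx = np.concatenate([folds[j] for j in range(n_folds) if j != f])
            correct = 0
            for t in test_idx:
                d = np.linalg.norm(X[train_idx] - X[t], axis=1)
                nn = train_idx[np.argsort(d)[:k]]
                pred = np.bincount(y[nn]).argmax()   # majority vote of neighbors
                correct += pred == y[t]
            accs.append(correct / len(test_idx))
        if np.mean(accs) > best_acc:                 # keep the best average fold accuracy
            best_k, best_acc = k, float(np.mean(accs))
    return best_k, best_acc

# Toy data: two well-separated blobs, so any small K should do well.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.1, size=(10, 2)), rng.normal(3, 0.1, size=(10, 2))])
y = np.array([0] * 10 + [1] * 10)
best_k, best_acc = select_k(X, y, candidate_ks=[1, 3, 5])
```

Averaging over folds rather than fitting K to a single split is the overfitting guard the question asks for: a very small K chases noise in one split, and cross-validation exposes that as unstable fold accuracy. Odd K values also avoid tied votes in binary classification.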