Explain core ML concepts and metrics
Company: Amazon
Role: Data Scientist
Category: Machine Learning
Difficulty: easy
Interview Round: Technical Screen
You are interviewing for a **Data Scientist** role. Answer the following ML fundamentals questions clearly and concisely.
### Concepts
1. Explain the **bias–variance tradeoff**. How does it relate to **overfitting vs. underfitting**?
2. What are common forms of **regularization** (e.g., L1/L2, early stopping)? What problem does regularization solve?
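A minimal sketch of what L2 regularization does in practice: fitting a one-weight linear model by gradient descent, with and without a penalty term. All numbers and the `fit_w` helper are illustrative, not from any particular library.

```python
# Illustrative sketch: L2 regularization shrinks weights toward zero.
# Fit y = w*x by gradient descent on mean squared error, optionally
# adding an L2 penalty lam * w^2 to the loss.

def fit_w(xs, ys, lam, lr=0.01, steps=2000):
    """Gradient descent on mean((w*x - y)^2) + lam * w^2."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        grad += 2 * lam * w  # the L2 penalty pulls w toward 0
        w -= lr * grad
    return w

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]        # true slope = 2

w_plain = fit_w(xs, ys, lam=0.0)  # converges near 2.0
w_ridge = fit_w(xs, ys, lam=5.0)  # shrunk below the unpenalized fit
```

The penalized weight lands strictly below the unpenalized one: the penalty trades a little bias for lower variance, which is the connection back to question 1.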
### Imbalanced classification
3. For **imbalanced datasets**, which evaluation metrics do you prefer and why? Compare **accuracy, precision, recall, F1, PR-AUC, ROC-AUC**.
4. Define **precision** and **recall** and provide their formulas. When would you optimize for one over the other?
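The formulas in question 4 can be checked on a tiny hand-made example. The labels below are illustrative (an imbalanced set with 4 positives and 6 negatives):

```python
# Precision = TP / (TP + FP); Recall = TP / (TP + FN); F1 is their harmonic mean.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # 2
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # 1
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # 2

precision = tp / (tp + fp)                            # 2/3
recall = tp / (tp + fn)                               # 1/2
f1 = 2 * precision * recall / (precision + recall)    # 4/7
```

Note that accuracy here is 7/10 even though half the positives are missed, which is why accuracy alone is a poor metric on imbalanced data.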
### Logistic regression / probabilities
5. In **logistic regression**, what is the model’s **raw output** before converting to a probability? How is it mapped to a probability?
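For question 5, the raw output is the log-odds (logit) z = w·x + b, mapped to a probability by the sigmoid function. A one-line sketch:

```python
import math

def sigmoid(z):
    """Map a raw logit z = w·x + b to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# A logit of 0 corresponds to 50/50 odds; large positive logits approach 1.
p_mid = sigmoid(0.0)
p_high = sigmoid(3.0)
```

The sigmoid is symmetric around zero, so sigmoid(-z) = 1 - sigmoid(z), which matches the interpretation of z as log(p / (1 - p)).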
### ROC-AUC interpretation
6. What does it mean if **ROC-AUC = 0.8**? Provide an intuitive interpretation and at least one caveat.
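The intuitive interpretation behind question 6: ROC-AUC equals the probability that a randomly chosen positive example is scored above a randomly chosen negative one (ties counted as half). The scores below are made up to produce AUC = 0.8 exactly:

```python
def auc_by_pairs(scores_pos, scores_neg):
    """ROC-AUC as the probability that a random positive outranks
    a random negative (ties count as 0.5)."""
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

pos = [0.9, 0.8, 0.5, 0.3]            # model scores for positives
neg = [0.6, 0.4, 0.35, 0.2, 0.1]      # model scores for negatives
auc = auc_by_pairs(pos, neg)          # 16 winning pairs out of 20 = 0.8
```

One caveat worth raising: AUC is rank-based, so it says nothing about calibration, and on heavily imbalanced data it can look optimistic compared to PR-AUC.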
### Models
7. What are **ensemble models** and why do they often outperform a single model?
8. For **tree-based models** (decision trees / random forests / gradient boosting), name key **hyperparameters** and describe how they affect bias/variance.
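One way to build intuition for question 7 is bagging's variance-reduction effect: averaging many independent high-variance predictors yields a lower-variance estimate. The numbers below are a synthetic illustration, not a real model:

```python
import random

random.seed(0)

def noisy_model():
    """Stand-in for one high-variance base learner predicting a
    true value of 10 (Gaussian noise, illustrative only)."""
    return 10 + random.gauss(0, 2)

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# One model's predictions vs. an "ensemble" averaging 50 models.
single = [noisy_model() for _ in range(1000)]
ensemble = [sum(noisy_model() for _ in range(50)) / 50 for _ in range(1000)]
```

With independent errors, the variance of the 50-model average is roughly 1/50 of a single model's, which is why random forests average deep (high-variance, low-bias) trees.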
### Output activations
9. Compare **sigmoid vs. softmax**: when do you use each, and how do their outputs differ?
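The contrast in question 9 can be shown directly: sigmoid gives an independent probability per output (binary or multi-label), while softmax normalizes a vector of logits into mutually exclusive class probabilities that sum to 1. The logits below are arbitrary examples:

```python
import math

def sigmoid(z):
    # Independent probability for a single output.
    return 1.0 / (1.0 + math.exp(-z))

def softmax(zs):
    # Probabilities over mutually exclusive classes; they sum to 1.
    m = max(zs)                          # subtract max for numerical stability
    exps = [math.exp(z - m) for z in zs]
    s = sum(exps)
    return [e / s for e in exps]

logits = [2.0, 1.0, 0.1]
probs = softmax(logits)                  # sums to 1, largest logit wins
per_class_sigmoids = [sigmoid(z) for z in logits]  # need not sum to 1
```

Applying sigmoid elementwise to the same logits yields values summing to more than 1, which is exactly why softmax is the right choice for single-label multi-class outputs.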
Quick Answer: This question evaluates understanding of core machine learning concepts: the bias–variance tradeoff, regularization, evaluation metrics for imbalanced classification (accuracy, precision, recall, F1, PR-AUC, ROC-AUC), logistic regression probabilities, ensemble methods, tree-model hyperparameters, and output activations (sigmoid vs. softmax). It is commonly asked in technical screens for Data Scientist and Machine Learning roles to assess reasoning about model behavior, metric selection, trade-offs, and interpretability, testing both conceptual understanding and practical skill in model evaluation and tuning.