Explain key ML metrics and techniques
Company: Meta
Role: Software Engineer
Category: Machine Learning
Difficulty: medium
Interview Round: Technical Screen
You are asked a set of short conceptual machine learning questions.
1. **Confusion matrix and metrics**
For a binary classification problem:
- Define the entries of the confusion matrix: true positive (TP), false positive (FP), true negative (TN), and false negative (FN).
- Using TP, FP, TN, FN, write formulas for accuracy, precision, recall, and (optionally) F1-score.
- Briefly explain in words what precision and recall each measure.
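
As a point of reference for question 1, the standard definitions can be sketched in a few lines of Python. The function name `classification_metrics`, the zero-division convention (return 0.0), and the example counts are assumptions made for illustration; they are not part of the original question.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    Convention assumed here: a metric with a zero denominator is reported as 0.0.
    """
    total = tp + fp + tn + fn
    # Accuracy: fraction of all predictions that were correct.
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: of everything predicted positive, how much was truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of everything truly positive, how much was found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts chosen only to make the arithmetic easy to follow.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
print(metrics)
```

With these counts, accuracy is 85/100, precision is 40/50, recall is 40/45, and F1 works out to 80/95, which matches the equivalent form 2·TP / (2·TP + FP + FN).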
2. **Ensemble learning**
- What is ensemble learning?
- Why can combining multiple base models into an ensemble improve performance?
- Briefly describe common ways to combine model outputs.
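
Two of the most common ways to combine outputs, majority voting over predicted labels and (optionally weighted) averaging of predicted probabilities, might look like the sketch below. The helper names `majority_vote` and `average_probabilities` and the toy predictions are hypothetical, and ties in the vote are broken in favor of the label seen first, which is a simplifying assumption.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine hard labels: predictions[m][i] is model m's label for sample i."""
    # zip(*predictions) regroups the labels per sample instead of per model.
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def average_probabilities(probabilities, weights=None):
    """Combine soft outputs: probabilities[m][i] is model m's P(positive) for sample i."""
    if weights is None:
        weights = [1.0] * len(probabilities)  # plain (unweighted) averaging
    total = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, per_sample)) / total
        for per_sample in zip(*probabilities)
    ]


# Toy outputs from three hypothetical base models on four samples.
labels = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
]
voted = majority_vote(labels)

# Toy positive-class probabilities from the same three models on two samples.
probs = [
    [0.9, 0.2],
    [0.6, 0.4],
    [0.3, 0.6],
]
averaged = average_probabilities(probs)
weighted = average_probabilities(probs, weights=[0.5, 0.3, 0.2])
print(voted, averaged, weighted)
```

Stacking, where a second-level model learns how to combine the base models' outputs, is a third option that this sketch does not cover.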
3. **Bagging vs. boosting**
Compare bagging and boosting along these dimensions:
- How each method constructs training sets and trains base learners.
- Whether each method primarily reduces bias, variance, or both.
- The main advantages and disadvantages of each.
- Name at least one common algorithm that uses bagging and one that uses boosting.
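
The difference in how the two methods build training data can be illustrated with a small sketch: bagging draws an independent bootstrap sample for each learner, while boosting keeps the same data and reweights it after each round. The reweighting below follows the classic AdaBoost update for labels in {-1, +1}; the function names, the clamping constant `eps`, and the toy labels are assumptions for illustration only.

```python
import math
import random


def bootstrap_sample(data, seed=None):
    """Bagging: sample len(data) items with replacement (some repeat, some are left out)."""
    rng = random.Random(seed)
    return rng.choices(data, k=len(data))


def adaboost_reweight(weights, y_true, y_pred, eps=1e-12):
    """Boosting (AdaBoost-style): raise the weight of misclassified samples.

    Assumes `weights` sums to 1 and labels are -1 or +1.
    Returns the learner's vote weight alpha and the normalized new sample weights.
    """
    # Weighted error of the current base learner.
    err = sum(w for w, y, p in zip(weights, y_true, y_pred) if y != p)
    err = min(max(err, eps), 1.0 - eps)  # avoid log(0) and division by zero
    alpha = 0.5 * math.log((1.0 - err) / err)
    # y * p is +1 when correct (weight shrinks) and -1 when wrong (weight grows).
    updated = [w * math.exp(-alpha * y * p) for w, y, p in zip(weights, y_true, y_pred)]
    total = sum(updated)
    return alpha, [w / total for w in updated]


# Bagging: each base learner would be trained on its own resampled copy.
sample = bootstrap_sample(list(range(10)), seed=0)

# Boosting: five samples with uniform weights, the last one misclassified.
y_true = [1, 1, -1, -1, 1]
y_pred = [1, 1, -1, -1, -1]
alpha, new_weights = adaboost_reweight([0.2] * 5, y_true, y_pred)
print(sample, alpha, new_weights)
```

In the boosting example the single misclassified sample ends up with weight 0.5 and the four correct ones with 0.125 each, so the next learner is pushed to focus on the earlier mistake.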
4. **L1 vs. L2 regularization**
Consider a supervised learning model with loss function `L(w)` over parameters `w` and a regularization term with strength `λ` (lambda):
- Write the objective for L1-regularized training and L2-regularized training.
- Explain how L1 and L2 regularization each affect the learned parameters (e.g., sparsity vs. shrinkage).
- Discuss when you might prefer L1 over L2, and vice versa.
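
One way to see the sparsity versus shrinkage contrast is a toy case with a closed-form solution. The sketch below assumes a separable quadratic loss `L(w) = 0.5 * sum((w_i - a_i)^2)`, an L1 objective `L(w) + λ * sum(|w_i|)`, and an L2 objective `L(w) + λ * sum(w_i^2)`. The loss, the vector `a`, and the function names are assumptions chosen for illustration; some texts scale the L2 penalty by 1/2, which changes the constant in the shrinkage factor.

```python
def l1_solution(a, lam):
    """Minimizer of 0.5*(w - a_i)^2 + lam*|w| per coordinate (soft-thresholding)."""
    result = []
    for a_i in a:
        if a_i > lam:
            result.append(a_i - lam)
        elif a_i < -lam:
            result.append(a_i + lam)
        else:
            result.append(0.0)  # small coefficients are set exactly to zero
    return result


def l2_solution(a, lam):
    """Minimizer of 0.5*(w - a_i)^2 + lam*w^2 per coordinate (uniform shrinkage)."""
    # Setting the derivative (w - a_i) + 2*lam*w to zero gives w = a_i / (1 + 2*lam).
    return [a_i / (1.0 + 2.0 * lam) for a_i in a]


# Hypothetical unregularized solution and regularization strength.
a = [3.0, 0.5, -0.2, -2.0]
lam = 1.0
w_l1 = l1_solution(a, lam)
w_l2 = l2_solution(a, lam)
print(w_l1)
print(w_l2)
```

Here L1 zeroes out the two small coordinates and shifts the large ones by `λ`, while L2 scales every coordinate by the same factor and leaves none exactly at zero.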
5. **Two-layer neural network forward pass**
Consider a simple two-layer feedforward neural network: input → hidden layer → output layer.
- Let the input vector be `x`. The hidden layer uses weight matrix `W1` and bias vector `b1` with activation function `g` applied elementwise.
- The output layer uses weight matrix `W2` and bias vector `b2` with activation function `f` (e.g., identity, sigmoid, or softmax).
(a) Write the mathematical expressions for the hidden activations and final output in terms of `x`, `W1`, `b1`, `W2`, `b2`, `g`, and `f`.
(b) Briefly describe how you would carry out a concrete numerical computation of the network output given specific numeric values for these quantities.
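
For part (b), a concrete computation could proceed as in the sketch below, which evaluates `h = g(W1 x + b1)` and then `y = f(W2 h + b2)`. The numeric values, the choice of ReLU for `g` and sigmoid for `f`, and the helper names are all assumptions picked so the arithmetic can be checked by hand.

```python
import math


def affine(W, v, b):
    """Compute W v + b, with W given as a list of rows."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) + b_i for row, b_i in zip(W, b)]


def relu(z):
    return [max(0.0, z_i) for z_i in z]


def sigmoid(z):
    return [1.0 / (1.0 + math.exp(-z_i)) for z_i in z]


def forward(x, W1, b1, W2, b2, g, f):
    """Two-layer forward pass: returns hidden activations h and output y."""
    h = g(affine(W1, x, b1))  # hidden layer: h = g(W1 x + b1)
    y = f(affine(W2, h, b2))  # output layer: y = f(W2 h + b2)
    return h, y


# Hypothetical values: 2 inputs, 2 hidden units, 1 output.
x = [1.0, 2.0]
W1 = [[1.0, -1.0], [0.5, 0.5]]
b1 = [0.0, -1.0]
W2 = [[2.0, 4.0]]
b2 = [-2.0]

h, y = forward(x, W1, b1, W2, b2, g=relu, f=sigmoid)
print(h, y)
```

Step by step: `W1 x + b1` is `[-1.0, 0.5]`, ReLU turns that into `[0.0, 0.5]`, `W2 h + b2` is `[0.0]`, and the sigmoid of 0 is 0.5. The shapes must line up: `W1` has one row per hidden unit and one column per input, and `W2` has one row per output and one column per hidden unit.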
Quick Answer: This question tests core machine learning concepts: classification evaluation metrics, ensemble methods (bagging vs. boosting), L1 vs. L2 regularization, and the forward pass of a two-layer neural network. It covers both model-evaluation and model-building skills.