Rapid ML Fundamentals Check — Recommender Systems Context
You are interviewing for a data-science role on an e‑commerce recommendation platform. The hiring manager wants quick, accurate explanations that cover definitions, math intuition, computational complexity, evaluation metrics, and practical mitigation strategies. Keep answers concise but precise, referencing equations or pseudocode where helpful.
Questions
- Compare decision trees and random forests.
- Explain L1 vs L2 regularization and how each combats overfitting.
- With one million samples, would you choose a deep neural network (DNN) or KNN? Why?
- Is the ROC curve defined only for binary classification? How would you plot one from a list of scores?
- What causes training-loss oscillations and how would you address them?
- Define data drift and describe how you would detect it in production.
- Differentiate convex and non-convex objective functions.
- Where do vanishing gradients typically occur in a neural network and how can you mitigate them?
- How does increasing decision-tree depth impact inference time (linear, logarithmic, exponential)?
- Cross-validation vs train_test_split: which is more robust and why?
- Summarize the key ideas behind CNNs.
- Contrast transformer encoders and decoders.
- Explain the k-means algorithm and its assumptions.
- What is the numeric range of cosine similarity?
- Is logistic regression a generative or discriminative model?
- Interpret a confusion matrix and discuss when to use ensemble methods.
- Compare Naïve Bayes with KNN.
- List common regularization techniques beyond L1/L2.
- Gradient Boosting Machines vs Random Forests: strengths and weaknesses.
- What does model calibration mean and how is it evaluated?
- Describe the learning-to-rank problem setting.
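For the L1-vs-L2 question, a minimal numerical sketch (function names are illustrative, not a library API) of how the two penalties act on a single small weight during one gradient step:

```python
# Illustrative sketch: effect of L1 vs L2 penalties on one weight w
# during a single gradient-descent step.

def l2_step(w, grad, lr=0.1, lam=0.2):
    # Ridge adds 2*lam*w to the gradient: w shrinks toward 0 but rarely hits it.
    return w - lr * (grad + 2 * lam * w)

def l1_step(w, grad, lr=0.1, lam=0.5):
    # Lasso's proximal (soft-thresholding) update can set w exactly to 0,
    # which is why L1 produces sparse models.
    w = w - lr * grad
    if abs(w) <= lr * lam:
        return 0.0
    return w - lr * lam if w > 0 else w + lr * lam

small_weight = 0.04
after_l2 = l2_step(small_weight, grad=0.0)   # shrunk, still nonzero
after_l1 = l1_step(small_weight, grad=0.0)   # below threshold: exactly 0.0
```

The sparsity of L1 is useful for feature selection; L2 keeps all weights small, which tends to stabilize correlated features.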
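For the ROC question, a minimal sketch of building the curve directly from a list of scores: sort by score descending and sweep the threshold, one prediction at a time (a production version should also handle tied scores jointly):

```python
def roc_points(scores, labels):
    # Sort by score descending; lowering the threshold past each point adds
    # one predicted positive, moving the curve up (true positive) or
    # right (false positive).
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    pos = sum(labels)
    neg = len(labels) - pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, y in pairs:
        if y == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points  # list of (FPR, TPR), ending at (1.0, 1.0)

curve = roc_points([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0])
```

The construction assumes binary labels; for multiclass problems, ROC curves are typically drawn one-vs-rest per class.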
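For the data-drift question, one common detector is the Population Stability Index on a single feature; this sketch (bin count and smoothing constant are my choices) compares a reference sample against live traffic:

```python
import math

def psi(reference, live, n_bins=10, eps=1e-4):
    # Population Stability Index of one feature; a common rule of thumb
    # treats PSI > 0.2 as a drift signal worth investigating.
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0

    def fractions(xs):
        counts = [0] * n_bins
        for x in xs:
            b = int((x - lo) / width)
            counts[max(0, min(b, n_bins - 1))] += 1  # clamp out-of-range values
        return [c / len(xs) + eps for c in counts]   # eps avoids log(0)

    ref, liv = fractions(reference), fractions(live)
    return sum((l - r) * math.log(l / r) for r, l in zip(ref, liv))
```

In production this would run per feature on a schedule, alongside monitoring of the label or score distribution.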
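For the vanishing-gradient question, a tiny numeric demonstration of why early layers of deep sigmoid networks stop learning (the weight factor is omitted for simplicity; mitigations include ReLU activations, residual connections, normalization, and careful initialization):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gradient_scale(depth, x=0.0):
    # Backprop multiplies one sigmoid derivative per layer; sigmoid'(x) <= 0.25,
    # so the factor reaching early layers shrinks geometrically with depth.
    s = sigmoid(x)
    return (s * (1 - s)) ** depth

shallow = gradient_scale(1)   # 0.25 at x = 0
deep = gradient_scale(20)     # vanishingly small
```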
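For the tree-depth question, inference walks a single root-to-leaf path, so cost grows linearly in depth (logarithmically in leaf count for a balanced tree). A toy sketch that counts comparisons (thresholds are hypothetical):

```python
class Node:
    def __init__(self, feature=None, threshold=None, left=None, right=None, value=None):
        self.feature, self.threshold = feature, threshold
        self.left, self.right, self.value = left, right, value

def predict(node, x, comparisons=0):
    # One comparison per level: the returned count equals the path depth.
    if node.value is not None:
        return node.value, comparisons
    child = node.left if x[node.feature] <= node.threshold else node.right
    return predict(child, x, comparisons + 1)

# Depth-2 toy tree:
tree = Node(0, 0.5,
            left=Node(value=0),
            right=Node(1, 0.5, left=Node(value=0), right=Node(value=1)))
label, cost = predict(tree, [0.9, 0.9])
```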
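For the cross-validation question, the robustness argument is that k-fold averages k held-out scores instead of trusting one random split. A minimal index-generator sketch (pure Python, unshuffled for clarity):

```python
def kfold_indices(n, k):
    # Every sample lands in exactly one validation fold, so the final estimate
    # averages k held-out scores rather than depending on a single split.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, val
        start += size

folds = list(kfold_indices(10, 3))
```

In practice you would shuffle (and stratify for classification) before folding.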
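For the k-means question, a compact sketch of Lloyd's algorithm (random initialization; a fixed iteration cap stands in for a convergence test), which assumes roughly spherical, similarly sized clusters under squared Euclidean distance:

```python
import random

def kmeans(points, k, iters=20, seed=0):
    # Plain Lloyd's algorithm: assign each point to its nearest centroid,
    # then move each centroid to the mean of its assigned points.
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            clusters[j].append(p)
        centroids = [
            tuple(sum(dim) / len(cl) for dim in zip(*cl)) if cl else centroids[j]
            for j, cl in enumerate(clusters)
        ]
    return centroids, clusters

centroids, clusters = kmeans([(0, 0), (0, 1), (10, 10), (10, 11)], k=2)
```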
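For the cosine-similarity question: the value is the dot product over the norms, bounded to [-1, 1] by Cauchy-Schwarz (and to [0, 1] when all features are non-negative, as with raw counts):

```python
import math

def cosine_similarity(u, v):
    # Dot product over norms; Cauchy-Schwarz bounds the result to [-1, 1].
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
```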
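For the confusion-matrix question, a minimal binary sketch with the precision/recall readings derived from it (layout convention, rows = actual, is my choice; libraries differ):

```python
def confusion_matrix(y_true, y_pred):
    # Rows index the actual class, columns the predicted class (binary case).
    m = [[0, 0], [0, 0]]
    for t, p in zip(y_true, y_pred):
        m[t][p] += 1
    return m

def precision_recall(m):
    tp, fp, fn = m[1][1], m[0][1], m[1][0]
    return tp / (tp + fp), tp / (tp + fn)

m = confusion_matrix([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
precision, recall = precision_recall(m)
```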
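For the calibration question, two standard evaluations sketched in plain Python: the Brier score and the per-bin statistics behind a reliability diagram (bin count is my choice):

```python
def brier_score(probs, labels):
    # Mean squared gap between predicted probability and outcome; lower is better.
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)

def reliability_bins(probs, labels, n_bins=5):
    # For each occupied bin: (mean predicted probability, observed positive rate).
    # A calibrated model has the two roughly equal in every bin.
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        bins[min(int(p * n_bins), n_bins - 1)].append((p, y))
    return [
        (sum(p for p, _ in b) / len(b), sum(y for _, y in b) / len(b))
        for b in bins if b
    ]
```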
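For the learning-to-rank question, the standard evaluation metric NDCG makes the problem setting concrete: the model scores items per query, and quality depends on the order of graded relevances, not on pointwise errors. A minimal sketch:

```python
import math

def dcg(rels):
    # Gain discounted by log2 of rank (ranks start at 1, hence log2(i + 2)).
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(rels))

def ndcg(rels):
    # Normalize by the DCG of the ideal (descending-relevance) ordering.
    ideal = dcg(sorted(rels, reverse=True))
    return dcg(rels) / ideal if ideal else 0.0
```

Pointwise, pairwise, and listwise learning-to-rank methods differ in how their training losses approximate such order-sensitive metrics.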