Machine-Learning Deep-Dive (Data Scientist Technical Screen)
Scenario
You are discussing core ML concepts and design choices expected in a technical interview setting. Provide concise, principled answers to the following.
Questions
- Principal Component Analysis (PCA)
  - Explain how PCA achieves dimensionality reduction.
  - Explain why L2-normalization beforehand can matter (clarify what kind of normalization and when).
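As a starting point for the two PCA questions, here is a minimal NumPy sketch, not a reference answer. It assumes that "normalization" is read as per-feature standardization (z-scoring), which is different from scaling each sample to unit L2 norm; the question asks the candidate to make that distinction explicit. The function name `pca_project` and the toy data are hypothetical.

```python
import numpy as np

def pca_project(X, n_components, standardize=True):
    """Project rows of X onto the top principal directions (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    # Centering is required: PCA diagonalizes the covariance of mean-zero data.
    Xc = X - X.mean(axis=0)
    if standardize:
        # Assumption: "normalization" means per-feature z-scoring, so that
        # features measured in large units do not dominate the variance.
        std = Xc.std(axis=0, ddof=1)
        std[std == 0] = 1.0
        Xc = Xc / std
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    explained = S**2 / (X.shape[0] - 1)
    ratio = explained / explained.sum()
    # Dimensionality reduction: keep only the first n_components coordinates.
    return Xc @ components.T, components, ratio[:n_components]

# Toy data: two correlated features on very different scales, plus noise.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
X = np.column_stack([1000.0 * a,
                     a + 0.1 * rng.normal(size=200),
                     rng.normal(size=200)])

Z_raw, comps_raw, ratio_raw = pca_project(X, 1, standardize=False)
Z_std, comps_std, ratio_std = pca_project(X, 1, standardize=True)
```

Without scaling, the first component is almost entirely the large-unit feature. After standardization it reflects the correlation structure instead, which is usually what the question is probing.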
- Logistic Regression
  - Derive the gradient of the binary logistic regression loss via back-propagation/chain rule.
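One way the requested derivation might be laid out is sketched below. The notation is an assumption, since the question does not fix any: a single example (x, y) with y in {0, 1}, weights w, bias b, logit z, and predicted probability written as \hat{y}.

```latex
\begin{align*}
z &= w^\top x + b, \qquad \hat{y} = \sigma(z) = \frac{1}{1 + e^{-z}}, \\
\mathcal{L} &= -\bigl[\, y \log \hat{y} + (1 - y) \log (1 - \hat{y}) \,\bigr], \\
\frac{\partial \mathcal{L}}{\partial \hat{y}}
  &= -\frac{y}{\hat{y}} + \frac{1 - y}{1 - \hat{y}}
   = \frac{\hat{y} - y}{\hat{y}\,(1 - \hat{y})}, \\
\frac{\partial \hat{y}}{\partial z} &= \hat{y}\,(1 - \hat{y}), \\
\frac{\partial \mathcal{L}}{\partial z}
  &= \frac{\partial \mathcal{L}}{\partial \hat{y}} \cdot \frac{\partial \hat{y}}{\partial z}
   = \hat{y} - y, \\
\frac{\partial \mathcal{L}}{\partial w} &= (\hat{y} - y)\, x, \qquad
\frac{\partial \mathcal{L}}{\partial b} = \hat{y} - y .
\end{align*}
```

Averaged over n examples with design matrix X, the same steps give the weight gradient (1/n) X^T (\hat{y} - y), where \hat{y} and y are now vectors of predictions and labels.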
- Classification Thresholding
  - When would you move the classification threshold to improve FPR or TPR? Describe the trade-offs and how you’d select a new threshold.
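The threshold question can be made concrete with a small sketch. It assumes one possible selection rule: pick the lowest threshold, and therefore the highest TPR, whose false-positive rate stays within a budget, using held-out validation scores rather than test data. The helper names and the toy labels and scores are hypothetical.

```python
import numpy as np

def rates_at_threshold(y_true, scores, threshold):
    """Return (TPR, FPR) when predicting positive for scores >= threshold."""
    y_true = np.asarray(y_true).astype(bool)
    pred = np.asarray(scores) >= threshold
    tp = np.sum(pred & y_true)
    fp = np.sum(pred & ~y_true)
    tpr = tp / max(y_true.sum(), 1)
    fpr = fp / max((~y_true).sum(), 1)
    return float(tpr), float(fpr)

def pick_threshold(y_true, scores, max_fpr):
    """Lowest threshold whose FPR does not exceed max_fpr (sketch)."""
    best = float("inf")  # predicting nothing positive always gives FPR = 0
    # Walk thresholds from high to low: FPR can only grow as we go down.
    for t in np.unique(scores)[::-1]:
        _, fpr = rates_at_threshold(y_true, scores, t)
        if fpr > max_fpr:
            break
        best = float(t)
    return best

# Toy validation set: labels and model scores (made up for illustration).
y_val = np.array([0, 0, 0, 0, 1, 0, 1, 1, 0, 1])
s_val = np.array([0.05, 0.1, 0.2, 0.3, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9])

default_rates = rates_at_threshold(y_val, s_val, 0.5)
strict_t = pick_threshold(y_val, s_val, max_fpr=0.0)    # favors low FPR
lenient_t = pick_threshold(y_val, s_val, max_fpr=0.35)  # favors high TPR
```

On this toy data, tightening the FPR budget raises the threshold and costs recall, while loosening it lowers the threshold and recovers every positive at the price of more false alarms. Other rules, such as minimizing an explicit cost of false positives versus false negatives, would follow the same pattern.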
- Baselines and Model Choice
  - What baseline models would you compare against, and why might you ultimately choose logistic regression?
- Knowledge-Informed ML
  - Define knowledge-informed (or knowledge-guided) machine learning and give a concrete example.