Technical ML Phone Screen: Preprocessing, Models, Backprop, and Thresholding
Context
You built a binary classifier and used a preprocessing pipeline that included PCA and L2 normalization before training. You evaluated models with ROC-based metrics and considered logistic regression alongside alternative baselines. The interviewer wants you to connect the preprocessing choices to variance reduction, explain gradients and backpropagation in neural networks, and discuss threshold selection and its evaluation trade-offs.
Questions
Preprocessing
- Why did you apply PCA and L2 normalization before training? In what order? How does this relate to variance reduction and multicollinearity?
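A minimal sketch of how such a pipeline could be wired up, assuming scikit-learn; the component order (standardize, then PCA, then row-wise L2 normalization), the 95% explained-variance cutoff, and the synthetic data are illustrative assumptions, not the project's actual settings:

```python
# Hypothetical preprocessing + classifier pipeline (scikit-learn assumed).
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import Normalizer, StandardScaler

# Synthetic stand-in for the real dataset.
X, y = make_classification(n_samples=2000, n_features=40, n_informative=10, random_state=0)

pipe = Pipeline([
    ("scale", StandardScaler()),      # zero mean / unit variance so PCA is not dominated by feature scale
    ("pca", PCA(n_components=0.95)),  # keep components explaining ~95% of the variance (assumed setting)
    ("l2", Normalizer(norm="l2")),    # project each sample onto the unit sphere
    ("clf", LogisticRegression(max_iter=1000)),
])

# Cross-validated ROC AUC; the transforms are refit inside each fold.
print(cross_val_score(pipe, X, y, cv=5, scoring="roc_auc").mean())
```

Keeping the transforms inside the Pipeline means they are refit within each cross-validation fold, which avoids leaking validation statistics into the preprocessing; PCA also yields uncorrelated components, which is the usual link to multicollinearity.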
Modeling choices
- Walk through logistic regression as your baseline and compare it to reasonable alternatives you considered (pros/cons, when each shines).
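The alternatives actually considered are not stated here, so the comparison below is a hedged sketch assuming two common tabular baselines (random forest and histogram gradient boosting) evaluated against logistic regression by cross-validated ROC AUC:

```python
# Hypothetical baseline comparison; the specific alternatives are assumptions.
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=2000, n_features=40, n_informative=10, random_state=0)

models = {
    # Linear decision boundary, interpretable coefficients, strong with limited data.
    "logreg": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    # Captures nonlinear interactions with little tuning; less interpretable.
    "random_forest": RandomForestClassifier(n_estimators=300, random_state=0),
    # Often the strongest tabular baseline; more hyperparameters to manage.
    "hist_gbdt": HistGradientBoostingClassifier(random_state=0),
}

for name, model in models.items():
    auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    print(f"{name:>14}: ROC AUC = {auc:.3f}")
```

A comparison like this supports the pros/cons discussion: logistic regression tends to shine when the signal is roughly linear and interpretability matters, while the tree ensembles tend to win when there are strong feature interactions.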
Backpropagation
- Describe backpropagation in modern neural networks at a high level, including how loss gradients are computed and propagated.
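A worked NumPy sketch of the chain-rule bookkeeping behind backpropagation, using a deliberately tiny two-layer network with a ReLU hidden layer, sigmoid output, and binary cross-entropy loss; the architecture, data, and hyperparameters are illustrative assumptions, and a real project would rely on an autodiff framework rather than hand-written gradients:

```python
# Minimal backpropagation sketch for a 2-layer binary classifier (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 128, 10, 16                      # batch size, input dim, hidden width
X = rng.normal(size=(n, d))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float).reshape(-1, 1)  # toy labels

W1, b1 = rng.normal(scale=0.1, size=(d, k)), np.zeros(k)
W2, b2 = rng.normal(scale=0.1, size=(k, 1)), np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for step in range(200):
    # ---- forward pass: cache the intermediates the backward pass needs ----
    a1 = X @ W1 + b1                       # hidden pre-activation, shape (n, k)
    h = np.maximum(a1, 0.0)                # ReLU hidden layer
    z = h @ W2 + b2                        # logits, shape (n, 1)
    p = sigmoid(z)                         # predicted P(y = 1)
    loss = -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))

    # ---- backward pass: chain rule, propagated from the loss toward the input ----
    dz = (p - y) / n                       # dL/dz for sigmoid + mean BCE
    dW2 = h.T @ dz                         # dL/dW2
    db2 = dz.sum(axis=0)                   # dL/db2
    dh = dz @ W2.T                         # gradient flowing back into the hidden layer
    da1 = dh * (a1 > 0)                    # ReLU gate: gradient passes only where a1 > 0
    dW1 = X.T @ da1                        # dL/dW1
    db1 = da1.sum(axis=0)                  # dL/db1

    # ---- gradient-descent update ----
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(f"final training loss: {loss:.3f}")
```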
Knowledge-informed ML and thresholding
- What is knowledge-informed (knowledge-guided) machine learning? Give concrete ways you used domain knowledge. (See the constraint sketch after this list.)
- How did you tune the decision threshold, and under what conditions can (or cannot) both FPR and TPR improve? Describe your process and trade-offs. (See the threshold-selection sketch after this list.)
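On the knowledge-informed ML question: the concrete domain knowledge used in the project is not described here, so the sketch below shows one generic mechanism for injecting prior knowledge, a monotonicity constraint in scikit-learn's HistGradientBoostingClassifier; the constrained feature and the rule it encodes are hypothetical:

```python
# One generic illustration of knowledge-guided ML (assumed, not the project's actual approach):
# encode a domain rule such as "risk is non-decreasing in feature 0" as a monotonic constraint.
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=2000, n_features=5, n_informative=3, random_state=0)

# +1 = predicted probability must be non-decreasing in that feature, 0 = unconstrained.
constraints = [1, 0, 0, 0, 0]
clf = HistGradientBoostingClassifier(monotonic_cst=constraints, random_state=0)

print(cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean())
```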
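On the thresholding question: a hedged sketch of one way to choose an operating point on held-out data, either by maximizing Youden's J or by taking the best TPR under a false-positive budget; the class imbalance, the 0.05 FPR budget, and the logistic-regression scorer are assumptions for illustration:

```python
# Hypothetical threshold-selection sketch: pick an operating point on a validation set.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=4000, n_features=20, weights=[0.8, 0.2], random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.5, stratify=y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
scores = clf.predict_proba(X_val)[:, 1]

fpr, tpr, thresholds = roc_curve(y_val, scores)

# Option A: maximize Youden's J = TPR - FPR (weights the two error types equally).
j_best = np.argmax(tpr - fpr)

# Option B: highest TPR subject to a domain-driven FPR budget (0.05 here, as an example).
budget = np.where(fpr <= 0.05)[0]
b_best = budget[np.argmax(tpr[budget])]

print(f"Youden threshold {thresholds[j_best]:.3f}: TPR={tpr[j_best]:.2f}, FPR={fpr[j_best]:.2f}")
print(f"FPR-budget threshold {thresholds[b_best]:.3f}: TPR={tpr[b_best]:.2f}, FPR={fpr[b_best]:.2f}")
```

Worth noting for the second half of the question: along a single ROC curve, moving the threshold only trades TPR against FPR (both move in the same direction as the cutoff shifts), so improving both at once requires a better-ranking model or features rather than a different cutoff.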