You are interviewing for a Machine Learning Engineer role. Discuss the following machine-learning topics in a structured way:
- Describe one practical implementation of a bag-of-words text feature pipeline. Include tokenization, vocabulary construction, handling rare or unseen words, sparse storage, and weighting choices such as raw counts or TF-IDF.
- Explain out-of-bag (OOB) evaluation in ensemble methods such as bagging or random forests. How are OOB samples formed, and how can they be used for validation?
- Suppose you need to build a binary classification model to predict click-through rate (CTR). Explain the full workflow from problem definition to deployment, including data collection, feature engineering, model selection, training, calibration, and evaluation.
- More generally, if asked to build a classification model from scratch, walk through every major step and mention appropriate techniques or model choices at each stage.
- If the model's online performance drops after deployment, how would you investigate and debug the issue? Cover model, data, serving, and product-level causes.