This question evaluates a candidate's understanding of core machine learning theory and practical modeling techniques: parallelism in gradient-boosted trees (XGBoost), layer normalization in Transformer blocks, multimodal neural network design and fusion strategies, collaborative filtering, the multi-armed bandit problem and its algorithms, and the probabilistic derivation and interpretation of logistic regression. It is commonly asked in technical interviews to assess breadth and depth across scalability, architecture and normalization choices, recommendation and online decision-making methods, and statistical modeling and regularization. It falls within the Machine Learning domain and tests both conceptual understanding and practical application.
You are asked to answer concisely but with depth across the following topics:
Explain how XGBoost achieves parallelism during training. State what can and cannot be parallelized and why.
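As a reference point, here is a minimal sketch of where that parallelism is exposed in the xgboost Python package; the dataset size, thread count, and other parameter values are illustrative assumptions, not part of the question.

```python
# Sketch: boosting rounds run sequentially (each tree fits the residuals of the
# previous ones), while split finding within a round is multi-threaded.
from sklearn.datasets import make_classification
from xgboost import XGBClassifier

X, y = make_classification(n_samples=5_000, n_features=50, random_state=0)

model = XGBClassifier(
    n_estimators=200,    # number of boosting rounds: inherently sequential
    tree_method="hist",  # histogram-based split finding, parallelized over features/bins
    n_jobs=8,            # threads used *within* each round for split enumeration
    max_depth=6,
)
model.fit(X, y)
```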
Explain layer normalization in a Transformer block, including where it is applied (pre-LN vs post-LN), the formula, and why it is used instead of batch normalization.
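For the formula, a minimal NumPy sketch of layer normalization and of the two residual placements; the tensor shape and epsilon are illustrative assumptions.

```python
import numpy as np

def layer_norm(x, gamma, beta, eps=1e-5):
    # Normalize each token's feature vector over the last (model) dimension:
    #   y = gamma * (x - mean) / sqrt(var + eps) + beta
    # Statistics are per-example, independent of batch size, unlike batch norm.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

x = np.random.randn(2, 4, 8)                        # (batch, seq_len, d_model)
y = layer_norm(x, gamma=np.ones(8), beta=np.zeros(8))

# Pre-LN block:  x + Sublayer(layer_norm(x))
# Post-LN block: layer_norm(x + Sublayer(x))
```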
Describe a general architecture for a multimodal neural network (e.g., text + image, or tabular + text). Include common fusion strategies and how to handle missing modalities.
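A minimal late-fusion sketch in PyTorch; the encoder output dimensions, the concatenation fusion, and the learned placeholders for missing modalities are illustrative assumptions.

```python
import torch
import torch.nn as nn

class LateFusionModel(nn.Module):
    def __init__(self, text_dim=768, image_dim=2048, hidden=256, n_classes=2):
        super().__init__()
        self.text_proj = nn.Sequential(nn.Linear(text_dim, hidden), nn.ReLU())
        self.image_proj = nn.Sequential(nn.Linear(image_dim, hidden), nn.ReLU())
        # Learned placeholders substituted when a modality is missing for a sample.
        self.missing_text = nn.Parameter(torch.zeros(hidden))
        self.missing_image = nn.Parameter(torch.zeros(hidden))
        self.head = nn.Linear(2 * hidden, n_classes)   # concatenation fusion

    def forward(self, text_feat=None, image_feat=None):
        batch = (text_feat if text_feat is not None else image_feat).shape[0]
        t = self.text_proj(text_feat) if text_feat is not None else self.missing_text.expand(batch, -1)
        i = self.image_proj(image_feat) if image_feat is not None else self.missing_image.expand(batch, -1)
        return self.head(torch.cat([t, i], dim=-1))

model = LateFusionModel()
logits = model(text_feat=torch.randn(4, 768), image_feat=None)   # image modality missing
```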
Explain how collaborative filtering works, contrasting memory-based and model-based approaches. Provide the core formulas and how predictions are made.
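A toy memory-based (user-user) sketch using cosine similarity and a weighted-average prediction; the ratings matrix and neighbourhood size are illustrative assumptions (a model-based approach such as matrix factorization would instead learn latent user/item factors).

```python
import numpy as np

R = np.array([            # rows = users, cols = items, 0 = unrated
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 0, 5, 4],
], dtype=float)

def cosine_sim(a, b):
    mask = (a > 0) & (b > 0)                       # compare on co-rated items only
    if not mask.any():
        return 0.0
    return a[mask] @ b[mask] / (np.linalg.norm(a[mask]) * np.linalg.norm(b[mask]) + 1e-9)

def predict(user, item, k=2):
    # Weighted neighbour average: r_hat(u,i) = sum_v sim(u,v) * r(v,i) / sum_v |sim(u,v)|
    sims = np.array([cosine_sim(R[user], R[v]) if v != user and R[v, item] > 0 else 0.0
                     for v in range(R.shape[0])])
    top = np.argsort(-sims)[:k]
    denom = np.abs(sims[top]).sum()
    return float(sims[top] @ R[top, item] / denom) if denom > 0 else R[R[:, item] > 0, item].mean()

print(predict(user=1, item=1))   # predicted rating of user 1 for item 1
```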
Formulate the K-armed bandit problem and present at least two solution algorithms (e.g., UCB, Thompson Sampling). Show a small numeric example and discuss regret.
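A small simulation sketch of a 3-armed Bernoulli bandit comparing UCB1 and Thompson Sampling; the arm probabilities (0.2, 0.5, 0.7) and horizon are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
p_true = np.array([0.2, 0.5, 0.7])   # true reward probabilities, unknown to the agent
T = 2000                             # horizon

def ucb1():
    counts = np.zeros(len(p_true)); sums = np.zeros(len(p_true)); reward = 0.0
    for t in range(1, T + 1):
        if t <= len(p_true):
            a = t - 1                                        # pull each arm once first
        else:
            ucb = sums / counts + np.sqrt(2 * np.log(t) / counts)
            a = int(np.argmax(ucb))                          # optimism in the face of uncertainty
        r = rng.random() < p_true[a]
        counts[a] += 1; sums[a] += r; reward += r
    return reward

def thompson():
    alpha = np.ones(len(p_true)); beta = np.ones(len(p_true)); reward = 0.0
    for _ in range(T):
        a = int(np.argmax(rng.beta(alpha, beta)))            # sample from Beta posteriors
        r = rng.random() < p_true[a]
        alpha[a] += r; beta[a] += 1 - r; reward += r
    return reward

for name, total in (("UCB1", ucb1()), ("Thompson", thompson())):
    # Regret = reward of always pulling the best arm minus the reward actually collected.
    print(f"{name}: reward={total:.0f}, regret~{T * p_true.max() - total:.0f}")
```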
Derive logistic regression from a probabilistic viewpoint, provide the log-likelihood and gradient, and interpret coefficients. Mention regularization and decision boundaries.
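A from-scratch sketch of the maximum-likelihood view with an L2 penalty; the synthetic data, learning rate, and regularization strength are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
w_true = np.array([2.0, -1.0])
y = (rng.random(500) < 1 / (1 + np.exp(-(X @ w_true)))).astype(float)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Model: P(y=1 | x) = sigmoid(w^T x).  Negative log-likelihood with L2 penalty:
#   L(w) = -sum_i [ y_i log p_i + (1 - y_i) log(1 - p_i) ] + (lam / 2) ||w||^2
# Gradient: dL/dw = X^T (p - y) + lam * w
w, lr, lam = np.zeros(2), 0.1, 0.01
for _ in range(200):
    p = sigmoid(X @ w)
    grad = X.T @ (p - y) / len(y) + lam * w
    w -= lr * grad

# Each coefficient is the change in log-odds per unit increase of its feature;
# the decision boundary is the hyperplane w^T x = 0 (probability 0.5).
print("estimated coefficients:", w)
```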