LLM Architecture, Positional Embeddings, Fine-Tuning (PEFT), Regularization, and Evaluation
Context
You are interviewing for a Machine Learning Engineer role with a focus on large language models. Provide a concise but technically solid overview with production-minded trade-offs.
Tasks
- Compare Transformer architectures:
  - Encoder–decoder vs. decoder-only. When is each used, and why?
- Detail the decoder-only Transformer stack:
  - Self-attention (causal), feed-forward/MLP blocks, normalization, and residual connections; note common design choices (a minimal decoder-block sketch follows this list).
- Explain positional embeddings:
  - Absolute (sinusoidal vs. learned), relative, and rotary (RoPE). Discuss trade-offs in length generalization, efficiency, and stability (a RoPE sketch follows this list).
- Describe hands-on fine-tuning using PEFT:
  - Methods such as LoRA, adapters, prefix/prompt tuning, IA3, and QLoRA. When to choose which, with practical considerations (a LoRA sketch follows this list).
- Prevent overfitting during training:
  - Regularization (e.g., dropout, weight decay), data augmentation, early stopping, and other tactics (a training-loop sketch follows this list).
- Outline an evaluation strategy:
  - Intrinsic vs. extrinsic metrics. Offline vs. online validation, including how to run safe and informative experiments (a perplexity sketch follows this list).
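
The sketches below are minimal, hedged reference points for the tasks above. They use PyTorch, and all class, function, and argument names (e.g. DecoderBlock, rope_rotate, LoRALinear) are illustrative assumptions rather than any library's API.

First, a single pre-norm decoder-only block: causal self-attention, a position-wise MLP, LayerNorm before each sub-layer, and residual connections around both. The is_causal mask is exactly what distinguishes a decoder block from an encoder block; GELU and a 4x MLP expansion stand in for other common choices such as SwiGLU or RMSNorm. Assumes PyTorch >= 2.0 for scaled_dot_product_attention.

```python
# A minimal sketch of one pre-norm decoder-only Transformer block.
# Names and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DecoderBlock(nn.Module):
    def __init__(self, d_model: int, n_heads: int, dropout: float = 0.0):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads = n_heads
        self.head_dim = d_model // n_heads
        # Pre-norm: normalization is applied before each sub-layer (common modern choice).
        self.ln1 = nn.LayerNorm(d_model)
        self.ln2 = nn.LayerNorm(d_model)
        self.qkv = nn.Linear(d_model, 3 * d_model, bias=False)
        self.proj = nn.Linear(d_model, d_model, bias=False)
        # Position-wise MLP with a 4x expansion.
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )
        self.dropout = nn.Dropout(dropout)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, C = x.shape
        # Causal self-attention sub-layer with a residual connection.
        h = self.ln1(x)
        q, k, v = self.qkv(h).chunk(3, dim=-1)
        q = q.view(B, T, self.n_heads, self.head_dim).transpose(1, 2)
        k = k.view(B, T, self.n_heads, self.head_dim).transpose(1, 2)
        v = v.view(B, T, self.n_heads, self.head_dim).transpose(1, 2)
        # is_causal=True masks future positions, which is what makes this decoder-only.
        attn = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        attn = attn.transpose(1, 2).contiguous().view(B, T, C)
        x = x + self.dropout(self.proj(attn))
        # Feed-forward sub-layer with a residual connection.
        x = x + self.dropout(self.mlp(self.ln2(x)))
        return x

block = DecoderBlock(d_model=64, n_heads=4)
y = block(torch.randn(2, 16, 64))  # (batch, seq_len, d_model)
```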
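
Next, rotary position embeddings (RoPE) in isolation: each pair of query/key channels is rotated by an angle proportional to the token position, with per-pair frequencies base^(-2i/d). Because the rotation is applied to both q and k before the dot product, the attention score depends only on relative position, which is one reason RoPE tends to extrapolate better than learned absolute embeddings. The function name and base=10000 default are conventional assumptions.

```python
# A minimal sketch of rotary position embeddings (RoPE); names are illustrative.
import torch

def rope_rotate(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Apply RoPE to x of shape (batch, seq_len, n_heads, head_dim)."""
    B, T, H, D = x.shape
    assert D % 2 == 0
    # Frequencies for each channel pair; earlier pairs rotate faster.
    inv_freq = base ** (-torch.arange(0, D, 2, dtype=torch.float32) / D)  # (D/2,)
    pos = torch.arange(T, dtype=torch.float32)                            # (T,)
    angles = torch.outer(pos, inv_freq)                                   # (T, D/2)
    cos = angles.cos()[None, :, None, :]   # broadcast over batch and heads
    sin = angles.sin()[None, :, None, :]
    x1, x2 = x[..., 0::2], x[..., 1::2]    # split channels into pairs
    # 2D rotation of each (x1, x2) pair by its position-dependent angle.
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

# Applied to q and k before attention, so q·k depends on relative position only.
q = rope_rotate(torch.randn(2, 16, 4, 32))
k = rope_rotate(torch.randn(2, 16, 4, 32))
```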
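
For PEFT, a LoRA-wrapped linear layer under the standard formulation y = W x + (alpha/r) B A x, with the pretrained W frozen and only the low-rank A and B trained. This is a sketch, not the `peft` library API; QLoRA follows the same idea but keeps the frozen base weights quantized to 4-bit.

```python
# A minimal sketch of a LoRA adapter around a frozen nn.Linear; names are illustrative.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)      # freeze the pretrained weight
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.scaling = alpha / r
        # Low-rank update: B @ A has shape (out_features, in_features).
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no change at start

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen path + trainable low-rank path; only A and B receive gradients.
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)

# Typical usage: wrap attention projections and train only the LoRA parameters.
layer = LoRALinear(nn.Linear(512, 512), r=8, alpha=16)
trainable = [p for p in layer.parameters() if p.requires_grad]  # just lora_A, lora_B
```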
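
For overfitting control, a training loop combining decoupled weight decay (AdamW), dropout (enabled by model.train(), disabled by model.eval()), gradient clipping, and patience-based early stopping on validation loss. The model, data loaders, and evaluate function are assumed to exist, and the batch/loss interface follows a Hugging Face-style causal LM; adapt as needed.

```python
# A hedged sketch of common anti-overfitting levers in a training loop.
# `model`, `train_loader`, `val_loader`, and `evaluate` are assumed to exist.
import copy
import torch

def train_with_early_stopping(model, train_loader, val_loader, evaluate,
                              lr=2e-5, weight_decay=0.01, patience=3, max_epochs=20):
    # Decoupled weight decay (AdamW) is the usual L2-style regularizer for Transformers.
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr, weight_decay=weight_decay)
    best_val, best_state, bad_epochs = float("inf"), None, 0
    for epoch in range(max_epochs):
        model.train()                                 # enables dropout
        for batch in train_loader:
            loss = model(**batch).loss                # assumes HF-style model output
            optimizer.zero_grad()
            loss.backward()
            torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)  # gradient clipping
            optimizer.step()
        model.eval()                                  # disables dropout for validation
        val_loss = evaluate(model, val_loader)
        if val_loss < best_val:
            best_val, bad_epochs = val_loss, 0
            best_state = copy.deepcopy(model.state_dict())   # checkpoint the best model
        else:
            bad_epochs += 1
            if bad_epochs >= patience:                # early stopping on stalled val loss
                break
    if best_state is not None:
        model.load_state_dict(best_state)
    return model, best_val
```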
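
Finally, an intrinsic offline metric: held-out perplexity for a causal LM, computed as the exponential of the mean per-token negative log-likelihood. The model(input_ids) -> logits interface is an assumption; extrinsic task metrics and carefully guarded online experiments (A/B tests with holdouts and rollback criteria) would complement a check like this.

```python
# A minimal sketch of held-out perplexity for a causal LM; interfaces are assumptions.
import math
import torch
import torch.nn.functional as F

@torch.no_grad()
def perplexity(model, val_loader) -> float:
    model.eval()
    total_nll, total_tokens = 0.0, 0
    for batch in val_loader:                       # batch["input_ids"]: (B, T)
        input_ids = batch["input_ids"]
        logits = model(input_ids)                  # (B, T, vocab_size), assumed output
        # Next-token prediction: logits at position t score the token at t+1.
        shift_logits = logits[:, :-1, :].reshape(-1, logits.size(-1))
        shift_labels = input_ids[:, 1:].reshape(-1)
        nll = F.cross_entropy(shift_logits, shift_labels, reduction="sum")
        total_nll += nll.item()
        total_tokens += shift_labels.numel()
    return math.exp(total_nll / total_tokens)      # PPL = exp(mean NLL per token)
```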