This question evaluates understanding of modern Machine Learning/Deep Learning topics: self-attention mechanics (queries, keys, and values, plus scaled dot-product logits); Low-Rank Adaptation (LoRA) for parameter-efficient fine-tuning and its memory savings; optimizer behavior (Adam versus SGD with momentum); and architectural trade-offs between Vision Transformers and CNNs, including patch-size considerations. It is categorized under Machine Learning and is commonly asked because it probes both conceptual understanding and practical application, testing reasoning about training dynamics, model scaling, fine-tuning strategies, and resource/performance trade-offs.
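Since the topics include scaled-logit self-attention, a minimal NumPy sketch of scaled dot-product attention may be a useful reference; the shapes and function name here are illustrative, not taken from any particular codebase:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Q, K, V: (seq_len, d_k) matrices of queries, keys, and values.
    d_k = Q.shape[-1]
    # Scale the logits by sqrt(d_k) so softmax inputs stay well-conditioned
    # as the key dimension grows.
    logits = Q @ K.T / np.sqrt(d_k)
    # Row-wise softmax: each query distributes attention over all keys.
    weights = np.exp(logits - logits.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Output is an attention-weighted mixture of the value vectors.
    return weights @ V

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8)
```

The sqrt(d_k) divisor is the "scaled logits" piece the question refers to: without it, dot products grow with d_k and push the softmax into near-one-hot, low-gradient regions.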
Answer the following ML/Deep Learning interview questions: