PracHub

Explain self-attention, LoRA, Adam vs SGD, ViT

Last updated: Mar 29, 2026

Quick Overview

This question evaluates understanding of modern Machine Learning/Deep Learning topics, including self-attention mechanics (queries, keys, values and scaled logits), Low-Rank Adaptation (LoRA) for parameter-efficient fine-tuning and memory savings, optimizer behavior (Adam versus SGD with momentum), and architectural trade-offs between Vision Transformers and CNNs including patch-size considerations. It is categorized under Machine Learning and is commonly asked because it probes both conceptual understanding and practical application—testing reasoning about training dynamics, model scaling, fine-tuning strategies, and resource/performance trade-offs.

  • medium
  • Netflix
  • Machine Learning
  • Machine Learning Engineer

Explain self-attention, LoRA, Adam vs SGD, ViT

Company: Netflix

Role: Machine Learning Engineer

Category: Machine Learning

Difficulty: medium

Interview Round: Technical Screen

Answer the following ML/Deep Learning interview questions:

  1. Describe self-attention in Transformer models. What are the queries, keys, and values, and how is the attention output computed?
  2. Why are attention logits divided by \( \sqrt{d_k} \) (where \( d_k \) is the key/query dimension) before the softmax?
  3. Describe LoRA (Low-Rank Adaptation) for fine-tuning large models. How does it modify the weight update during fine-tuning, and what are its main benefits?
  4. Why does LoRA often reduce GPU memory consumption compared to full fine-tuning?
  5. What is the difference between Adam and SGD (including SGD with momentum)? When might you prefer one over the other?
  6. Compare Vision Transformers (ViT) and CNNs. What are the main pros and cons of each?
  7. What factors influence the choice of ViT patch size (e.g., 8×8 vs 16×16 vs 32×32), and what are the trade-offs?
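For questions 1 and 2, a minimal NumPy sketch of single-head scaled dot-product attention (the hidden solution may differ; dimensions and weights here are illustrative):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention.

    X: (seq_len, d_model). Wq/Wk/Wv project each token into
    queries, keys, and values respectively.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    # Dividing by sqrt(d_k) keeps the logits' variance ~1 regardless of d_k,
    # so softmax does not saturate and gradients do not vanish.
    logits = Q @ K.T / np.sqrt(d_k)
    weights = softmax(logits, axis=-1)   # each row sums to 1
    return weights @ V                   # (seq_len, d_v)

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                              # 4 tokens, d_model=8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (4, 8)
```

Each output row is a weighted average of the value vectors, with weights given by the softmaxed query–key similarities.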
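For questions 3 and 4, a sketch of the LoRA reparameterization (sizes, `alpha`, and rank `r` are illustrative, not from the hidden solution): the pretrained weight `W` is frozen and only a low-rank update `ΔW = (α/r)·B·A` is trained, so optimizer states exist only for `A` and `B`:

```python
import numpy as np

d_out, d_in, r = 1024, 1024, 8
rng = np.random.default_rng(0)

W = rng.normal(size=(d_out, d_in))      # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                # trainable up-projection, init 0,
                                        # so ΔW = 0 at the start of fine-tuning
alpha = 16                              # scaling hyperparameter

def lora_forward(x):
    # Effective weight is W + (alpha / r) * B @ A, but the merged
    # matrix is never materialized during training.
    return W @ x + (alpha / r) * (B @ (A @ x))

full_params = W.size
lora_params = A.size + B.size
print(f"trainable params: {lora_params} vs {full_params} "
      f"({100 * lora_params / full_params:.1f}%)")  # 16384 vs 1048576 (1.6%)
```

The memory saving (question 4) follows directly: Adam keeps two extra state tensors per trainable parameter, so shrinking trainables from `d_out·d_in` to `r·(d_in + d_out)` shrinks optimizer state and trainable-gradient storage proportionally.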
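For question 5, the two update rules side by side on a toy quadratic \( f(w) = w^2 \) (learning rates and betas are illustrative defaults, not prescribed by the question):

```python
import numpy as np

def sgd_momentum(w, g, v, lr=0.05, beta=0.9):
    v = beta * v + g                 # velocity: exponentially averaged gradient
    return w - lr * v, v

def adam(w, g, m, s, t, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    m = b1 * m + (1 - b1) * g        # first moment (mean of gradients)
    s = b2 * s + (1 - b2) * g ** 2   # second moment (mean of squared gradients)
    m_hat = m / (1 - b1 ** t)        # bias correction for zero initialization
    s_hat = s / (1 - b2 ** t)
    return w - lr * m_hat / (np.sqrt(s_hat) + eps), m, s

w_sgd, v = 5.0, 0.0
w_adam, m, s = 5.0, 0.0, 0.0
for t in range(1, 201):              # gradient of w^2 is 2w
    w_sgd, v = sgd_momentum(w_sgd, 2 * w_sgd, v)
    w_adam, m, s = adam(w_adam, 2 * w_adam, m, s, t)
print(w_sgd, w_adam)                 # both approach 0
```

The key contrast to articulate: Adam rescales each coordinate by its gradient's second moment (fast, robust to poorly scaled problems, common for Transformers), while SGD with momentum uses one global step size (cheaper state, and often better generalization for CNNs with a tuned schedule).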
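For questions 6 and 7, a small calculation makes the patch-size trade-off concrete (the 224×224 input size is an illustrative assumption): halving the patch side quadruples the token count, and self-attention cost grows quadratically in tokens:

```python
def vit_tokens(image_size, patch_size):
    """Number of patch tokens a ViT produces for a square image (no CLS token)."""
    assert image_size % patch_size == 0
    return (image_size // patch_size) ** 2

for p in (8, 16, 32):
    n = vit_tokens(224, p)
    # Smaller patches => finer spatial detail, but O(n^2) attention pairs.
    print(f"{p}x{p} patches -> {n} tokens, attention pairs ~ {n * n:,}")
# 8x8  -> 784 tokens
# 16x16 -> 196 tokens
# 32x32 ->  49 tokens
```

So 8×8 patches give 16× more tokens than 32×32 and roughly 256× the attention compute, which is the core tension to discuss alongside ViT's weaker inductive biases (no built-in locality or translation equivariance) versus CNNs' data efficiency on smaller datasets.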


Related Interview Questions

  • Compare Losses and Explain LoRA - Netflix (medium)
  • Design a robust conversion propensity model - Netflix (hard)
  • Explain tokenization and Transformer variants - Netflix (medium)
  • Design Real-Time Fraud Detection with XGBoost Model - Netflix (medium)
  • Address Fraud Detection with Imbalance and Concept Drift Solutions - Netflix (medium)
Question posted: Feb 23, 2026


