PracHub

Explain LLM fundamentals and trade-offs

Last updated: Mar 29, 2026

Quick Overview

This question tests a candidate's grasp of large language model (LLM) fundamentals and engineering trade-offs: subword tokenization; self-attention and ways to reduce its cost; pretraining versus instruction tuning and RLHF/DPO; retrieval-augmented generation (RAG), including indexing and embedding choices; adaptation methods; inference optimizations; and evaluation and safety. It is commonly asked of Machine Learning Engineer candidates to assess architectural reasoning about performance, latency, retrieval, and data-design trade-offs, covering both conceptual understanding and practical application.

  • hard
  • Amazon
  • Machine Learning
  • Machine Learning Engineer

Company: Amazon

Role: Machine Learning Engineer

Category: Machine Learning

Difficulty: hard

Interview Round: Onsite

Answer the following questions on LLM fundamentals: How does subword tokenization (e.g., BPE) work and why is it used? Explain self-attention and its O(n^2) cost; discuss techniques to reduce it (e.g., sparsity, sliding windows, KV cache). Contrast pretraining, instruction tuning, and RLHF/DPO. Describe a RAG architecture, including indexing choices (BM25 vs dense, chunking, embeddings) and how retrieval quality affects generation. When do you use prompting vs fine-tuning vs adapters? Explain quantization, KV caching, and batching for low-latency inference. How do you evaluate LLMs (e.g., task-specific metrics, human eval) and mitigate hallucinations and safety risks?
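To make the first part of the question concrete, here is a minimal sketch of how BPE merges might be learned from a toy word-frequency table. It is an illustration under stated assumptions, not a production tokenizer: the function names (`get_pair_counts`, `merge_pair`, `learn_bpe`), the toy corpus in the usage note, and the tie-breaking rule are all hypothetical, and real implementations typically add byte-level fallback, pre-tokenization, and end-of-word markers.

```python
from collections import Counter


def get_pair_counts(words):
    """Count adjacent symbol pairs, weighted by how often each word occurs."""
    counts = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            counts[(a, b)] += freq
    return counts


def merge_pair(words, pair):
    """Rewrite every word so that each occurrence of `pair` becomes one symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def learn_bpe(word_freqs, num_merges):
    """Greedily merge the most frequent adjacent pair, `num_merges` times.

    Assumption: ties are broken by the lexicographically larger pair, purely
    to keep this sketch deterministic.
    """
    # Start from single characters as the base vocabulary.
    words = {tuple(word): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        counts = get_pair_counts(words)
        if not counts:
            break
        best = max(counts, key=lambda p: (counts[p], p))
        words = merge_pair(words, best)
        merges.append(best)
    return merges, words
```

For example, `learn_bpe({"low": 5, "lower": 2, "newest": 6, "widest": 3}, 2)` learns merges that build up a shared suffix symbol for "newest" and "widest". This shows why subword vocabularies are used: frequent fragments become single tokens, while rare or unseen words can still be spelled out from smaller pieces instead of mapping to an unknown token.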

Jul 17, 2025, 12:00 AM

LLM Fundamentals — Onsite Interview Task

Context: Assume a modern transformer-based LLM. Provide precise, concise explanations with examples and trade-offs.

  1. Subword tokenization (e.g., BPE): How does it work and why is it used?
  2. Self-attention: Explain the mechanism and its O(n^2) cost. Discuss techniques to reduce it (e.g., sparsity, sliding windows, KV cache).
  3. Contrast pretraining, instruction tuning, and RLHF/DPO.
  4. Describe a RAG architecture. Compare indexing choices (BM25 vs dense), chunking strategies, and embeddings. Explain how retrieval quality affects generation.
  5. When do you use prompting vs fine-tuning vs adapters (e.g., LoRA)?
  6. Low-latency inference: Explain quantization, KV caching, and batching.
  7. How do you evaluate LLMs (task-specific metrics, human eval) and mitigate hallucinations and safety risks?
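As a reference point for item 2, the sketch below shows causal self-attention in plain Python, together with a sliding-window variant and a KV-cache decoding loop. It is a simplified illustration under stated assumptions: queries, keys, and values are passed in directly (no learned projections, single head), and all function names are hypothetical. Note that the KV cache does not change how many attention scores a given token computes; it avoids recomputing keys and values for earlier tokens, so each decoding step costs O(n) instead of reprocessing the whole prefix.

```python
import math


def attend(query, keys, values):
    """Scaled dot-product attention for one query over the given keys/values."""
    d = len(query)
    scores = [sum(a * b for a, b in zip(query, k)) / math.sqrt(d) for k in keys]
    top = max(scores)  # subtract the max for a numerically stable softmax
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    return [sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))]


def causal_attention(qs, ks, vs):
    """Position i attends to positions 0..i: about n^2 / 2 scores in total."""
    return [attend(qs[i], ks[: i + 1], vs[: i + 1]) for i in range(len(qs))]


def sliding_window_attention(qs, ks, vs, window):
    """Position i attends only to the last `window` positions: O(n * window)."""
    outputs = []
    for i in range(len(qs)):
        lo = max(0, i - window + 1)
        outputs.append(attend(qs[i], ks[lo: i + 1], vs[lo: i + 1]))
    return outputs


def decode_with_kv_cache(qs, ks, vs):
    """Token-by-token decoding: append each new key/value to a cache and
    attend once per step, rather than recomputing the full prefix."""
    k_cache, v_cache, outputs = [], [], []
    for q, k, v in zip(qs, ks, vs):
        k_cache.append(k)
        v_cache.append(v)
        outputs.append(attend(q, k_cache, v_cache))
    return outputs


def num_scores(n, window=None):
    """Number of query-key scores computed for a sequence of length n."""
    if window is None:
        return n * (n + 1) // 2
    return sum(min(i + 1, window) for i in range(n))
```

Under these assumptions, `decode_with_kv_cache` produces the same outputs as `causal_attention`, while `num_scores` makes the trade-off visible: full causal attention grows quadratically with sequence length, whereas a fixed window grows linearly at the price of dropping long-range context.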
