Explain NLP/RL concepts used in LLM agents
Company: Amazon
Role: Machine Learning Engineer
Category: Machine Learning
Difficulty: hard
Interview Round: Onsite
You are interviewing for an applied ML role focused on LLM agents and retrieval-augmented generation (RAG). Answer the following conceptual questions clearly and with examples:
## Transformer/NLP foundations
1. **Encoder-only vs encoder–decoder vs decoder-only** architectures:
- What are the key differences in objective, attention pattern, and typical use-cases?
- Give representative model families for each.
- For tasks like classification, translation, and open-ended generation, which would you choose and why?
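The attention-pattern difference can be made concrete with a small mask sketch (numpy; illustrative only, not any specific model's code):

```python
import numpy as np

def self_attention_mask(seq_len: int, causal: bool) -> np.ndarray:
    """True = query position (row) may attend to key position (column)."""
    if causal:
        # Decoder-only (and the decoder side of encoder-decoder models):
        # token i may attend only to tokens 0..i, enabling left-to-right LM.
        return np.tril(np.ones((seq_len, seq_len), dtype=bool))
    # Encoder-only (BERT-style): fully bidirectional self-attention.
    return np.ones((seq_len, seq_len), dtype=bool)
```

Encoder-decoder models combine both: a bidirectional encoder plus a causal decoder with cross-attention into the encoder's outputs.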
2. **Word2Vec**:
- Explain how Word2Vec learns embeddings (CBOW/Skip-gram; negative sampling or hierarchical softmax).
- Contrast **static embeddings** (e.g., Word2Vec) with **contextual embeddings** (e.g., Transformer-based). When does each fail?
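A skip-gram-with-negative-sampling update can be sketched in a few lines (toy numpy version with made-up vocabulary and dimension sizes; real Word2Vec also subsamples frequent words and draws negatives from a smoothed unigram distribution):

```python
import numpy as np

rng = np.random.default_rng(0)
V, d = 1000, 50                              # toy vocab size and embedding dim
W_in = rng.normal(scale=0.1, size=(V, d))    # target ("input") vectors
W_out = rng.normal(scale=0.1, size=(V, d))   # context ("output") vectors

def sgns_step(center, context, negatives, lr=0.025):
    """One SGD step of skip-gram with negative sampling.

    Pushes sigmoid(v_center . u_context) toward 1 and
    sigmoid(v_center . u_negative) toward 0.
    """
    v = W_in[center]
    grad_v = np.zeros_like(v)
    for idx, label in [(context, 1.0)] + [(n, 0.0) for n in negatives]:
        u = W_out[idx]
        score = 1.0 / (1.0 + np.exp(-(v @ u)))   # sigmoid of dot product
        g = lr * (score - label)
        grad_v += g * u
        W_out[idx] -= g * v
    W_in[center] -= grad_v
```

Repeated steps pull the target and its true context vectors together while pushing sampled negatives apart, which is where the static-embedding geometry comes from.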
## LLM agents
3. **LLM-as-a-judge / LLM-based evaluation**:
- How would you use an LLM to evaluate agent outputs?
- What failure modes (bias, verbosity preference, prompt sensitivity, leakage) and mitigations would you consider?
- What metrics would you report for agent quality (task success, tool-use correctness, groundedness, etc.)?
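One standard mitigation for position/order bias can be sketched as a pairwise judge queried twice with swapped order; `judge_fn` here is a placeholder for an actual LLM call, not a real API:

```python
def pairwise_judge(judge_fn, question, answer_a, answer_b):
    """Query a judge twice with swapped order to control for position bias.

    `judge_fn(prompt) -> "A" | "B"` stands in for an LLM call.
    """
    prompt = ("Question: {q}\nAnswer A: {a}\nAnswer B: {b}\n"
              "Which answer is better? Reply with exactly A or B.")
    first = judge_fn(prompt.format(q=question, a=answer_a, b=answer_b))
    second = judge_fn(prompt.format(q=question, a=answer_b, b=answer_a))
    # In the swapped call, "A" actually refers to answer_b; map it back.
    second = {"A": "B", "B": "A"}[second]
    if first == second:
        return first
    return "tie"   # disagreement across orderings -> inconclusive
```

A judge that only ever prefers the first position produces "tie" under this scheme, which is exactly the signal you want when auditing judge reliability.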
4. **ReAct**:
- Explain how the ReAct paradigm works at a high level.
- Why can interleaving reasoning + actions help compared to pure “think then answer”?
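A minimal ReAct-style control loop might look like the following sketch (`llm` and `tools` are placeholders; real implementations also parse "Thought:" lines and handle malformed actions):

```python
def react_loop(llm, tools, question, max_steps=5):
    """Interleave model reasoning/actions with tool observations.

    `llm(transcript) -> str` is a placeholder that returns either
    "Action: <tool> <arg>" or "Final Answer: ...".
    """
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(transcript)
        transcript += step + "\n"
        if step.startswith("Final Answer:"):
            return step[len("Final Answer:"):].strip()
        if step.startswith("Action:"):
            _, tool, arg = step.split(maxsplit=2)
            # Feed the tool's observation back so the next "thought" sees it.
            transcript += f"Observation: {tools[tool](arg)}\n"
    return None
```

The key point is in the loop body: each observation is appended to the transcript before the next generation, so reasoning can condition on fresh evidence rather than being committed up front.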
5. **Agent vs LLM**:
- What is the fundamental difference between an “agent” and a standalone LLM?
- Name and explain common **agent components** (e.g., goal, planner, tool interface, memory, policy/executor, evaluator).
## Retrieval for RAG
6. **Lexical (sparse) vs dense retrieval**:
- Define lexical (sparse) retrieval and dense (embedding-based) retrieval.
- Compare tradeoffs (latency, interpretability, domain shift, exact match vs semantic match).
- When would you use hybrid retrieval?
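A common hybrid approach fuses the two ranked lists with reciprocal rank fusion (RRF); a minimal sketch (k=60 is the conventional constant, an assumption here rather than anything from the question):

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked doc-id lists (e.g., one BM25, one dense) with RRF.

    Each document's fused score is the sum of 1/(k + rank) over all
    lists it appears in; k damps the influence of top ranks.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

RRF is attractive because it needs no score calibration between the lexical and dense scorers, only their ranks.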
7. **BM25**:
- Explain how BM25 scoring works conceptually (TF saturation, IDF, length normalization).
- What are typical knobs/hyperparameters and practical pitfalls?
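The scoring formula can be sketched directly (toy single-document scorer; real systems precompute inverted indexes and corpus statistics rather than rescanning the corpus per query):

```python
import math

def bm25_score(query_terms, doc_terms, corpus, k1=1.2, b=0.75):
    """Toy BM25: score one tokenized document against a query.

    k1 controls term-frequency saturation; b controls length normalization.
    """
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N
    dl = len(doc_terms)
    score = 0.0
    for t in query_terms:
        df = sum(1 for d in corpus if t in d)        # document frequency
        if df == 0:
            continue
        idf = math.log(1 + (N - df + 0.5) / (df + 0.5))
        tf = doc_terms.count(t)
        # TF saturates as tf grows; dl/avgdl penalizes long documents.
        score += idf * (tf * (k1 + 1)) / (tf + k1 * (1 - b + b * dl / avgdl))
    return score
```

Note how the denominator makes repeated occurrences of a term yield diminishing returns, which is the "TF saturation" behavior the question asks about.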
## Reinforcement learning basics (as used around LLMs)
8. **On-policy vs off-policy**:
- Define both, and give examples.
- Why does the distinction matter for stability and sample efficiency?
9. **Q-learning**:
- What is the Q-function and the Bellman optimality equation?
- Describe the Q-learning update rule and why it is considered off-policy.
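The update rule can be demonstrated on a toy chain MDP (an illustrative sketch with made-up states and rewards). The behavior policy is epsilon-greedy, but the bootstrap target takes a max over next-state actions regardless of which action is actually taken next, which is what makes Q-learning off-policy:

```python
import random

random.seed(0)
N_STATES = 4                                 # chain 0-1-2-3; state 3 terminal
Q = [[0.0, 0.0] for _ in range(N_STATES)]    # Q[state][action]: 0=left, 1=right
alpha, gamma, eps = 0.5, 0.9, 0.1

def step(s, a):
    """Deterministic dynamics: reward 1 on reaching the rightmost state."""
    s2 = max(0, s - 1) if a == 0 else min(N_STATES - 1, s + 1)
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0), s2 == N_STATES - 1

for _ in range(200):                         # episodes
    s, done = 0, False
    while not done:
        if random.random() < eps:
            a = random.choice((0, 1))                    # explore
        else:
            a = max((0, 1), key=lambda x: Q[s][x])       # exploit
        s2, r, done = step(s, a)
        # Bellman-optimality target r + gamma * max_a' Q(s', a'):
        # independent of the behavior policy's next action, hence off-policy.
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2
```

After training, the learned values reflect the discounted optimal returns (Q*(2, right) = 1, Q*(1, right) = 0.9, ...), even though the trajectories were generated by the exploratory epsilon-greedy policy.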
Quick Answer: This question evaluates proficiency in transformer-based NLP, embedding methods, LLM agent architecture and evaluation, retrieval techniques for RAG, and reinforcement learning fundamentals. It tests understanding of model families, static vs contextual embeddings, agent components and metrics, lexical vs dense retrieval, BM25 concepts, and on-policy vs off-policy learning with Q-learning. It is commonly asked to assess an applied Machine Learning engineer's ability to reason about trade-offs and design choices across Machine Learning, Natural Language Processing, Information Retrieval, and Reinforcement Learning, emphasizing both conceptual understanding and practical application.