Explain NLP/RL concepts used in LLM agents

This is a Machine Learning interview question from Amazon for Machine Learning Engineer roles.

Question

You are interviewing for an applied ML role focused on LLM agents and retrieval-augmented generation (RAG). Answer the following conceptual questions clearly and with examples:

Transformer/NLP foundations

Encoder-only vs encoder–decoder vs decoder-only architectures:
- What are the key differences in objective, attention pattern, and typical use-cases?
- Give representative model families for each.
- For tasks like classification, translation, and open-ended generation, which would you choose and why?
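The attention-pattern difference can be made concrete with a small sketch. The helper name `attention_mask` is hypothetical, and the sketch covers only which positions may attend to which; it says nothing about training objectives.

```python
def attention_mask(n_query, n_key, causal):
    """Return an n_query x n_key matrix; 1 means the query position may attend to the key position."""
    return [
        [1 if (not causal or k <= q) else 0 for k in range(n_key)]
        for q in range(n_query)
    ]

# Encoder-only style: every token sees every other token (bidirectional).
bidirectional = attention_mask(4, 4, causal=False)

# Decoder-only style: token q sees only positions 0..q (causal).
causal = attention_mask(4, 4, causal=True)

# Encoder-decoder style: the decoder uses a causal self-attention mask like the one
# above, plus cross-attention in which each target position sees all source positions.
cross = attention_mask(3, 5, causal=False)

for row in causal:
    print(row)
```

The causal mask is what makes left-to-right generation possible, while the bidirectional mask suits tasks where the whole input is available at once, such as classification.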
Word2Vec:
- Explain how Word2Vec learns embeddings (CBOW/Skip-gram; negative sampling or hierarchical softmax).
- Contrast static embeddings (e.g., Word2Vec) with contextual embeddings (e.g., Transformer-based). When does each fail?
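As a starting point for the Skip-gram part, here is a minimal sketch of how (center, context) training pairs are formed from a sliding window. The function name `skipgram_pairs` and the window size are illustrative assumptions. The sketch stops at pair generation and does not implement negative sampling or hierarchical softmax.

```python
def skipgram_pairs(tokens, window=2):
    """List (center, context) pairs: each token is paired with its neighbours within `window`."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        pairs.extend((center, tokens[j]) for j in range(lo, hi) if j != i)
    return pairs

pairs = skipgram_pairs(["the", "cat", "sat"], window=1)
print(pairs)
```

Skip-gram trains the model to predict the context word from the center word for each such pair; CBOW reverses the direction and predicts the center word from its pooled context. With negative sampling, each observed pair is additionally contrasted with a few words drawn from a noise distribution.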

LLM agents

LLM-as-a-judge / LLM-based evaluation:
- How would you use an LLM to evaluate agent outputs?
- What failure modes (bias, verbosity preference, prompt sensitivity, leakage) and mitigations would you consider?
- What metrics would you report for agent quality (task success, tool-use correctness, groundedness, etc.)?
ReAct:
- Explain how the ReAct paradigm works at a high level.
- Why can interleaving reasoning + actions help compared to pure “think then answer”?
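The thought, action, observation loop can be sketched as follows. Everything here is a hypothetical stand-in: `scripted_model` replaces a real LLM call, `calculator` is a toy tool, and the `Action: name[input]` text format is just one convention assumed for this sketch.

```python
import re

def calculator(expression):
    """Toy tool (hypothetical): evaluates 'a + b' or 'a * b' for non-negative integers."""
    left, op, right = re.fullmatch(r"\s*(\d+)\s*([+*])\s*(\d+)\s*", expression).groups()
    return str(int(left) + int(right) if op == "+" else int(left) * int(right))

TOOLS = {"calculator": calculator}

def scripted_model(transcript):
    """Stand-in for an LLM call: chooses the next step from what the transcript already contains."""
    if "Observation: 42" in transcript:
        return "Thought: I have the product.\nFinal Answer: 42"
    return "Thought: I should compute 6 * 7 with the tool.\nAction: calculator[6 * 7]"

def react(question, model, tools, max_steps=5):
    """Alternate model steps and tool calls until the model emits a final answer."""
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        step = model(transcript)
        transcript += "\n" + step
        final = re.search(r"Final Answer:\s*(.+)", step)
        if final:
            return final.group(1).strip(), transcript
        action = re.search(r"Action:\s*(\w+)\[(.*)\]", step)
        if action is None:
            transcript += "\nObservation: no valid action found"
            continue
        name, arg = action.groups()
        tool = tools.get(name)
        observation = tool(arg) if tool else f"unknown tool {name}"
        # The observation is fed back so the next reasoning step can condition on it.
        transcript += f"\nObservation: {observation}"
    return None, transcript

answer, trace = react("What is 6 times 7?", scripted_model, TOOLS)
print(trace)
```

The point of the loop is that each reasoning step can condition on real tool output instead of on the model's guess about what the tool would return.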
Agent vs LLM:
- What is the fundamental difference between an “agent” and a standalone LLM?
- Name and explain common agent components (e.g., goal, planner, tool interface, memory, policy/executor, evaluator).

Retrieval for RAG

Lexical (sparse) vs dense retrieval:
- Define lexical-based retrieval and dense-based retrieval.
- Compare tradeoffs (latency, interpretability, domain shift, exact match vs semantic match).
- When would you use hybrid retrieval?
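For the hybrid case, one common way to combine a lexical ranking with a dense ranking is reciprocal rank fusion (RRF). The sketch below assumes each retriever returns an ordered list of document ids. The function name, the document ids, and the constant `k=60` (a frequently used default) are illustrative choices, not something the question prescribes.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc ids; each list contributes 1 / (k + rank) per document."""
    fused = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(fused, key=fused.get, reverse=True)

lexical = ["a", "b", "c"]  # e.g. a BM25 ranking (hypothetical ids)
dense = ["b", "c", "a"]    # e.g. an embedding-similarity ranking (hypothetical ids)
print(reciprocal_rank_fusion([lexical, dense]))
```

RRF uses only ranks, so it avoids having to calibrate BM25 scores against cosine similarities, which live on different scales.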
BM25:
- Explain how BM25 scoring works conceptually (TF saturation, IDF, length normalization).
- What are typical knobs/hyperparameters and practical pitfalls?
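The three ingredients (TF saturation, IDF, length normalization) can be seen in a compact scoring sketch. It assumes whitespace tokenization and the smoothed IDF variant `log(1 + (N - df + 0.5) / (df + 0.5))`; other BM25 variants differ in these details, and the defaults `k1=1.2`, `b=0.75` are common starting values rather than fixed constants.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document against the query with one common BM25 variant."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in tokenized for term in set(d))
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: longer-than-average documents are penalized when b > 0.
            norm = k1 * (1 - b + b * len(d) / avgdl)
            # TF saturation: this fraction grows with tf but is bounded by k1 + 1.
            score += idf * tf[term] * (k1 + 1) / (tf[term] + norm)
        scores.append(score)
    return scores

docs = ["the cat sat on the mat", "dogs chase cats in the park", "cat cat cat cat"]
print(bm25_scores("cat", docs))
```

The second document scores zero because "cats" is not the same token as "cat", which illustrates a practical pitfall: BM25 quality depends heavily on tokenization, stemming, and stop-word handling.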

Reinforcement learning basics (as used around LLMs)

On-policy vs off-policy:
- Define both, and give examples.
- Why does the distinction matter for stability and sample efficiency?
Q-learning:
- What is the Q-function and the Bellman optimality equation?
- Describe the Q-learning update rule and why it is considered off-policy.
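A minimal tabular sketch of the update rule follows. The chain environment, function names, and hyperparameters are all assumptions made for illustration. The behaviour policy here is uniformly random, yet the update target uses the max over next actions, which is what makes the method off-policy.

```python
import random

def q_update(Q, s, a, r, s_next, done, alpha=0.5, gamma=0.9):
    """One tabular Q-learning step: move Q[s][a] toward r + gamma * max over a' of Q[s_next][a']."""
    bootstrap = 0.0 if done else max(Q[s_next])
    Q[s][a] += alpha * (r + gamma * bootstrap - Q[s][a])

def train_chain(n_states=5, episodes=500, seed=0):
    """Hypothetical chain: states 0..n_states-1, action 0 = left, 1 = right, reward 1 on reaching the last state."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        for _ in range(100):  # step cap per episode
            a = rng.randrange(2)  # behaviour policy: uniformly random, not greedy
            s_next = max(0, s - 1) if a == 0 else s + 1
            done = s_next == goal
            q_update(Q, s, a, 1.0 if done else 0.0, s_next, done)
            if done:
                break
            s = s_next
    return Q

Q = train_chain()
# Greedy policy read off the learned table for the non-terminal states.
greedy = [max((0, 1), key=lambda a: Q[s][a]) for s in range(4)]
print(greedy)
```

Even though the data comes from a random policy, the greedy policy extracted from `Q` prefers moving right, because the target estimates the value of acting greedily from the next state onward rather than the value of the policy that generated the data.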
