Explain why LLMs produce hallucinations
Company: Zillow
Role: Machine Learning Engineer
Category: Machine Learning
Difficulty: medium
Interview Round: Technical Screen
Large language models (LLMs) are known to "hallucinate"—that is, they sometimes produce fluent, confident answers that are factually incorrect or unsupported by any source.
Explain **why** LLMs hallucinate. In your answer, cover:
- How the standard training objective and data characteristics lead to hallucinations (see the toy sketch after this list).
- Model- and optimization-related reasons (e.g., limitations of next-token prediction, exposure bias, lack of grounding).
- Inference-time factors such as decoding strategies, prompts, and distribution shift.
- (Briefly) a few practical techniques used in industry to **reduce** hallucinations, even if they cannot be eliminated entirely.
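To make the training-objective and decoding points concrete, here is a minimal, self-contained sketch (the corpus, prompt, and bigram model are invented for illustration and are far simpler than a real LLM). It fits a next-token model by maximum likelihood and shows that the objective rewards matching training-data frequencies rather than truth, so both greedy and sampled decoding can emit a fluent but wrong completion.

```python
from collections import Counter, defaultdict
import random

# Tiny, imbalanced corpus: "sydney" follows "is" more often than "canberra".
corpus = (
    "the largest city in australia is sydney . "
    "the host city of the 2000 olympics is sydney . "
    "the capital of australia is canberra ."
).split()

# Maximum-likelihood estimate of P(next | previous): normalized bigram counts.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_token_probs(prev):
    total = sum(counts[prev].values())
    return {tok: c / total for tok, c in counts[prev].items()}

def sample_next(prev, temperature=1.0, seed=0):
    """Sampling-based decoding: temperature > 1 flattens the distribution,
    pushing probability mass onto less-supported continuations."""
    random.seed(seed)
    probs = next_token_probs(prev)
    tokens = list(probs)
    weights = [probs[t] ** (1.0 / temperature) for t in tokens]
    return random.choices(tokens, weights=weights, k=1)[0]

# The model conditions only on the previous token ("is"); the objective
# scored it on corpus frequency, not factual correctness.
prompt = "the capital of australia is".split()
probs = next_token_probs(prompt[-1])
print(probs)                       # {'sydney': 0.67, 'canberra': 0.33} (approx)
print(max(probs, key=probs.get))   # 'sydney' -> a fluent, confident hallucination
print(sample_next(prompt[-1], temperature=1.5))
```

A real transformer has far more context and capacity, but the failure mode scales up: maximizing likelihood over web text rewards plausibility, and decoding must still emit some token even when the model has no reliable knowledge to draw on.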
Quick Answer: This question evaluates a candidate's understanding of why LLMs hallucinate and how hallucinations can be reduced, covering the probabilistic next-token training objective, training-data characteristics, model and optimization limitations, inference-time behavior (decoding, prompts, distribution shift), and awareness of practical mitigation strategies.
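As one concrete example of the mitigation techniques the last bullet asks about, below is a minimal retrieval-augmented generation (RAG) sketch. The keyword-overlap retriever, prompt template, and document snippets are illustrative assumptions standing in for an embedding-based retriever and a real model call.

```python
def retrieve(query, documents, k=2):
    """Rank documents by naive keyword overlap with the query (a stand-in
    for an embedding-based retriever) and return the top k."""
    q_terms = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q_terms & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_grounded_prompt(query, documents):
    """Constrain the model to retrieved evidence and allow it to abstain."""
    context = "\n".join(f"- {p}" for p in retrieve(query, documents))
    return (
        "Answer using ONLY the context below. If the context does not "
        "contain the answer, say you do not know.\n"
        f"Context:\n{context}\n"
        f"Question: {query}\nAnswer:"
    )

docs = [
    "Canberra is the capital city of Australia.",
    "Sydney is the largest city in Australia by population.",
    "Zillow publishes estimated home values known as Zestimates.",
]
print(build_grounded_prompt("What is the capital of Australia?", docs))
```

The grounded prompt is then passed to the LLM; supplying explicit evidence plus an abstention instruction reduces unsupported answers. Common complements include post-hoc fact verification, fine-tuning models to abstain when uncertain, and lowering sampling temperature for factual tasks, though none of these eliminates hallucinations entirely.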