{"blocks": [{"key": "bd68ac9f", "text": "Question", "type": "header-two", "depth": 0, "inlineStyleRanges": [], "entityRanges": [], "data": {}}, {"key": "601336e5", "text": "Explain the architecture of a Transformer-based large language model (LLM). How does self-attention enable long-range dependency modeling? Describe how you would fine-tune a pretrained LLM on a domain-specific corpus while avoiding catastrophic forgetting. How would you evaluate, monitor, and mitigate hallucinations in an LLM that serves user queries in production?", "type": "unstyled", "depth": 0, "inlineStyleRanges": [], "entityRanges": [], "data": {}}], "entityMap": {}}