{"blocks": [{"key": "717b9b3a", "text": "Scenario", "type": "header-two", "depth": 0, "inlineStyleRanges": [], "entityRanges": [], "data": {}}, {"key": "aec90048", "text": "Case study: choosing between fine-tuning and Retrieval-Augmented Generation (RAG) for a client chatbot, and improving retrieval quality.", "type": "unstyled", "depth": 0, "inlineStyleRanges": [], "entityRanges": [], "data": {}}, {"key": "4fabe82c", "text": "Question", "type": "header-two", "depth": 0, "inlineStyleRanges": [], "entityRanges": [], "data": {}}, {"key": "5d1f192d", "text": "When building an LLM application for a client, how would you decide between fine-tuning and Retrieval-Augmented Generation (RAG)? List and compare fine-tuning methods such as full fine-tuning, instruction tuning, LoRA, and embedding fine-tuning. Explain LoRA’s mechanism and its inference-time advantages. If retrieved documents show low relevance, how would you improve retrieval quality? Suppose the embedding model is the bottleneck: how would you fine-tune it, and what data and training procedure are required? How would you architect a chatbot capable of answering questions across multiple knowledge domains?", "type": "unstyled", "depth": 0, "inlineStyleRanges": [], "entityRanges": [], "data": {}}, {"key": "5c45e180", "text": "Hints", "type": "header-two", "depth": 0, "inlineStyleRanges": [], "entityRanges": [], "data": {}}, {"key": "fba45334", "text": "Compare the approaches on cost, data requirements, and latency; propose iterative tuning of both retrieval and the model, with evaluation.", "type": "unstyled", "depth": 0, "inlineStyleRanges": [], "entityRanges": [], "data": {}}], "entityMap": {}}