Scenario
You are building a client-facing chatbot that must answer questions grounded in the client's proprietary documents. You must choose how to imbue the system with domain knowledge and ensure high-quality retrieval.
Tasks
- Decide between fine-tuning a language model and Retrieval-Augmented Generation (RAG); compare the two approaches on cost, data needs, latency, maintainability, and risk.
- List and compare fine-tuning approaches: full model fine-tuning, instruction tuning (SFT), LoRA/QLoRA adapters, and embedding-model fine-tuning. Explain when each is appropriate.
- Explain LoRA's mechanism mathematically and the advantages it offers at inference time (a worked formulation is sketched after this list).
- If retrieved documents show low relevance, propose concrete steps to improve retrieval quality (one candidate step, cross-encoder re-ranking, is sketched after this list).
- The embedding model is the bottleneck. Describe how you would fine-tune it: data requirements, training objective, negatives, and evaluation (a contrastive-training sketch follows this list).
- Propose a high-level architecture for a chatbot that must answer across multiple knowledge domains (e.g., product docs, policies, tickets), including routing and evaluation (a routing sketch follows this list).
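For the LoRA task, a minimal sketch of the mechanism in standard LoRA notation; the rank r and scaling factor α are generic hyperparameters, not values taken from the scenario:

```latex
% LoRA: freeze the pretrained weight W_0 and learn a low-rank update to it.
% Shapes: W_0 \in \mathbb{R}^{d \times k},\; B \in \mathbb{R}^{d \times r},\; A \in \mathbb{R}^{r \times k},\; r \ll \min(d, k).
\[
  h \;=\; W_0 x + \Delta W\, x \;=\; W_0 x + \tfrac{\alpha}{r}\, B A\, x
\]
% Only A and B are trained: r(d + k) parameters instead of d k.
% At inference the update can be merged once, W' = W_0 + \tfrac{\alpha}{r} B A,
% so serving uses a single dense matrix with no added latency, and adapters can
% be swapped per client or task without duplicating the base model.
```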
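For the retrieval-quality task, one commonly proposed step is to re-rank first-stage candidates with a cross-encoder. A minimal Python sketch, assuming the sentence-transformers library; the model name and the top_k default are illustrative assumptions:

```python
# Sketch: re-rank first-stage retrieval candidates with a cross-encoder.
from sentence_transformers import CrossEncoder


def rerank(query: str, candidates: list[str], top_k: int = 5) -> list[str]:
    # A cross-encoder scores each (query, document) pair jointly, which is
    # usually more accurate than the bi-encoder similarity used to fetch the
    # candidates in the first place.
    reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
    scores = reranker.predict([(query, doc) for doc in candidates])
    ranked = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)
    return [doc for doc, _ in ranked[:top_k]]
```

Other steps in the same spirit (better chunking, hybrid BM25 + dense search, query rewriting) plug in before or alongside this re-ranking stage.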
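For the embedding-model task, a minimal contrastive fine-tuning sketch with sentence-transformers, using in-batch negatives via MultipleNegativesRankingLoss; the base model, example pairs, and hyperparameters are illustrative assumptions, and real training needs thousands of mined (query, relevant passage) pairs:

```python
# Sketch: contrastive fine-tuning of the embedding model on (query, passage) pairs.
from torch.utils.data import DataLoader
from sentence_transformers import InputExample, SentenceTransformer, losses

model = SentenceTransformer("all-MiniLM-L6-v2")

# Each example pairs a query with a passage that answers it. With
# MultipleNegativesRankingLoss, every other passage in the batch serves as an
# in-batch negative; an optional third text per example can supply a hard negative.
train_examples = [
    InputExample(texts=["How do I reset my password?", "To reset your password, open Settings ..."]),
    InputExample(texts=["What is the refund window?", "Refunds are accepted within 30 days ..."]),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)
train_loss = losses.MultipleNegativesRankingLoss(model)

model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, warmup_steps=100)
# Evaluate on a held-out query/passage set with Recall@k and MRR, e.g. via
# sentence_transformers.evaluation.InformationRetrievalEvaluator.
```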
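For the multi-domain architecture task, a schematic of just the routing layer; the Retriever protocol and classify_domain hook are hypothetical names used for illustration, not an existing API:

```python
# Sketch: route a query to the retriever for its knowledge domain, falling back
# to searching every domain when the classifier is unsure.
from typing import Callable, Protocol


class Retriever(Protocol):
    def search(self, query: str, k: int) -> list[str]: ...


def route_and_retrieve(
    query: str,
    classify_domain: Callable[[str], str],  # e.g. a small classifier or an LLM call
    retrievers: dict[str, Retriever],       # one index per domain: docs, policies, tickets
    k: int = 5,
) -> list[str]:
    domain = classify_domain(query)
    if domain in retrievers:
        return retrievers[domain].search(query, k)
    # Unknown or ambiguous domain: query every index and let a downstream
    # re-ranker (and per-domain evaluation sets) sort out relevance.
    return [doc for retriever in retrievers.values() for doc in retriever.search(query, k)]
```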