Choose Between Fine-Tuning and RAG for Client Chatbot

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a data scientist's competency in applied machine learning for production chatbots, examining model adaptation methods (fine-tuning variants and LoRA), retrieval and embedding optimization, and system architecture for multi-domain knowledge.

  • hard
  • Amazon
  • Machine Learning
  • Data Scientist

Company: Amazon

Role: Data Scientist

Category: Machine Learning

Difficulty: hard

Interview Round: Technical Screen

##### Scenario

Case study: choosing between fine-tuning and RAG for a client chatbot, and improving retrieval quality.

##### Question

When building an LLM application for a client, how would you decide between fine-tuning and Retrieval-Augmented Generation (RAG)? List and compare fine-tuning methods such as full fine-tuning, instruction tuning, LoRA, and embedding fine-tuning. Explain LoRA's mechanism and its inference-time advantages. If retrieved documents show low relevance, how would you improve retrieval quality? Suppose the embedding model is the bottleneck: how would you fine-tune it, and what data and training procedure are required? How would you architect a chatbot capable of answering questions across multiple knowledge domains?

##### Hints

Compare the approaches on cost, data needs, and latency; propose iterative retrieval and model tuning, with evaluation at each step.
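The evaluation part of the hint can be made concrete with a small offline harness. The following is a minimal sketch, not a prescribed method: it assumes a hand-labeled set of queries with known relevant document IDs, and the function names (`recall_at_k`, `reciprocal_rank`, `evaluate`) and the document IDs are hypothetical. Recall@k and mean reciprocal rank (MRR) are two common retrieval metrics; tracking them before and after each change shows whether a retrieval tweak or an embedding fine-tune actually helped.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant_ids:
        return 0.0
    top_k = set(ranked_ids[:k])
    return len(top_k & set(relevant_ids)) / len(relevant_ids)


def reciprocal_rank(ranked_ids, relevant_ids):
    """1 / rank of the first relevant document, or 0.0 if none was retrieved."""
    relevant = set(relevant_ids)
    for position, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            return 1.0 / position
    return 0.0


def evaluate(runs, k=3):
    """runs: list of (ranked_ids, relevant_ids) pairs, one pair per query."""
    n = len(runs)
    mean_recall = sum(recall_at_k(r, rel, k) for r, rel in runs) / n
    mrr = sum(reciprocal_rank(r, rel) for r, rel in runs) / n
    return {"recall@k": mean_recall, "mrr": mrr}


# Hypothetical labeled queries: (retriever output, ground-truth relevant docs).
runs = [
    (["d3", "d1", "d7"], ["d1"]),        # relevant doc found at rank 2
    (["d2", "d9", "d4"], ["d4", "d8"]),  # one of two relevant docs, at rank 3
    (["d5", "d6", "d0"], ["d1"]),        # complete miss
]
metrics = evaluate(runs, k=3)
print(metrics)
```

Running the same harness on a fixed query set after every change keeps the comparison between retrieval fixes and model tuning honest.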

Related Interview Questions

  • Predicting the Next Elevator Call Location - Amazon (medium)
  • Explain Transformer and MoE Fundamentals - Amazon (medium)
  • Explain Core ML Interview Concepts - Amazon (hard)
  • Evaluate NLP Classification Models - Amazon (easy)
  • Explain overfitting, regularization, and LLM techniques - Amazon (medium)
Amazon
Aug 4, 2025, 10:55 AM
Data Scientist
Technical Screen
Machine Learning

Scenario

You are building a client-facing chatbot that must answer questions grounded in the client's proprietary documents. You must choose how to imbue the system with domain knowledge and ensure high-quality retrieval.

Tasks

  1. Decide between fine-tuning a language model vs. Retrieval-Augmented Generation (RAG). Compare on cost, data needs, latency, maintainability, and risk.
  2. List and compare fine-tuning approaches: full model fine-tuning, instruction tuning (SFT), LoRA/QLoRA adapters, and embedding-model fine-tuning. Explain when each is appropriate.
  3. Explain LoRA's mechanism mathematically and the advantages it offers at inference time.
  4. If retrieved documents show low relevance, propose concrete steps to improve retrieval quality.
  5. The embedding model is the bottleneck. Describe how you would fine-tune it: data requirements, training objective, negatives, and evaluation.
  6. Propose a high-level architecture for a chatbot that must answer across multiple knowledge domains (e.g., product docs, policies, tickets), including routing and evaluation.
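As a starting point for task 3, LoRA keeps the pretrained weight W frozen and learns a low-rank update, so a layer computes h = W x + (alpha / r) B A x, where B is d_out × r, A is r × d_in, and r is much smaller than either dimension. The sketch below is illustrative only: the matrices are tiny made-up values standing in for a trained adapter (in practice B is typically initialized to zero), and the helper names are hypothetical. It checks numerically that the adapter path equals a single merged matrix W + (alpha / r) B A, which is why a merged LoRA layer adds no extra inference latency.

```python
def matvec(m, v):
    """Matrix-vector product for list-of-lists matrices."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


def matmul(a, b):
    """Matrix-matrix product for list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def lora_trainable_params(d_out, d_in, r):
    """Parameters in A (r x d_in) plus B (d_out x r)."""
    return r * (d_in + d_out)


r, alpha = 2, 4
scale = alpha / r  # LoRA scaling factor alpha / r

W = [[1, 0, 2, -1], [0, 1, 0, 3], [2, -2, 1, 0]]  # frozen weight, 3 x 4
B = [[1, 0], [0, 1], [1, 1]]                      # adapter, 3 x r (made-up values)
A = [[1, 2, 0, 0], [0, 0, 1, -1]]                 # adapter, r x 4 (made-up values)
x = [1, 2, 3, 4]

# Adapter path: base output plus the scaled low-rank correction.
base = matvec(W, x)
delta = matvec(B, matvec(A, x))
h_adapter = [b + scale * d for b, d in zip(base, delta)]

# Merged path: fold the update into W once, then do a single matmul per call.
BA = matmul(B, A)
W_merged = [[w + scale * u for w, u in zip(w_row, u_row)]
            for w_row, u_row in zip(W, BA)]
h_merged = matvec(W_merged, x)

print(h_adapter, h_merged)  # the two paths agree

# Illustrative size comparison for one hypothetical 4096 x 4096 projection.
print(4096 * 4096, lora_trainable_params(4096, 4096, r=8))
```

Keeping adapters unmerged is the other option: it costs a small extra matmul per layer but lets one frozen base model serve several domain-specific adapters, which is relevant to the multi-domain architecture in task 6.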
