Scenario
Design an enterprise GPT-style assistant that allows employees to ask questions about internal company documents (policies, wikis, specs, tickets, PDFs, etc.). The core approach is Retrieval-Augmented Generation (RAG).
The interviewer is primarily focused on machine learning choices and training rather than generic infrastructure.
Requirements
- Propose an end-to-end RAG system and explicitly break it into components:
  - **Retriever** (candidate generation)
  - **Evaluator** (reranker / verifier / filter)
  - **Generator** (LLM answering with citations)
- For each component, discuss:
  - Model architecture choices (and why)
  - Training objective / loss functions (see the contrastive-loss sketch after this list)
  - Optimizer and training recipe (batching, negatives, schedules, mixed precision, etc.)
  - Training data preparation (labeling strategies, weak supervision, synthetic data, privacy constraints)
  - Evaluation strategy (offline metrics + human eval + online/production monitoring; see the recall@k / MRR sketch after this list)
- Address common RAG failure modes (hallucination, stale content, conflicting docs, long documents) and how your modeling/training/evaluation handles them.
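To ground the retriever discussion, here is a minimal sketch of a contrastive (InfoNCE-style) objective with in-batch negatives for a bi-encoder retriever. The embedding shapes, temperature value, and the AdamW/warmup note are illustrative assumptions, not a prescribed recipe.

```python
import torch
import torch.nn.functional as F

def in_batch_contrastive_loss(query_emb: torch.Tensor,
                              passage_emb: torch.Tensor,
                              temperature: float = 0.05) -> torch.Tensor:
    """InfoNCE-style loss for a bi-encoder retriever.

    query_emb:   [B, D] query embeddings from the query encoder.
    passage_emb: [B, D] embeddings of each query's positive passage; every
                 other passage in the batch acts as an in-batch negative.
    """
    q = F.normalize(query_emb, dim=-1)
    p = F.normalize(passage_emb, dim=-1)
    logits = q @ p.T / temperature                        # [B, B] similarity matrix
    targets = torch.arange(q.size(0), device=q.device)    # positives sit on the diagonal
    return F.cross_entropy(logits, targets)

# Toy example with random tensors standing in for encoder outputs.
q = torch.randn(8, 768, requires_grad=True)
p = torch.randn(8, 768, requires_grad=True)
loss = in_batch_contrastive_loss(q, p)
loss.backward()  # in practice this gradient would feed an AdamW step under a warmup/decay schedule
```

Once in-batch negatives stop being informative, harder negatives (e.g., BM25- or nearest-neighbor-mined) are typically mixed into the batch.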
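For the offline half of the evaluation strategy, recall@k and MRR over a labeled set of query-to-relevant-chunk pairs are common first metrics. The sketch below is self-contained; the toy query and chunk IDs are made up for illustration.

```python
from typing import Dict, List, Set

def recall_at_k(ranked: Dict[str, List[str]], relevant: Dict[str, Set[str]], k: int) -> float:
    """Fraction of queries whose top-k retrieved chunks contain at least one relevant chunk."""
    hits = sum(1 for q, docs in ranked.items() if set(docs[:k]) & relevant.get(q, set()))
    return hits / max(len(ranked), 1)

def mean_reciprocal_rank(ranked: Dict[str, List[str]], relevant: Dict[str, Set[str]]) -> float:
    """Average of 1/rank of the first relevant chunk per query (0 if none is retrieved)."""
    total = 0.0
    for q, docs in ranked.items():
        for rank, doc in enumerate(docs, start=1):
            if doc in relevant.get(q, set()):
                total += 1.0 / rank
                break
    return total / max(len(ranked), 1)

# Toy example: the only relevant chunk is ranked second.
ranked = {"what is the vpn policy?": ["chunk_12", "chunk_7", "chunk_3"]}
relevant = {"what is the vpn policy?": {"chunk_7"}}
print(recall_at_k(ranked, relevant, k=3), mean_reciprocal_rank(ranked, relevant))  # 1.0 0.5
```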
Assume the system must respect document-level permissions, and responses should be grounded in retrieved sources with citations.
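As a sketch of how the permission and grounding constraints might be enforced: filter retrieved chunks against the user's groups before anything reaches the reranker or generator, and number the surviving sources so the generator can cite them. The `Chunk` fields, the injected `search` callable, and the prompt wording are assumptions for illustration, not a specific system's API.

```python
from dataclasses import dataclass
from typing import Callable, List, Set

@dataclass
class Chunk:
    doc_id: str
    text: str
    allowed_groups: Set[str]   # document-level ACL propagated to each chunk

def permission_filtered_retrieve(query: str,
                                 user_groups: Set[str],
                                 search: Callable[[str, int], List[Chunk]],
                                 k: int = 5) -> List[Chunk]:
    """Retrieve candidates, then drop any chunk the user is not allowed to see."""
    candidates = search(query, 4 * k)   # over-fetch, since some hits will be filtered out
    visible = [c for c in candidates if c.allowed_groups & user_groups]
    return visible[:k]

def build_grounded_prompt(query: str, chunks: List[Chunk]) -> str:
    """Number each source so the generator can cite it as [1], [2], ..."""
    sources = "\n".join(f"[{i}] ({c.doc_id}) {c.text}" for i, c in enumerate(chunks, start=1))
    return (
        "Answer using only the sources below and cite them as [n]. "
        "If the sources do not contain the answer, say so.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {query}\nAnswer:"
    )
```

Filtering before generation (rather than post-filtering the answer) is the safer default, since anything placed in the context window can surface in the response.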