Design an enterprise RAG assistant for internal docs
Company: OpenAI
Role: Software Engineer
Category: ML System Design
Difficulty: hard
Interview Round: Technical Screen
## Scenario
Design an **enterprise GPT-style assistant** that allows employees to ask questions about **internal company documents** (policies, wikis, specs, tickets, PDFs, etc.). The core approach is **Retrieval-Augmented Generation (RAG)**.
The interviewer is primarily focused on **machine learning choices and training** rather than generic infrastructure.
## Requirements
1. Propose an end-to-end RAG system and explicitly break it into components:
- **Retriever** (candidate generation)
- **Evaluator** (reranker / verifier / filter)
- **Generator** (LLM answering with citations)
2. For each component, discuss:
- Model architecture choices (and why)
- Training objective / loss functions
- Optimizer and training recipe (batching, negatives, schedules, mixed precision, etc.)
- Training data preparation (labeling strategies, weak supervision, synthetic data, privacy constraints)
- Evaluation strategy (offline metrics + human eval + online/production monitoring)
3. Address common RAG failure modes (hallucination, stale content, conflicting docs, long documents) and how your modeling/training/evaluation handles them.
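As one illustration of the "training objective" and "offline metrics" items in requirement 2, the sketch below shows a contrastive loss with in-batch negatives for a dual-encoder retriever, plus recall@k and reciprocal rank. This is only one common choice, not a prescribed answer. The function names, the temperature default, and the plain-list similarity matrix are assumptions made for illustration; a real recipe would compute the matrix from encoder embeddings in a tensor library.

```python
import math
from typing import List, Sequence, Set


def in_batch_contrastive_loss(sim: Sequence[Sequence[float]],
                              temperature: float = 0.05) -> float:
    """Mean cross-entropy over rows of a query-by-passage similarity matrix.

    sim[i][j] is the similarity of query i to passage j. Passage i is the
    positive for query i; every other passage in the batch is a negative.
    The temperature default is an illustrative assumption.
    """
    total = 0.0
    for i, row in enumerate(sim):
        logits = [s / temperature for s in row]
        m = max(logits)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_z - logits[i]  # -log softmax at the positive index
    return total / len(sim)


def recall_at_k(ranked_ids: List[str], relevant_ids: Set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)


def reciprocal_rank(ranked_ids: List[str], relevant_ids: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0
```

Averaging `reciprocal_rank` over a labeled query set gives MRR. Hard negatives (for example, top-ranked but unlabeled passages) would be added as extra columns of the similarity matrix without changing the loss.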
Assume the system must respect **document-level permissions**, and responses should be **grounded** in retrieved sources with citations.
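To make the retriever / evaluator / generator split and the permission and citation constraints concrete, here is a minimal sketch of the control flow. Everything in it is a hypothetical stand-in: `Chunk`, `retrieve`, `evaluate`, `generate`, the token-overlap score, and the `min_score` threshold are assumptions for illustration. A real system would use a trained retriever, a learned reranker or verifier, and an LLM.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set, Tuple


@dataclass(frozen=True)
class Chunk:
    doc_id: str
    text: str
    allowed_groups: FrozenSet[str]  # groups permitted to read the source doc


def tokenize(text: str) -> Set[str]:
    return set(text.lower().split())


def retrieve(query: str, index: List[Chunk], user_groups: Set[str],
             k: int = 5) -> List[Tuple[float, Chunk]]:
    """Candidate generation: filter by permission first, then score.

    The token-overlap score is a toy stand-in for a trained retriever.
    """
    q = tokenize(query)
    visible = [c for c in index if c.allowed_groups & user_groups]
    scored = [(len(q & tokenize(c.text)) / (len(q) or 1), c) for c in visible]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def evaluate(candidates: List[Tuple[float, Chunk]],
             min_score: float = 0.2) -> List[Tuple[float, Chunk]]:
    """Evaluator stand-in: drop weak candidates.

    A real evaluator would rerank or verify with a learned model.
    """
    return [(s, c) for s, c in candidates if s >= min_score]


def generate(query: str, evidence: List[Tuple[float, Chunk]]) -> str:
    """Generator stand-in: refuse without evidence, else quote with citations."""
    if not evidence:
        return "I could not find this in documents you have access to."
    return "\n".join(f"{c.text} [{c.doc_id}]" for _, c in evidence)


def answer(query: str, index: List[Chunk], user_groups: Set[str]) -> str:
    return generate(query, evaluate(retrieve(query, index, user_groups)))
```

Filtering on permissions before scoring means that restricted text never reaches the evaluator or the generator, so it cannot leak into an answer or a citation. Refusing when no evidence survives the evaluator is one simple guard against ungrounded answers.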
Quick Answer: This question tests the ability to design and train a Retrieval-Augmented Generation (RAG) system across its retriever, evaluator (reranker/verifier/filter), and generator components. It emphasizes model architecture choices, training objectives, data preparation under privacy and document-permission constraints, and evaluation of grounded answers with citations. Interviewers use it to probe advanced ML system design and operational skills: mitigating hallucination, handling stale or conflicting sources, and retrieving from long documents. It falls under ML System Design and targets practical, application-focused depth in modeling and training.