Design and optimize a RAG system
Company: OpenAI
Role: Machine Learning Engineer
Category: ML System Design
Difficulty: hard
Interview Round: Onsite
## Scenario
You are building a Retrieval-Augmented Generation (RAG) system for question answering over an internal document corpus.
## Task
Design the end-to-end architecture and describe optimization strategies.
## Requirements
- Ingest documents continuously (new/updated docs).
- Return high-quality answers with citations to the source documents.
- Keep latency low enough for interactive use.
- Handle long documents and heterogeneous formats (PDF/HTML/wiki pages).
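The long-document requirement is usually met by splitting each document into overlapping chunks before embedding. The following is a minimal sketch, not a prescribed solution: the `Chunk` type, the `chunk_words` helper, and the default `size`/`overlap` values are all hypothetical, and it splits on whitespace rather than on model tokens or document structure.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    doc_id: str  # identifier of the source document (kept for citations)
    start: int   # index of the chunk's first word in the source document
    text: str


def chunk_words(doc_id: str, text: str, size: int = 200, overlap: int = 50) -> list[Chunk]:
    """Split text into windows of `size` words that share `overlap` words.

    Assumption: whitespace-separated words stand in for tokens. A real
    system would likely count model tokens and respect headings/paragraphs.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")

    words = text.split()
    step = size - overlap
    chunks: list[Chunk] = []
    for start in range(0, len(words), step):
        window = words[start:start + size]
        chunks.append(Chunk(doc_id, start, " ".join(window)))
        # Stop once the current window reaches the end of the document,
        # so no trailing chunk made only of overlap words is emitted.
        if start + size >= len(words):
            break
    return chunks
```

Keeping `doc_id` and `start` on every chunk is what later lets the generator cite the exact source span; the overlap reduces the chance that an answer is cut in half at a chunk boundary.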
## Deliverables
- Components (ingestion, chunking, embeddings, index, retriever, reranker, generator).
- How you improve relevance and reduce hallucinations.
- Evaluation plan (offline + online) and monitoring for drift.
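For the offline part of the evaluation plan, retrieval quality is often measured separately from generation quality using a labeled set of queries and relevant document IDs. The sketch below shows two common retrieval metrics, recall@k and reciprocal rank (averaged over queries to give MRR). The function names and the toy data are illustrative assumptions, not part of the question.

```python
from __future__ import annotations


def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


def reciprocal_rank(retrieved: list[str], relevant: set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: list[tuple[list[str], set[str]]]) -> float:
    """Average reciprocal rank over (retrieved, relevant) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)
```

Tracking these per index or embedding version gives a baseline to compare against when monitoring for drift; answer-level checks such as citation correctness would need a separate evaluation.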
Quick Answer: This question tests your ability to design and optimize a Retrieval-Augmented Generation (RAG) system end to end: ingestion, chunking, embeddings, indexing, retrieval, reranking, generation, and evaluation.
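One way to improve relevance at the retrieval stage is to combine a lexical ranking (for example, BM25) with a dense-embedding ranking before reranking. The sketch below uses reciprocal rank fusion (RRF) as one possible fusion method; the function name, the input rankings, and the constant `k = 60` (a commonly used default) are assumptions for illustration.

```python
from __future__ import annotations

from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Ties are broken by document ID so the output
    is deterministic.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical outputs of a lexical retriever and a dense retriever.
lexical = ["a", "b", "c"]
dense = ["b", "c", "a"]
fused = reciprocal_rank_fusion([lexical, dense])
```

RRF only needs ranks, not comparable scores, which makes it easy to apply across retrievers whose score scales differ. The fused top results would then typically go to a reranker before generation.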