Scenario
You are asked to design a Retrieval-Augmented Generation (RAG) system that answers user questions using a private corpus (e.g., internal docs, PDFs, knowledge base articles). The interviewer wants you to walk through each component and explain how you would evaluate each step.
Requirements
- Support natural-language Q&A over private documents.
- Handle frequent document updates (new/changed docs).
- Provide citations or traceability to sources.
- Low latency for interactive use.
- Reduce hallucinations and ensure answers are grounded in retrieved context.
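One way to make the citation and grounding requirements concrete is to number the retrieved chunks in the prompt and to refuse to generate when nothing relevant was retrieved. The sketch below is illustrative only: `RetrievedChunk`, `build_prompt`, and the `MIN_SCORE` threshold are hypothetical names and values, and the score is assumed to be a similarity in [0, 1].

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RetrievedChunk:
    doc_id: str   # hypothetical source identifier, e.g. a file path or URL
    text: str     # chunk content returned by the retriever
    score: float  # similarity score, assumed here to lie in [0, 1]


# Illustrative threshold (an assumption); in practice it would be tuned on labeled data.
MIN_SCORE = 0.35


def build_prompt(question: str, chunks: List[RetrievedChunk]) -> Optional[str]:
    """Return a prompt with numbered sources, or None when retrieval is too weak."""
    usable = [c for c in chunks if c.score >= MIN_SCORE]
    if not usable:
        # Caller should fall back, e.g. to an "I could not find this" response.
        return None
    sources = "\n".join(
        f"[{i}] ({c.doc_id}) {c.text}" for i, c in enumerate(usable, start=1)
    )
    return (
        "Answer using only the sources below and cite them as [n]. "
        "If the sources do not contain the answer, say so.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )
```

Because each source keeps its `doc_id` next to its number, a `[n]` marker in the answer can be traced back to a document, and the `None` return gives the caller an explicit hook for fallback behavior.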
What to cover
- End-to-end architecture and data flow.
- Document ingestion and preprocessing (parsing, cleaning, chunking).
- Embedding strategy and indexing (vector DB / hybrid search).
- Retrieval (query understanding, top-k, filters) and optional reranking.
- Prompting/context assembly and generation.
- Safety/guardrails and fallback behavior when retrieval is weak.
- Evaluation plan for:
  - ingestion/chunking quality
  - retrieval quality
  - reranking quality (if used)
  - generation quality and grounding
  - end-to-end user success
- Online monitoring and continuous improvement loop.
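For the retrieval-quality part of the evaluation plan, a common starting point is a small labeled set of queries with known relevant chunk IDs, scored with recall@k and mean reciprocal rank (MRR). The sketch below assumes such a labeled set exists; the function names and the dictionary layout are hypothetical, not part of any particular library.

```python
from typing import Dict, List, Set


def recall_at_k(retrieved: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant chunk IDs that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


def reciprocal_rank(retrieved: List[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    for rank, chunk_id in enumerate(retrieved, start=1):
        if chunk_id in relevant:
            return 1.0 / rank
    return 0.0


def evaluate(
    runs: Dict[str, List[str]],   # query -> ranked chunk IDs from the retriever
    labels: Dict[str, Set[str]],  # query -> chunk IDs judged relevant (assumed labels)
    k: int = 5,
) -> Dict[str, float]:
    """Average recall@k and MRR over all labeled queries."""
    queries = list(labels)
    if not queries:
        return {"recall_at_k": 0.0, "mrr": 0.0}
    recalls = [recall_at_k(runs.get(q, []), labels[q], k) for q in queries]
    rrs = [reciprocal_rank(runs.get(q, []), labels[q]) for q in queries]
    return {
        "recall_at_k": sum(recalls) / len(queries),
        "mrr": sum(rrs) / len(queries),
    }
```

The same two functions can be run on the ranking before and after a reranker to see whether reranking moves relevant chunks up; generation quality and end-to-end success need separate measures, such as human or model-based grading of groundedness.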