This question evaluates a candidate's ability to design an end-to-end Retrieval-Augmented Generation (RAG) system for enterprise document QA. It covers architecture choices for ingestion and chunking, embedding strategies, vector indexing and approximate nearest neighbor (ANN) search, hybrid retrieval and re-ranking, prompt orchestration, safety/PII controls, multilingual support, scalability, observability, API design, and rollout planning. It is commonly asked in ML System Design and information retrieval/NLP interviews to assess how well a candidate reasons about trade-offs between recall, latency, cost, and compliance, and it tests both high-level architectural judgment (conceptual understanding) and implementation-level production considerations (practical application).
You are designing a Retrieval-Augmented Generation (RAG) system to answer questions over large, evolving enterprise document corpora (policies, specs, tickets, wikis, PDFs, spreadsheets, code snippets). The system must support access controls, multilingual content, and strong safety/PII guarantees.
Specify the end-to-end architecture and key design choices for:
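As a rough illustration of the retrieval core such a system needs, the sketch below chunks documents, attaches access-control groups to every chunk, and filters by the caller's groups before ranking. Everything in it is an assumption made for illustration: the names `Chunk`, `chunk_text`, `embed`, `build_index`, and `retrieve` are hypothetical, the bag-of-words `embed` is a toy stand-in for a real embedding model, and the linear scan stands in for an ANN index. It is not a production design.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    doc_id: str
    text: str
    allowed_groups: frozenset  # hypothetical ACL: groups permitted to read this chunk


def chunk_text(text, max_words=40, overlap=10):
    """Split text into overlapping word windows (assumes overlap < max_words)."""
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text):
    """Toy stand-in for an embedding model: lowercase bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_index(docs):
    """Chunk each (doc_id, text, groups) triple and store (Chunk, vector) pairs."""
    index = []
    for doc_id, text, groups in docs:
        for piece in chunk_text(text):
            index.append((Chunk(doc_id, piece, frozenset(groups)), embed(piece)))
    return index


def retrieve(query, index, user_groups, k=3):
    """Return up to k readable chunks, ranked by similarity to the query.

    The ACL filter runs before scoring, so chunks the user may not read
    never reach the ranking step or the prompt.
    """
    q = embed(query)
    visible = [(c, v) for c, v in index if c.allowed_groups & user_groups]
    ranked = sorted(visible, key=lambda cv: cosine(q, cv[1]), reverse=True)
    return [c for c, v in ranked[:k] if cosine(q, v) > 0]


# Made-up example corpus, for illustration only.
docs = [
    ("hr-policy",
     "Employees accrue vacation days monthly and may carry over five days.",
     {"hr", "all-staff"}),
    ("sec-runbook",
     "Rotate production database credentials every ninety days.",
     {"security"}),
]
index = build_index(docs)
hits = retrieve("how many vacation days carry over", index, frozenset({"all-staff"}))
```

Filtering on access groups before scoring, rather than after, is one way to keep restricted text out of the candidate set entirely; a real deployment would more likely push that filter into the vector store's metadata query.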