This question evaluates an engineer's ability to design an end-to-end Retrieval-Augmented Generation (RAG) search system for enterprise settings. It tests competencies in ML system design, information retrieval (dense, sparse, and hybrid), vector and sparse indexing, data ingestion and enrichment, LLM selection and grounding, security and compliance, scalability, and observability. Interviewers commonly use it to assess architectural reasoning and trade-off analysis for production ML services, examining how candidates balance latency, freshness, multi-tenant isolation, and operational concerns. It belongs to the ML System Design domain and requires both high-level conceptual understanding and practical, application-level design detail.
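Strong answers usually make the hybrid-retrieval competency concrete. As a minimal sketch (not part of the original question), dense and sparse result lists can be merged with Reciprocal Rank Fusion; the function below assumes the caller already has ranked lists of document IDs from a vector search and a keyword search, and k=60 is a conventional default for the fusion constant.

```python
from collections import defaultdict
from typing import Dict, List

def rrf_fuse(ranked_lists: List[List[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of document IDs with Reciprocal Rank Fusion."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every document it returned.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)

# Example with two hypothetical result lists (dense first, then sparse):
print(rrf_fuse([["d3", "d1", "d7"], ["d1", "d9", "d3"]]))  # -> ['d1', 'd3', 'd9', 'd7']
```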
You are tasked with designing a Retrieval-Augmented Generation (RAG) search system for enterprise users. The system should allow employees to ask natural-language questions and receive grounded, cited answers using their organization’s private documents and tools.
Assume a multi-tenant, cloud-hosted setup with strict security and compliance requirements. Content spans PDFs, Office docs, wikis, tickets, chats, and databases. Scale assumptions (adjust as needed):
Design the system and cover the following:
Include key trade-offs and minimal diagrams-in-words (a clear component-by-component description is sufficient).
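One way to keep the component-by-component description concrete is a short query-path skeleton. The sketch below is illustrative only: every component (embed, dense_search, sparse_search, rerank, generate) is a hypothetical callable injected by the caller rather than a specific library API, and tenant/ACL scoping is shown as keyword arguments on the retrieval calls.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Chunk:
    doc_id: str
    text: str
    source_url: str

def answer_query(
    query: str,
    tenant_id: str,
    user_groups: List[str],
    *,
    embed: Callable[[str], List[float]],
    dense_search: Callable[..., List[Chunk]],
    sparse_search: Callable[..., List[Chunk]],
    rerank: Callable[[str, List[Chunk], int], List[Chunk]],
    generate: Callable[[str], str],
) -> dict:
    # 1. Hybrid retrieval, scoped to the tenant and the caller's ACL groups.
    dense_hits = dense_search(embed(query), tenant=tenant_id, acl=user_groups, top_k=50)
    sparse_hits = sparse_search(query, tenant=tenant_id, acl=user_groups, top_k=50)

    # 2. Merge and rerank (e.g. rank fusion followed by a cross-encoder) down to a few chunks.
    context = rerank(query, dense_hits + sparse_hits, 5)

    # 3. Grounded generation with explicit citation markers.
    prompt = (
        "Answer using only the sources below and cite them as [n].\n\n"
        + "\n\n".join(f"[{i + 1}] {c.text}" for i, c in enumerate(context))
        + f"\n\nQuestion: {query}"
    )

    # 4. Return the draft answer plus citations so the UI can link back to sources.
    return {"answer": generate(prompt), "citations": [c.source_url for c in context]}
```

The main design point the sketch encodes is that tenant isolation and permission filtering happen at retrieval time, not after generation, which is the usual expectation under strict security and compliance requirements.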