This question evaluates system design and machine learning engineering skills for building a multimodal Retrieval-Augmented Generation (RAG) assistant. It covers data ingestion and preprocessing across modalities, indexing and retrieval strategies, embeddings and re-ranking, grounding and prompting with citations, and evaluation and failure-mode mitigation. It is commonly asked to gauge an engineer's ability to architect scalable, grounded question-answering pipelines that balance retrieval quality, latency, and cost. It sits in the System Design domain and tests both high-level architectural reasoning and practical implementation considerations.
Design a Retrieval-Augmented Generation (RAG) system that can answer user questions using an internal knowledge base containing multiple modalities (at least text and images; optionally PDFs/tables).
You may make reasonable assumptions and state them clearly.
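To make the scope concrete, the retrieve-then-ground core of such a system could be sketched as below. This is only an illustrative sketch under stated assumptions, not a reference answer: it assumes images have already been converted to caption or OCR text at ingestion time, it uses a toy bag-of-words vector where a real design would use a learned text or multimodal embedding model, and every name in it (`Chunk`, `embed`, `retrieve`, `build_prompt`, the sample file names) is hypothetical.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Chunk:
    doc_id: str    # hypothetical source identifier, reused as the citation label
    modality: str  # "text" or "image"
    content: str   # raw text, or a caption/OCR string standing in for an image


def embed(text: str) -> Counter:
    # Toy bag-of-words vector (assumption); swap in a real embedding model.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: List[Chunk], k: int = 2) -> List[Tuple[float, Chunk]]:
    # Score every chunk against the query and keep the k best.
    q = embed(query)
    scored = [(cosine(q, embed(c.content)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(query: str, hits: List[Tuple[float, Chunk]]) -> str:
    # Number each retrieved chunk so the generator can cite it as [n].
    context = "\n".join(
        f"[{i}] ({c.modality}, {c.doc_id}) {c.content}"
        for i, (_, c) in enumerate(hits, start=1)
    )
    return (
        "Answer using only the sources below and cite them as [n].\n"
        f"{context}\n"
        f"Question: {query}"
    )


knowledge_base = [
    Chunk("handbook.md", "text",
          "reset the router by holding the power button for ten seconds"),
    Chunk("diagram.png", "image",
          "diagram of router back panel showing power button and reset pin"),
    Chunk("policy.pdf", "text",
          "vacation requests must be submitted two weeks in advance"),
]

hits = retrieve("how do I reset the router", knowledge_base, k=2)
prompt = build_prompt("how do I reset the router", hits)
```

The resulting `prompt` string would then be passed to a language model. A fuller design would typically replace the linear scan with an approximate nearest-neighbor index and add a re-ranking stage, which are among the trade-offs the question asks candidates to discuss.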