Design a Retrieval-Augmented Generation (RAG) architecture to power a Q&A chatbot over vehicle documentation.
Document types & characteristics
- Sources: engine specification manuals, warranty documents, repair documents, vehicle datasets.
- Formats: PDFs (1 to 1,000 pages), CSVs (potentially large), HTML, Markdown.
- Workload: read-heavy.
- Updates: documents are static (assume no edits after publishing, but new document versions may be added).
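Because published documents are never edited and only new versions are added, the index can be treated as append-only. The following is a minimal sketch of how query time might default to the newest version of each document. The `DocVersion` record, its fields, and the `latest_versions` helper are hypothetical names introduced for illustration; the prompt does not prescribe them.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DocVersion:
    doc_id: str   # assumed: stable identifier shared by all versions of a document
    version: int  # assumed: increases with each newly published version
    uri: str      # assumed: location of the immutable source file


def latest_versions(versions: list[DocVersion]) -> dict[str, DocVersion]:
    """Return the newest version per doc_id; older versions are left untouched."""
    latest: dict[str, DocVersion] = {}
    for v in versions:
        current = latest.get(v.doc_id)
        if current is None or v.version > current.version:
            latest[v.doc_id] = v
    return latest
```

Keeping older versions indexed (rather than deleting them) is one possible choice; it lets a user ask about a specific earlier version while default queries filter to the latest one.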
Core requirements
- Users ask natural-language questions (e.g., “What torque spec for …?”, “Is this covered under warranty?”).
- System retrieves relevant passages/snippets and generates an answer grounded in the documents.
- Provide citations (doc + page/section/row reference where possible).
- Handle both unstructured text (PDF manuals) and structured/semi-structured data (CSV datasets).
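The citation requirement ("doc + page/section/row reference where possible") suggests that each retrieved snippet carries optional locators, and that the citation is built from whichever ones exist. A minimal sketch, assuming a hypothetical `SourceRef` record and `format_citation` helper (both invented here for illustration):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SourceRef:
    doc_title: str
    page: Optional[int] = None     # typically available for PDFs
    section: Optional[str] = None  # typically available for HTML/Markdown headings
    row: Optional[int] = None      # typically available for CSV rows


def format_citation(ref: SourceRef) -> str:
    """Build a citation string from whichever locators are present."""
    parts = [ref.doc_title]
    if ref.page is not None:
        parts.append(f"p. {ref.page}")
    if ref.section:
        parts.append(f'section "{ref.section}"')
    if ref.row is not None:
        parts.append(f"row {ref.row}")
    return ", ".join(parts)
```

With a made-up title, `format_citation(SourceRef("Warranty Guide", page=12))` yields `Warranty Guide, p. 12`; a reference with no locators falls back to the document title alone.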
Non-functional requirements (discuss and propose targets)
- Latency and throughput goals for chat requests.
- Quality: relevance, correctness, citation accuracy.
- Security: document access control if documents differ by vehicle model, region, or user entitlements.
- Observability: tracing, retrieval diagnostics.
- Scalability: growing corpus size.
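For the security requirement, one common approach is to filter by entitlement before ranking, so that restricted chunks never reach the prompt. The sketch below illustrates that ordering with brute-force cosine similarity in plain Python. The chunk dictionary layout (`id`, `acl`, `vector`) and the entitlement strings are assumptions made for this example; a production system would more likely push the filter into the vector store as a metadata filter, where the chosen store supports one.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec: list[float], chunks: list[dict],
             user_entitlements: set[str], k: int = 3) -> list[dict]:
    """Keep only chunks the user may see, then return the k most similar."""
    # A chunk is visible if its ACL shares at least one tag with the user.
    allowed = [c for c in chunks if c["acl"] & user_entitlements]
    ranked = sorted(allowed, key=lambda c: cosine(query_vec, c["vector"]), reverse=True)
    return ranked[:k]
```

Filtering first (rather than ranking first and dropping forbidden hits afterwards) avoids returning fewer than `k` results when many top matches are restricted, and keeps restricted text out of logs and traces.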
Deliverables
- High-level architecture (ingestion/indexing + query-time flow).
- Data model for documents/chunks/metadata.
- Vector database choice/strategy and retrieval approach.
- How to process PDFs/CSVs for chunking and citations.
- Key tradeoffs and failure modes.
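As one illustration of the CSV chunking and citation deliverable, the sketch below splits CSV text into fixed-size row groups, renders each row as "column: value" pairs so the chunk text stays self-describing, and records the row range needed for a citation. The function names, the chunk dictionary fields, and the default `rows_per_chunk` are assumptions for this example, not part of the prompt; it uses only the standard-library `csv` module.

```python
import csv
import io


def _make_chunk(doc_id: str, header: list[str], rows: list[list[str]],
                row_start: int, row_end: int) -> dict:
    # Pair every value with its column name so the chunk is readable without the header.
    lines = ["; ".join(f"{h}: {v}" for h, v in zip(header, row)) for row in rows]
    return {"doc_id": doc_id, "row_start": row_start,
            "row_end": row_end, "text": "\n".join(lines)}


def chunk_csv(text: str, doc_id: str, rows_per_chunk: int = 50) -> list[dict]:
    """Split CSV text into chunks that carry 1-based data-row ranges for citations."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    if header is None:  # empty input: nothing to index
        return []
    chunks: list[dict] = []
    batch: list[list[str]] = []
    start = 1
    for row_number, row in enumerate(reader, start=1):  # header is not counted
        batch.append(row)
        if len(batch) == rows_per_chunk:
            chunks.append(_make_chunk(doc_id, header, batch, start, row_number))
            batch, start = [], row_number + 1
    if batch:  # flush the final partial group
        chunks.append(_make_chunk(doc_id, header, batch, start, start + len(batch) - 1))
    return chunks
```

For large CSVs, the same loop could read from a file handle instead of an in-memory string. Whether to embed row groups at all, or to route structured questions to a query over the table instead, is one of the tradeoffs the design should discuss.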