
Design a RAG system for vehicle documents

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a candidate's ability to design a Retrieval-Augmented Generation system, testing competencies in information retrieval, vector search, document ingestion and chunking, metadata modeling, citation handling, security, observability, and scalability within the system design and NLP/information-retrieval domain.

Company: Sonatus

Role: Software Engineer

Category: System Design

Difficulty: hard

Interview Round: Technical Screen

Feb 12, 2026, 12:00 AM

Design a Retrieval-Augmented Generation (RAG) architecture to power a Q&A chatbot over vehicle documentation.

Document types & characteristics

  • Sources: engine specification manuals, warranty documents, repair documents, vehicle datasets.
  • Formats: PDFs (1 to 1,000 pages), CSVs (potentially large), HTML, Markdown.
  • Workload: read-heavy.
  • Updates: documents are static (assume no edits after publishing, but new document versions may be added).

Core requirements

  1. Users ask natural-language questions (e.g., “What torque spec for …?”, “Is this covered under warranty?”).
  2. System retrieves relevant passages/snippets and generates an answer grounded in the documents.
  3. Provide citations (doc + page/section/row reference where possible).
  4. Handle both unstructured text (PDF manuals) and structured/semi-structured data (CSV datasets).
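
Requirements 3 and 4 are easier to meet if every chunk carries its own locator from the moment it is created. The following is a minimal sketch of that idea, not a prescribed solution: it assumes PDF text has already been extracted into one string per page, and the names `Chunk`, `chunk_pdf_pages`, `chunk_csv_rows`, and `format_citation`, along with the `max_chars` limit, are hypothetical.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    doc_id: str
    locator: str  # hypothetical citation anchor, e.g. "page 12" or "row 3"
    text: str


def chunk_pdf_pages(doc_id, pages, max_chars=800):
    """Split pre-extracted page texts into fixed-size chunks.

    Chunks never cross a page boundary, so each one maps to exactly one page.
    """
    chunks = []
    for page_no, page_text in enumerate(pages, start=1):
        for start in range(0, len(page_text), max_chars):
            piece = page_text[start:start + max_chars].strip()
            if piece:
                chunks.append(Chunk(doc_id, f"page {page_no}", piece))
    return chunks


def chunk_csv_rows(doc_id, csv_text):
    """Turn each data row into one 'column: value' chunk, keeping its row number."""
    reader = csv.DictReader(io.StringIO(csv_text))
    chunks = []
    for row_no, row in enumerate(reader, start=1):
        text = "; ".join(f"{col}: {val}" for col, val in row.items())
        chunks.append(Chunk(doc_id, f"row {row_no}", text))
    return chunks


def format_citation(chunk):
    """Render the doc + page/row reference that accompanies an answer."""
    return f"[{chunk.doc_id}, {chunk.locator}]"
```

Fixed-size splitting is only the simplest choice here; a real design would likely split on section headings or tables, and might group CSV rows or route them to a structured query path instead of embedding them one by one.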

Non-functional requirements (discuss and propose targets)

  • Latency and throughput goals for chat requests.
  • Quality: relevance, correctness, citation accuracy.
  • Security: document access control if documents differ by vehicle model, region, or user entitlements.
  • Observability: tracing, retrieval diagnostics.
  • Scalability: growing corpus size.
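
One way to reason about the security bullet is to apply entitlement filters before similarity ranking, so that a restricted chunk can never reach the prompt. The sketch below illustrates this with a brute-force cosine search over an in-memory list; it is an assumption-laden toy, not a vector database API. `IndexedChunk`, `retrieve`, and the `model`/`region` metadata fields are hypothetical names.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexedChunk:
    chunk_id: str
    embedding: tuple  # placeholder vector; a real system would use a model's output
    model: str        # hypothetical metadata: vehicle model the chunk applies to
    region: str       # hypothetical metadata: region the chunk applies to


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, index, allowed_models, allowed_regions, k=3):
    """Filter by entitlements first, then rank the visible chunks by similarity."""
    visible = [
        c for c in index
        if c.model in allowed_models and c.region in allowed_regions
    ]
    scored = sorted(
        ((cosine(query_vec, c.embedding), c.chunk_id) for c in visible),
        reverse=True,
    )
    return scored[:k]
```

Returning the scores alongside the chunk ids also gives a natural hook for the observability bullet: the same list can be logged per request as a retrieval diagnostic.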

Deliverables

  • High-level architecture (ingestion/indexing + query-time flow).
  • Data model for documents/chunks/metadata.
  • Vector database choice/strategy and retrieval approach.
  • How to process PDFs/CSVs for chunking and citations.
  • Key tradeoffs and failure modes.
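
For the data-model deliverable, the "static documents, new versions may be added" constraint suggests treating each published version as an immutable record and resolving the latest version at query time. The sketch below shows one possible shape under that assumption; every class, field, and function name (`DocumentVersion`, `ChunkRecord`, `latest_versions`, `live_chunks`) is hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DocumentVersion:
    doc_id: str
    version: int        # increases with each newly published version
    source_format: str  # "pdf", "csv", "html", or "markdown"
    vehicle_model: str
    region: str


@dataclass(frozen=True)
class ChunkRecord:
    chunk_id: str
    doc_id: str
    version: int  # version of the document this chunk was cut from
    locator: str  # page/section/row reference used for citations
    text: str


def latest_versions(docs):
    """Map each doc_id to its highest-numbered published version."""
    latest = {}
    for d in docs:
        if d.doc_id not in latest or d.version > latest[d.doc_id].version:
            latest[d.doc_id] = d
    return latest


def live_chunks(chunks, docs):
    """Keep only chunks that belong to the latest version of their document."""
    latest = latest_versions(docs)
    return [
        c for c in chunks
        if c.doc_id in latest and c.version == latest[c.doc_id].version
    ]
```

Because versions are never edited in place, older chunks can stay in the index for audit or rollback and simply be excluded by a version filter, which is one tradeoff worth discussing against deleting superseded chunks outright.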
