Explain Chunking for Financial RAG
Company: Morgan Stanley
Role: Data Scientist
Category: Machine Learning
Difficulty: medium
Interview Round: HR Screen
Suppose you are building a retrieval-augmented generation (RAG) assistant over long financial research reports, filings, and policy documents. Explain:
1. What **chunking** is and why it matters.
2. The difference between **fixed-size chunking**, **semantic chunking**, and **parent-child chunking**.
3. How parent-child chunking works in practice, where retrieval happens on smaller child chunks but the LLM receives a larger parent span.
4. When parent-child chunking is preferable to simpler chunking strategies.
5. The tradeoffs among retrieval recall, retrieval precision, latency, token cost, answer faithfulness, and citation quality.
6. How you would evaluate the design both offline and online.
Assume the corpus contains long narrative sections, tables, and hierarchical headings, and users ask both broad summary questions and precise citation-heavy questions.
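The parent-child idea in point 3 can be sketched in a few lines. This is an illustrative toy, not a production retriever: the function names are invented, chunking is by word count, and word-overlap scoring stands in for embedding similarity. The key mechanic is that each small child chunk keeps a pointer to its parent span, children are scored at query time, and the deduplicated parent spans are what the LLM receives.

```python
# Toy sketch of parent-child chunking (names and scoring are illustrative).
# Parents are large spans (e.g., a report section); children are small,
# retrieval-friendly slices of each parent.

def make_chunks(sections, child_size=40):
    """Split each parent section into fixed-size child chunks (by words).

    Returns a list of (child_text, parent_id) pairs so every child
    remembers which parent span it came from.
    """
    children = []
    for parent_id, text in enumerate(sections):
        words = text.split()
        for i in range(0, len(words), child_size):
            children.append((" ".join(words[i:i + child_size]), parent_id))
    return children

def retrieve_parents(query, sections, children, k=2):
    """Score children by word overlap with the query (a stand-in for
    embedding similarity), then return the parents of the top-k children,
    deduplicated in rank order. The LLM context is built from these
    larger parent spans, not from the small child chunks themselves.
    """
    q = set(query.lower().split())
    scored = sorted(
        children,
        key=lambda child: len(q & set(child[0].lower().split())),
        reverse=True,
    )
    parent_ids = []
    for _, pid in scored[:k]:
        if pid not in parent_ids:
            parent_ids.append(pid)
    return [sections[pid] for pid in parent_ids]
```

In a real system the children would be embedded and stored in a vector index, and the parent could be the enclosing heading section of a filing, so that tables and surrounding narrative that a 40-word child chunk would lose are still visible to the model.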
Quick Answer: This question tests understanding of chunking strategies for retrieval-augmented generation over long, domain-specific financial documents. The key competencies are information retrieval, document representation, trade-off analysis (retrieval recall, precision, latency, token cost, answer faithfulness, citation quality), and evaluation methodology.
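For the offline-evaluation competency, a minimal sketch of the standard metrics looks like the following. The data shape is an assumption: each evaluation item pairs a query with labeler-judged relevant chunk ids, and the retriever's ranked output is compared against that set.

```python
# Hedged sketch of offline retrieval metrics (data layout is hypothetical).
# retrieved_ids: ranked list of chunk ids the retriever returned.
# relevant_ids: chunk ids a human labeler judged relevant for the query.

def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the relevant chunks found in the top-k results."""
    hits = len(set(retrieved_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids) if relevant_ids else 0.0

def precision_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the top-k results that are actually relevant."""
    hits = len(set(retrieved_ids[:k]) & set(relevant_ids))
    return hits / k
```

Online, these would be complemented by end-to-end signals (answer faithfulness judgments, citation click-through, latency percentiles) rather than retrieval metrics alone.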