Design a chunking strategy for RAG

This is an ML System Design interview question from Zillow for Machine Learning Engineer roles.

Question

You are building a Retrieval-Augmented Generation (RAG) system that uses an LLM plus a vector database. Before creating embeddings and indexing documents, you must split long documents into chunks.

Describe how you would design the chunking strategy. In your answer, discuss:

How you would choose chunk size and overlap and the trade-offs involved (recall vs. context size, latency, etc.).
How you would use document structure (e.g., headings, paragraphs, sections) vs. naive fixed-length splits.
When you might use more advanced methods like semantic chunking or dynamic chunk sizes.
How you would evaluate and iterate on your chunking strategy in a real system.
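
For reference on the first two discussion points, here is a minimal Python sketch contrasting a naive fixed-length split with overlap against a paragraph-aware split. It is an illustration under stated assumptions, not a prescribed solution: the function names fixed_chunks and paragraph_chunks are hypothetical, "tokens" are approximated by whitespace-separated words (a real system would more likely count tokens with the embedding model's tokenizer), and paragraphs are assumed to be separated by blank lines.

```python
import re
from typing import List


def fixed_chunks(tokens: List[str], size: int, overlap: int) -> List[List[str]]:
    """Naive fixed-length split (hypothetical helper).

    Produces windows of up to `size` tokens; each window repeats the last
    `overlap` tokens of the previous one so that text cut at a boundary
    still appears intact in at least one chunk.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last window already reaches the end of the document
    return chunks


def paragraph_chunks(text: str, max_tokens: int) -> List[str]:
    """Structure-aware split (hypothetical helper).

    Packs whole paragraphs into a chunk until adding the next paragraph
    would exceed `max_tokens` whitespace-separated words. A paragraph that
    is longer than `max_tokens` on its own is kept as one oversized chunk;
    this sketch does not split it further.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks, current, count = [], [], 0
    for paragraph in paragraphs:
        n = len(paragraph.split())
        if current and count + n > max_tokens:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(paragraph)
        count += n
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

The trade-off named in the question shows up directly in the parameters: a larger size or overlap means fewer boundary cuts but more tokens stored and passed to the LLM per retrieved chunk, while the paragraph-aware variant gives up uniform chunk length in exchange for chunks that end at natural boundaries.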
