Build a Minimal RAG Tool Using the Mistral API
Context
You have an API token and need to implement a small retrieval-augmented generation (RAG) tool in Python that can answer questions over a local folder of Markdown and PDF files using the Mistral API. The tool should support both a CLI and an HTTP server.
Requirements
- Implement document ingestion, chunking, and an in-memory vector index for retrieval.
- Provide a CLI with commands:
  - `index <path>`
  - `ask <question>`
  - `serve` (HTTP server exposing a `/chat` endpoint)
- Call chat/completions with streaming; include the top-k retrieved chunks in the prompt and return source citations.
- Add exponential backoff and retries for HTTP 429 responses and timeouts, plus structured error handling.
- Configure the API key, model names, and ports via environment variables.
- Include a README with setup steps and minimal tests.
- Briefly explain your retrieval algorithm choices and a quick way to evaluate answer quality.
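A minimal sketch of the chunking and in-memory index requirement, to make the expected shape concrete. The hashed bag-of-words `embed` function here is a stand-in for real embeddings (e.g. from an embeddings API); the class and function names are illustrative, not prescribed:

```python
import hashlib
import math
import re


def chunk_text(text: str, max_words: int = 200, overlap: int = 40) -> list[str]:
    """Split text into overlapping word windows."""
    words = text.split()
    chunks = []
    step = max_words - overlap
    for start in range(0, max(len(words), 1), step):
        chunk = " ".join(words[start:start + max_words])
        if chunk:
            chunks.append(chunk)
        if start + max_words >= len(words):
            break
    return chunks


def embed(text: str, dim: int = 256) -> list[float]:
    """Toy hashed bag-of-words vector, L2-normalized (placeholder for real embeddings)."""
    vec = [0.0] * dim
    for token in re.findall(r"\w+", text.lower()):
        idx = int(hashlib.md5(token.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class InMemoryIndex:
    """Flat list of (source, chunk, vector); search is brute-force cosine similarity."""

    def __init__(self) -> None:
        self.entries: list[tuple[str, str, list[float]]] = []

    def add(self, source: str, text: str) -> None:
        for chunk in chunk_text(text):
            self.entries.append((source, chunk, embed(chunk)))

    def search(self, query: str, k: int = 3) -> list[tuple[float, str, str]]:
        qv = embed(query)
        scored = [
            (sum(a * b for a, b in zip(qv, v)), source, chunk)
            for source, chunk, v in self.entries
        ]
        scored.sort(key=lambda t: t[0], reverse=True)
        return scored[:k]
```

Brute-force cosine over normalized vectors is fine at this scale; swapping in a vector library only pays off once the corpus is large.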
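The three CLI commands map naturally onto `argparse` subparsers. A sketch (the program name `ragtool` and the `--port` flag are assumptions, not part of the spec):

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="ragtool")
    sub = parser.add_subparsers(dest="command", required=True)

    p_index = sub.add_parser("index", help="ingest a folder of .md/.pdf files")
    p_index.add_argument("path")

    p_ask = sub.add_parser("ask", help="answer one question and exit")
    p_ask.add_argument("question")

    p_serve = sub.add_parser("serve", help="start the HTTP server with a /chat endpoint")
    p_serve.add_argument("--port", type=int, default=8000)

    return parser
```

Usage: `ragtool index ./docs`, `ragtool ask "What is X?"`, `ragtool serve --port 8000`.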
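One way to satisfy the backoff requirement is a small retry wrapper; this sketch assumes the HTTP layer raises a custom `RateLimited` exception on 429 and `TimeoutError` on timeouts (both names are illustrative). The injectable `sleep` makes the wrapper testable without real delays:

```python
import random
import time


class RateLimited(Exception):
    """Raised by the HTTP layer when the API returns 429 (assumed convention)."""


def with_backoff(fn, max_retries: int = 5, base_delay: float = 0.5, sleep=time.sleep):
    """Retry fn on RateLimited/TimeoutError with exponential backoff plus jitter."""
    def wrapper(*args, **kwargs):
        for attempt in range(max_retries):
            try:
                return fn(*args, **kwargs)
            except (RateLimited, TimeoutError):
                if attempt == max_retries - 1:
                    raise  # exhausted: surface the error to the caller
                delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
                sleep(delay)
    return wrapper
```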
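Environment-based configuration can be centralized in one settings object read at startup. The variable names (`MISTRAL_API_KEY`, `RAG_CHAT_MODEL`, `RAG_PORT`) and the default model are assumptions for illustration; a minimal sketch:

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    api_key: str
    chat_model: str
    port: int


def load_settings(env=os.environ) -> Settings:
    """Read config from environment variables; KeyError if the API key is missing."""
    return Settings(
        api_key=env["MISTRAL_API_KEY"],  # required: fail fast at startup
        chat_model=env.get("RAG_CHAT_MODEL", "mistral-small-latest"),
        port=int(env.get("RAG_PORT", "8000")),
    )
```

Passing `env` as a parameter keeps the loader unit-testable without mutating the process environment.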
Assumptions
- Language: Python 3.10+.
- Use the Mistral HTTP API directly to avoid client-library version mismatches.
- You may persist the built index to disk so that `ask` and `serve` can reuse it across processes; the core index data structure remains in-memory while serving queries.
- Supported file types: `.md` and `.pdf`.
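Since the spec calls for hitting the Mistral HTTP API directly, the request assembly can be isolated from the transport. This sketch builds the headers and a streaming chat payload from retrieved `(source, text)` chunks; the endpoint URL and the bracketed-citation prompt convention are assumptions to verify against the current Mistral API documentation:

```python
MISTRAL_URL = "https://api.mistral.ai/v1/chat/completions"  # verify against current docs


def build_chat_request(
    api_key: str,
    model: str,
    question: str,
    chunks: list[tuple[str, str]],
) -> tuple[dict, dict]:
    """Return (headers, payload) for a streaming chat call; chunks are (source, text)."""
    context = "\n\n".join(f"[{src}]\n{text}" for src, text in chunks)
    payload = {
        "model": model,
        "stream": True,  # server responds with server-sent events
        "messages": [
            {
                "role": "system",
                "content": (
                    "Answer using only the context below and cite sources "
                    "by their [bracketed] names.\n\n" + context
                ),
            },
            {"role": "user", "content": question},
        ],
    }
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return headers, payload
```

Keeping payload construction pure (no I/O) means the citation prompt and chunk formatting can be tested without a live API key.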