Build and design a Mistral RAG agent
Company: Mistral AI
Role: Software Engineer
Category: ML System Design
Difficulty: hard
Interview Round: Technical Screen
Build and discuss an LLM-powered retrieval-augmented agent that calls the Mistral API. Implement a minimal Python tool that:
1) ingests a small local corpus (e.g., Markdown/PDF), chunks text, generates embeddings, and indexes them;
2) retrieves top-k passages for a user query;
3) composes a prompt and invokes Mistral chat completions with streaming;
4) keeps short-term conversation memory;
5) reads the API token from an environment variable;
6) handles errors, retries, and rate limits; and
7) includes a brief README and smoke tests.
Then explain your system design choices: chunking strategy, embedding model choice, vector index selection, latency and cost budgeting, caching, prompt templating, safety/PII handling, logging/monitoring, offline RAG evaluation, and fallbacks when retrieval fails.
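A minimal sketch of the ingestion and retrieval steps (1–2): overlapping word-window chunking, embedding, and top-k cosine-similarity lookup. The hashing embedder here is a deterministic toy stand-in; in a real system you would call Mistral's embeddings endpoint and likely use a proper vector index (e.g., FAISS) instead of a linear scan.

```python
import math
import re
import zlib

def chunk_text(text, max_words=100, overlap=20):
    """Split text into overlapping fixed-size word windows."""
    words = text.split()
    chunks, start = [], 0
    while start < len(words):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break
        start += max_words - overlap
    return chunks

def embed(text, dim=64):
    """Toy bag-of-words hashing embedder (deterministic via CRC32);
    a placeholder for a real embedding model."""
    vec = [0.0] * dim
    for word in re.findall(r"\w+", text.lower()):
        vec[zlib.crc32(word.encode()) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def top_k(query, index, k=2):
    """Rank indexed chunks by cosine similarity to the query
    (vectors are unit-normalised, so a dot product suffices)."""
    q = embed(query)
    scored = sorted(index, key=lambda item: -sum(a * b for a, b in zip(q, item[1])))
    return [text for text, _ in scored[:k]]

docs = [
    "Mistral models support streaming chat completions over the API.",
    "Vector indexes store embeddings for fast nearest neighbour search.",
]
index = [(chunk, embed(chunk)) for doc in docs for chunk in chunk_text(doc)]
print(top_k("search stored embeddings", index, k=1)[0])
```

The overlap between chunks preserves context that would otherwise be cut at window boundaries; normalising the vectors at embed time keeps retrieval a cheap dot product.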
Quick Answer: This question evaluates a candidate's ability to build a retrieval-augmented generation (RAG) system end to end: document ingestion, chunking, embedding generation, vector indexing, retrieval, streaming LLM completions via the Mistral API, short-term conversation memory, and operational reliability features such as retries and rate-limit handling.
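For the reliability requirements (steps 5–6), a generic sketch: the token is read from an environment variable (MISTRAL_API_KEY is assumed here as the variable name) and API calls are wrapped in exponential backoff with jitter. The official `mistralai` client raises its own exception types; the `RateLimitError` below is a stand-in used to keep the example self-contained.

```python
import os
import random
import time

class RateLimitError(Exception):
    """Stand-in for the 429 error a real API client would raise."""

def with_retries(fn, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on RateLimitError, back off exponentially (with
    jitter) and retry, re-raising after max_attempts failures."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))

# The API token comes from the environment, never from source code.
api_key = os.environ.get("MISTRAL_API_KEY")

# Usage: a stub call that is rate-limited twice, then succeeds.
calls = {"n": 0}
def flaky_completion():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RateLimitError("429 Too Many Requests")
    return "streamed answer"

print(with_retries(flaky_completion, sleep=lambda d: None))
```

Injecting `sleep` as a parameter makes the backoff testable without real delays, which is exactly what the question's "smoke tests" requirement calls for.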