This question evaluates a candidate's skills in building a retrieval-augmented generation (RAG) system, covering document ingestion, chunking, embedding generation, vector indexing, retrieval, LLM API-driven streaming completions, short-term conversation memory, and operational reliability features.
You are asked to build a minimal, but production-minded, retrieval-augmented generation (RAG) agent in Python that uses the Mistral API for embeddings and chat completions. The agent should work over a small local corpus (e.g., Markdown and PDF files), support streaming responses, maintain a short conversation memory, and include basic reliability features.
Assume Python 3.10+ and that external dependencies can be installed. The Mistral API token is provided via an environment variable.
Implement a minimal Python tool that:
Then, explain your system design choices, covering:
Login required