Design a Retrieval-Augmented Generation (RAG) system

Last updated: May 12, 2026

Quick Overview

This question evaluates a candidate's ability to design production-grade Retrieval-Augmented Generation systems, testing competencies in information retrieval, embedding and indexing strategies, LLM integration, scalability, access control, and observability within the ML system design domain.

Company: OpenAI

Role: Software Engineer

Category: ML System Design

Difficulty: hard

Interview Round: Technical Screen

Dec 15, 2025, 12:00 AM
Prompt

Design a Retrieval-Augmented Generation (RAG) system that answers user questions using an organization’s internal documents (PDFs, wiki pages, tickets, and policies) while minimizing hallucinations.

Requirements

  • Inputs: user natural-language query; a continuously updated document corpus.
  • Outputs: a grounded answer with citations (snippets + document links/IDs).
  • Quality goals:
    • High answer correctness and groundedness.
    • Handle ambiguous questions by asking clarifying questions when needed.
  • System goals:
    • Low latency (interactive).
    • Scalable to millions of documents.
    • Support frequent document updates (new/edited/deleted docs).
    • Security: enforce document-level access control (per user/role) and prevent data leakage.
    • Observability: logging, monitoring, evaluation, and iterative improvement.
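
The access-control and update requirements above can be illustrated with a minimal sketch. It assumes each indexed chunk carries its parent document's ID and allowed roles as metadata; the names `Chunk`, `allowed_roles`, `deleted`, and `visible_chunks` are hypothetical and not part of the question. The filter runs before any text reaches the LLM, so a restricted or deleted chunk cannot leak into an answer or a citation.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """One indexed passage plus the metadata needed for ACL checks (hypothetical schema)."""
    chunk_id: str
    doc_id: str
    text: str
    allowed_roles: frozenset[str]  # roles permitted to read the parent document
    deleted: bool = False          # tombstone set when the source document is removed


def visible_chunks(chunks: list[Chunk], user_roles: set[str]) -> list[Chunk]:
    """Keep only live chunks whose document shares at least one role with the user."""
    return [
        c for c in chunks
        if not c.deleted and (c.allowed_roles & user_roles)
    ]


if __name__ == "__main__":
    corpus = [
        Chunk("c1", "policy-1", "Travel policy text", frozenset({"employee"})),
        Chunk("c2", "hr-7", "Salary bands", frozenset({"hr"})),
        Chunk("c3", "wiki-3", "Old onboarding page", frozenset({"employee"}), deleted=True),
    ]
    for chunk in visible_chunks(corpus, {"employee"}):
        print(chunk.chunk_id, chunk.doc_id)
```

In a real deployment the same predicate would more likely be pushed into the vector and keyword indexes as a metadata filter, so that filtering does not empty out the top-k results after retrieval.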

What to cover

Explain the end-to-end architecture including:

  • Ingestion + preprocessing (chunking, metadata, dedup).
  • Embedding generation and indexing.
  • Retrieval (vector + keyword), reranking, and context construction.
  • LLM prompting and citation generation.
  • Caching, rate limiting, and fallbacks.
  • Offline/online evaluation and A/B testing.
  • Failure modes and mitigations (hallucinations, stale data, prompt injection).
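
One possible way to combine the vector and keyword retrievers listed above is Reciprocal Rank Fusion (RRF), followed by a prompt that numbers each retrieved snippet so the model can cite it. This is a sketch under assumptions, not a prescribed solution: the function names, the prompt wording, and the constant `k=60` (a commonly used default for RRF) are illustrative choices, and the ranked ID lists stand in for real retriever output.

```python
from __future__ import annotations


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of chunk IDs; each hit contributes 1 / (k + rank)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by chunk ID for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


def build_grounded_prompt(
    question: str,
    ranked_ids: list[str],
    chunks: dict[str, tuple[str, str]],  # chunk_id -> (doc_id, text)
    max_chunks: int = 3,
) -> str:
    """Number the top chunks so the answer can cite them as [1], [2], ..."""
    sources = []
    for n, chunk_id in enumerate(ranked_ids[:max_chunks], start=1):
        doc_id, text = chunks[chunk_id]
        sources.append(f"[{n}] (doc: {doc_id}) {text}")
    return (
        "Answer using only the numbered sources below and cite them like [1].\n"
        "If the sources do not contain the answer, say so or ask a clarifying question.\n\n"
        "Sources:\n" + "\n".join(sources) + f"\n\nQuestion: {question}"
    )


if __name__ == "__main__":
    vector_hits = ["a", "b", "c"]   # stand-in for embedding search results
    keyword_hits = ["b", "d", "a"]  # stand-in for keyword (e.g. BM25) results
    fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
    print(fused)  # ['b', 'a', 'd', 'c']

    store = {
        "a": ("wiki-12", "VPN access requires a hardware token."),
        "b": ("policy-4", "Hardware tokens are issued by IT on request."),
        "c": ("ticket-88", "User reported VPN timeout."),
        "d": ("wiki-30", "IT help desk hours are 9 to 5."),
    }
    print(build_grounded_prompt("How do I get VPN access?", fused, store))
```

RRF is chosen here because it needs only ranks, not comparable scores, from each retriever; a cross-encoder reranker could then reorder the fused list before the prompt is built.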
