PracHub

Deep-dive your GenAI project architecture

Last updated: Mar 29, 2026

Quick Overview

This ML System Design question tests end-to-end GenAI system architecture: problem definition, data sourcing and governance, model choice and fine-tuning, evaluation, safety guardrails, latency and cost engineering, and scaling.

Company: Amazon

Role: Machine Learning Engineer

Category: ML System Design

Difficulty: hard

Interview Round: Onsite

Walk me through a GenAI system you built end-to-end. Describe the problem, data sourcing and governance (size, quality, privacy), model choice (e.g., encoder–decoder, instruction-tuned LLM, or RAG), training/fine-tuning setup (objectives, hyperparameters, scaling), evaluation (offline metrics and human eval), safety/guardrails (toxicity, jailbreaks, hallucination mitigation), latency/throughput and cost constraints, and key failure modes. What trade-offs did you make, and how would you evolve the system for 10x traffic while meeting a 200 ms p95 latency SLO and a 20% cost reduction?

Related Interview Questions

  • Design systems for global request detection and labeling - Amazon (hard)
  • Design a computer-use agent end-to-end - Amazon (medium)
  • Debug online worse than offline model performance - Amazon (medium)
  • Approach an ambiguous business problem - Amazon (medium)
  • Explain parallelism and collectives in training - Amazon (medium)

Amazon, Machine Learning Engineer, Onsite, ML System Design, Jul 17, 2025, 12:00 AM

GenAI System Deep-Dive: End-to-End Design and Scale Strategy

Provide a structured walkthrough of a production-grade GenAI system you built end-to-end. Cover the following areas:

1) Problem Definition

  • What user problem did you solve and for whom?
  • What were the success criteria and constraints (e.g., latency SLOs, cost per request, compliance)?

2) Data Sourcing and Governance

  • Sources and modalities (structured/unstructured, internal/external).
  • Size and quality (volume, coverage, freshness, label quality, dedup/OCR issues).
  • Privacy, PII handling, access control, consent, retention, residency.
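
When discussing PII handling, a concrete preprocessing step can help anchor the answer. The following is a minimal sketch, assuming a regex-based redaction pass over raw text before it enters training or retrieval data. The patterns and the `redact_pii` name are illustrative assumptions, not part of the question; a production system would typically add a trained PII detector and locale-specific rules.

```python
import re

# Hypothetical, deliberately narrow patterns for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def redact_pii(text: str) -> str:
    """Replace email addresses and simple 3-3-4 phone numbers with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text


if __name__ == "__main__":
    print(redact_pii("Mail a@b.com or call 555-010-0123"))
```

Typed placeholders such as `[EMAIL]` keep the sentence structure intact, which tends to matter more for downstream model quality than deleting the span outright.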

3) Model Choice and Architecture

  • Rationale for encoder–decoder, instruction-tuned LLM, RAG, or hybrid.
  • Orchestration: query routing, tools, vector retrieval, rerankers, and any function calling.
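
If the answer centers on RAG, it helps to be able to sketch the retrieval step itself. Below is a toy illustration, assuming precomputed embedding vectors held in an in-memory dict; the `retrieve` and `build_prompt` helpers are hypothetical names, and a real system would use an approximate nearest-neighbor index and usually a reranker.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vec, index, k=2):
    """Return the ids of the k documents most similar to the query vector."""
    ranked = sorted(index, key=lambda doc_id: cosine(query_vec, index[doc_id]), reverse=True)
    return ranked[:k]


def build_prompt(question, passages):
    """Assemble a grounded prompt: retrieved passages first, then the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only the context below.\n{context}\nQuestion: {question}"


if __name__ == "__main__":
    toy_index = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.9, 0.1]}  # made-up vectors
    print(retrieve([1.0, 0.0], toy_index, k=2))
```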

4) Training and Fine-Tuning

  • Objectives (SFT, preference optimization such as DPO or KTO, contrastive learning for embeddings).
  • Datasets, augmentation/synthetic data, and curriculum.
  • Hyperparameters, scaling strategy, and infra.
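
For the preference-optimization objective, it can be useful to write the loss down. The sketch below computes the standard DPO loss for a single preference pair from sequence log-probabilities under the policy and a frozen reference model. The function name, argument names, and the default `beta=0.1` are illustrative assumptions; real training would batch this over tensors in a deep learning framework.

```python
import math


def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one (chosen, rejected) pair.

    All four inputs are summed token log-probabilities of a full response.
    loss = -log(sigmoid(beta * ((pi_c - ref_c) - (pi_r - ref_r))))
    """
    margin = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    # -log(sigmoid(m)) == log(1 + exp(-m)); branch to stay numerically stable.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


if __name__ == "__main__":
    # When the policy equals the reference, the margin is 0 and the loss is log(2).
    print(dpo_loss(-5.0, -7.0, -5.0, -7.0))
```

The loss falls as the policy raises the chosen response's log-probability relative to the reference more than it raises the rejected one's, which is the behavior `beta` scales.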

5) Evaluation

  • Offline metrics (retrieval quality, faithfulness, toxicity, hallucination rate).
  • Human evaluation protocol and acceptance criteria.
  • Online A/B or interleaving.
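
Retrieval quality is the easiest of the offline metrics to make concrete. The following is a small sketch of recall@k and reciprocal rank for a single query, assuming document ids as strings and a labeled set of relevant ids; the function names are illustrative. Averaging reciprocal rank over queries gives MRR.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant documents that appear in the top-k results."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant result, or 0.0 if none is retrieved."""
    relevant = set(relevant)
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


if __name__ == "__main__":
    results = ["d3", "d1", "d7"]  # made-up ranking
    labels = {"d1", "d2"}         # made-up relevance labels
    print(recall_at_k(results, labels, k=2), reciprocal_rank(results, labels))
```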

6) Safety and Guardrails

  • Toxicity, jailbreak, prompt injection, PII leakage mitigation.
  • Policy enforcement and red-teaming.

7) Latency, Throughput, and Cost

  • End-to-end latency budget and p95 targets.
  • Throughput and concurrency limits.
  • Cost per request and major cost drivers.
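
A latency and cost discussion is stronger with the arithmetic made explicit. The sketch below shows a nearest-rank p95 over latency samples and a token-based cost-per-request estimate. The per-1k-token prices are placeholders passed in by the caller, not real vendor rates, and the function names are assumptions for illustration.

```python
def p95(samples):
    """Nearest-rank 95th percentile of a non-empty list of latency samples."""
    ordered = sorted(samples)
    # Integer ceiling of 0.95 * n avoids floating-point rounding surprises.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def cost_per_request(prompt_tokens, output_tokens, price_in_per_1k, price_out_per_1k):
    """Estimated model cost of one request given per-1k-token prices."""
    return (prompt_tokens / 1000) * price_in_per_1k + (output_tokens / 1000) * price_out_per_1k


if __name__ == "__main__":
    latencies_ms = list(range(1, 101))  # made-up samples: 1 ms .. 100 ms
    print(p95(latencies_ms))
    print(cost_per_request(1000, 500, price_in_per_1k=0.002, price_out_per_1k=0.004))
```

Splitting cost into input and output tokens makes the main levers visible: shorter retrieved context reduces the first term, and output length caps reduce the second.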

8) Key Failure Modes

  • Where it breaks (data, retrieval, reasoning, safety) and mitigations.

9) Trade-offs

  • What you optimized for and what you deferred.

10) 10x Scale Plan

  • How you would evolve the system for 10x traffic while meeting a 200 ms p95 latency SLO and a 20% cost reduction.
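
The 10x targets imply some back-of-the-envelope numbers worth stating up front. The sketch below assumes two things that the question leaves open: that "20% cost reduction" may mean either total spend or per-request cost, and a hypothetical baseline of 100 requests per second. It uses Little's law (in-flight requests = arrival rate × time in system) to size concurrency.

```python
def unit_cost_ratio_for_total_target(traffic_multiplier, total_cost_reduction):
    """Per-request cost, as a fraction of today's, needed to cut TOTAL spend."""
    return (1.0 - total_cost_reduction) / traffic_multiplier


def in_flight_requests(requests_per_second, latency_seconds):
    """Little's law: average concurrent requests = arrival rate * time in system."""
    return requests_per_second * latency_seconds


if __name__ == "__main__":
    # Reading 1: total spend must fall 20% at 10x traffic.
    print(unit_cost_ratio_for_total_target(10, 0.20))  # about 0.08 of today's unit cost
    # Reading 2: per-request cost falls 20%, so total spend is 10 * 0.8 = 8x today's.
    print(10 * 0.8)
    # Hypothetical baseline of 100 rps grows to 1000 rps; at 200 ms per request:
    print(in_flight_requests(1000, 0.2))
```

Under the first reading, per-request cost has to drop by roughly 92%, which usually points to caching, smaller or distilled models, and routing rather than incremental tuning, so it is worth clarifying the interpretation with the interviewer.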
