
Design image and multimodal generation systems

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a candidate's competence in designing end-to-end image and multimodal generation systems, covering data collection and curation, model architecture and conditioning choices, training objectives, safety and content filtering, evaluation metrics, deployment, monitoring, and critical analysis of relevant research.


Company: Meta

Role: Machine Learning Engineer

Category: ML System Design

Difficulty: hard

Interview Round: Technical Screen

Design an image generation system end to end: cover data collection and curation (sources, licensing, deduplication, filtering, captioning), model architecture choices (e.g., diffusion vs. autoregressive; conditioning and resolution scaling), training objectives and losses, compute/throughput planning, safety and content filtering, evaluation metrics (quality, diversity, bias), inference optimization and deployment (caching, batching, quantization, distillation), cost controls, and monitoring. Then extend the design to a multimodal text-and-image generation system that can accept and produce both modalities. Discuss multimodal data collection and alignment, architectures for cross-modal fusion, training strategies (pretraining, instruction tuning, RLHF/RLAIF), knowledge updating, retrieval augmentation, and product constraints (latency targets, guardrails, feedback loops). Be prepared to walk through a specific recent paper relevant to your design: explain its key idea, experimental setup, metrics, trade-offs, and how you would adapt or productionize it.


Aug 11, 2025, 12:00 AM

System Design: Image Generation and Multimodal Generation

Part 1 — End-to-End Image Generation System

Design an end-to-end image generation system. Cover the following:

  1. Data collection and curation
    • Sources and licensing strategy
    • Deduplication and near-duplicate removal
    • Content filtering (NSFW, violence, watermarks, PII)
    • Captioning/annotations and multilingual support
  2. Model architecture choices
    • Diffusion vs. autoregressive (AR) vs. hybrid
    • Conditioning (text, style, ControlNet-like signals) and resolution scaling
  3. Training objectives and losses
  4. Compute and throughput planning
  5. Safety and content filtering (pre-, in-, and post-training)
  6. Evaluation metrics (quality, diversity, prompt adherence, bias/fairness)
  7. Inference optimization and deployment
    • Caching, batching, quantization, distillation/acceleration
  8. Cost controls (tiers, rate limits, autoscaling)
  9. Monitoring and observability
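
As one illustration of the near-duplicate removal item in the list above, here is a minimal sketch of perceptual-hash deduplication. It is not taken from the question or from any particular production pipeline. It assumes each image has already been decoded, converted to grayscale, and downscaled to a small fixed grid (for example 9x8), and it represents that grid as plain nested lists. The names `dhash`, `hamming`, `dedup`, and the `max_distance` threshold are hypothetical choices for this example.

```python
from typing import List, Sequence

# Assumption: an image is a small grid of grayscale values (0-255),
# already downscaled by an upstream preprocessing step.
Gray = Sequence[Sequence[int]]


def dhash(pixels: Gray) -> int:
    """Difference hash: one bit per horizontally adjacent pixel pair."""
    bits = 0
    for row in pixels:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def dedup(images: List[Gray], max_distance: int = 2) -> List[int]:
    """Greedy near-duplicate removal; returns indices of images to keep.

    An image is dropped if its hash is within `max_distance` bits of any
    image kept so far. This loop is quadratic and only meant to show the
    idea on a small batch.
    """
    kept: List[int] = []
    kept_hashes: List[int] = []
    for idx, img in enumerate(images):
        h = dhash(img)
        if all(hamming(h, k) > max_distance for k in kept_hashes):
            kept.append(idx)
            kept_hashes.append(h)
    return kept


if __name__ == "__main__":
    original = [[10, 20, 30, 40], [40, 30, 20, 10], [5, 5, 9, 1]]
    brighter = [[v + 1 for v in row] for row in original]  # near-duplicate
    different = [[40, 30, 20, 10], [10, 20, 30, 40], [1, 9, 5, 5]]
    print(dedup([original, brighter, different]))
```

The hash depends only on the ordering of neighboring pixels, so a uniform brightness shift leaves it unchanged. In a design discussion, the pairwise scan would likely be replaced by an approximate nearest-neighbor or bucketed index, and the threshold would be tuned on labeled duplicate pairs.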

Part 2 — Extend to Multimodal Text-and-Image Generation

Extend the design to a system that can accept and produce both modalities (text and images). Address:

  1. Multimodal data collection and alignment
  2. Architectures for cross-modal fusion
  3. Training strategies (pretraining, instruction tuning, RLHF/RLAIF)
  4. Knowledge updating and retrieval augmentation
  5. Product constraints (latency targets, guardrails, feedback loops)
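
To make the retrieval augmentation item above concrete, the following is a small sketch of top-k retrieval by cosine similarity, followed by assembling the retrieved text into a conditioning prompt. It assumes embeddings for the query and the stored items come from some shared text/image encoder that is not shown here. The functions `cosine`, `retrieve`, and `build_prompt`, and the prompt layout, are hypothetical and chosen only for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def retrieve(
    query: Sequence[float],
    index: Dict[str, Sequence[float]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the k (id, score) pairs most similar to the query embedding."""
    scored = [(doc_id, cosine(query, emb)) for doc_id, emb in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def build_prompt(user_prompt: str, doc_ids: List[str], store: Dict[str, str]) -> str:
    """Prepend retrieved snippets to the user request (assumed layout)."""
    context = "\n".join(store[doc_id] for doc_id in doc_ids)
    return f"Context:\n{context}\n\nRequest: {user_prompt}"


if __name__ == "__main__":
    index = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
    store = {"a": "caption A", "b": "caption B", "c": "caption C"}
    hits = retrieve([1.0, 0.1], index, k=2)
    print(build_prompt("draw the landmark", [doc_id for doc_id, _ in hits], store))
```

A brute-force scan like this is only workable for a small index. At larger scale the same interface would typically sit in front of an approximate nearest-neighbor service, and refreshing that index is one way to update knowledge without retraining the generator.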

Paper Deep-Dive

Pick a recent, relevant paper and walk through:

  • Key idea and architecture
  • Experimental setup and datasets
  • Metrics and results
  • Trade-offs and limitations
  • How you would adapt or productionize the approach in a real system

