PracHub

Design a multimodal embedding service

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a candidate's competency in ML system design for building scalable, multi‑tenant, privacy‑sensitive multimodal embedding pipelines, covering ingestion and idempotency, modality-specific preprocessing, model selection and fusion, throughput engineering, storage and versioning, data hygiene, and operational concerns.



Company: Adobe

Role: Software Engineer

Category: ML System Design

Difficulty: hard

Interview Round: Technical Screen

Design a system to compute embeddings for user‑uploaded files across modalities—documents, images, and videos—where each file size is at most x MB, and persist results to a database. Describe the ingestion API, validation and preprocessing (e.g., text chunking, image resizing, video frame sampling or clip extraction), model choices per modality, batching, GPU/accelerator scheduling, and concurrency controls. Explain how you will store embeddings and metadata (e.g., vector store vs. relational/columnar DB), support similarity search, deduplicate near‑identical content, handle retries and idempotency, and manage backfills when models are updated. Include monitoring, quality evaluation, cost controls, and privacy/security considerations.
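As a rough illustration of the text-chunking step mentioned in the prompt, the sketch below splits a token sequence into fixed-size windows that overlap. The function name `chunk_tokens`, the default sizes, and the whitespace tokenizer in the usage note are assumptions made for illustration; a real pipeline would likely use the embedding model's own tokenizer and limits.

```python
def chunk_tokens(tokens, chunk_size=200, overlap=50):
    """Split a token list into windows of chunk_size sharing `overlap` tokens.

    Hypothetical helper: names and defaults are illustrative, not prescribed
    by the question.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the input
    return chunks
```

For example, `chunk_tokens(text.split(), chunk_size=200, overlap=50)` would treat whitespace-separated words as tokens. The overlap means a sentence cut at a window boundary still appears whole in at least one chunk, at the cost of embedding some tokens twice.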


Sep 6, 2025, 12:00 AM

System Design: Multimodal Embedding Pipeline for Documents, Images, and Videos

You are designing a production service that computes embeddings for user‑uploaded files across modalities—documents, images, and videos—and persists results for search and analytics.

Assume:

  • Each file is at most x MB (a configurable limit enforced at the API).
  • Processing is asynchronous with eventual consistency (embeddings become available after ingestion completes).
  • Multi‑tenant, privacy‑sensitive environment.
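One way to picture these assumptions is a minimal in-memory sketch of the submission path: the size limit is checked at the API boundary, a client-supplied idempotency key maps repeated submissions to the same job, and status is polled per tenant. Every name here (`IngestionService`, `Job`, `MAX_FILE_BYTES`, the status strings) is hypothetical, and a real service would back this with a durable store and a queue rather than dictionaries.

```python
import hashlib
import uuid
from dataclasses import dataclass

MAX_FILE_BYTES = 5 * 1024 * 1024  # hypothetical stand-in for the "x MB" limit


@dataclass
class Job:
    job_id: str
    tenant_id: str
    content_sha256: str
    status: str = "queued"  # assumed lifecycle: queued -> processing -> done | failed


class IngestionService:
    """In-memory stand-in for the API layer (illustration only)."""

    def __init__(self, max_bytes=MAX_FILE_BYTES):
        self.max_bytes = max_bytes
        self._jobs = {}    # job_id -> Job
        self._by_key = {}  # (tenant_id, idempotency_key) -> job_id

    def submit(self, tenant_id, idempotency_key, payload: bytes) -> Job:
        # Enforce the configurable size limit before any work is queued.
        if len(payload) > self.max_bytes:
            raise ValueError("file exceeds configured size limit")
        key = (tenant_id, idempotency_key)
        # A retried request with the same key returns the original job.
        if key in self._by_key:
            return self._jobs[self._by_key[key]]
        job = Job(uuid.uuid4().hex, tenant_id, hashlib.sha256(payload).hexdigest())
        self._jobs[job.job_id] = job
        self._by_key[key] = job.job_id
        return job

    def status(self, tenant_id, job_id) -> str:
        job = self._jobs.get(job_id)
        # Treat another tenant's job as unknown so job ids leak nothing.
        if job is None or job.tenant_id != tenant_id:
            raise KeyError("unknown job")
        return job.status
```

Scoping the idempotency key by tenant keeps two tenants that happen to pick the same key from colliding, which matters in the multi-tenant setting assumed above.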

Requirements

  1. Ingestion API
    • Endpoints to submit files or URLs, specify modality and options, and poll status.
    • Idempotency, retries, and concurrency controls.
  2. Validation and Preprocessing
    • Documents: text extraction, language detection, tokenization, chunking with overlap, boilerplate removal.
    • Images: format normalization, orientation, resizing/cropping, optional multi‑crop/tiling.
    • Videos: frame sampling or clip extraction, optional ASR for audio track, keyframe/shot detection.
  3. Model Choices per Modality
    • Text/document embedding model.
    • Image embedding model.
    • Video embedding model or aggregation of frame embeddings; optional fusion with ASR text.
    • Consider a single cross‑modal space vs. per‑modality spaces.
  4. Throughput Engineering
    • Batching, GPU/accelerator scheduling, and backpressure.
    • Worker pools for CPU preprocessing vs. GPU inference.
  5. Storage Design
    • How to store embeddings and metadata (vector store vs. relational/columnar DB).
    • Schema, versioning, and multi‑tenancy.
    • Support similarity search and metadata filters.
  6. Data Hygiene and Robustness
    • Deduplicate near‑identical content (files, chunks, frames).
    • Retries, idempotency, and exactly‑once/write‑once semantics.
    • Backfills when models are updated (index rebuilds, blue/green swaps).
  7. Operations
    • Monitoring, alerting, and tracing.
    • Quality evaluation (offline metrics, canaries, A/B tests).
    • Cost controls (batching, quantization, index compression, TTLs).
    • Privacy/security (encryption, access control, retention policies).
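To make the storage, dedup, and backfill requirements more concrete, here is a minimal sketch under stated assumptions: embeddings are keyed by tenant, content hash, and model version, so a retried write is a no-op and a backfill for a new model can run alongside the old rows before queries are switched over. The class `EmbeddingStore` and its methods are hypothetical, and the dictionaries stand in for whatever vector store or database is chosen. Note that a SHA-256 key only catches byte-identical content; near-duplicate detection would need something extra, such as perceptual hashing or an embedding-similarity threshold.

```python
import hashlib


class EmbeddingStore:
    """In-memory illustration of versioned, write-once embedding rows."""

    def __init__(self):
        self._rows = {}    # (tenant_id, content_hash, model_version) -> vector
        self._active = {}  # tenant_id -> model_version served to queries

    @staticmethod
    def content_hash(data: bytes) -> str:
        return hashlib.sha256(data).hexdigest()

    def upsert(self, tenant_id, data: bytes, model_version, vector) -> bool:
        """Write once per key; return False if the row already exists."""
        key = (tenant_id, self.content_hash(data), model_version)
        if key in self._rows:
            return False  # retry or exact duplicate: nothing to do
        self._rows[key] = list(vector)
        return True

    def activate(self, tenant_id, model_version):
        """Blue/green style switch: point reads at a fully backfilled version."""
        self._active[tenant_id] = model_version

    def lookup(self, tenant_id, data: bytes):
        """Return the vector for the tenant's active model version, if any."""
        version = self._active.get(tenant_id)
        return self._rows.get((tenant_id, self.content_hash(data), version))
```

In this sketch a model update would write rows under a new `model_version` while reads keep using the old one, and `activate` is called only after the backfill finishes, so queries never mix vectors from two models.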

