You are given two black-box ML services:

- **Classification Service**
  - Input: one or more text documents.
  - Output: a label for each document (e.g., topic or category).
- **Embedding Service**
  - Input: one or more text documents.
  - Output: a vector embedding (e.g., a 768-dim float vector) for each document.
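Since both services are black boxes, it can help to pin down the interface shapes before designing around them. A minimal sketch of assumed client stubs — the function names, endpoints, and placeholder bodies are illustrative assumptions, not part of the problem statement:

```python
from typing import List

# Hypothetical client stubs for the two black-box services. The names and
# return shapes are assumptions; only the input/output contracts above are given.

def classify(documents: List[str]) -> List[str]:
    """Return one label per input document (classification service)."""
    # Placeholder standing in for a remote call, e.g. POST /v1/classify.
    return ["label" for _ in documents]

def embed(documents: List[str]) -> List[List[float]]:
    """Return one fixed-size embedding per input document (embedding service)."""
    # Placeholder standing in for a remote call; 768 dims matches the
    # example dimensionality given above.
    return [[0.0] * 768 for _ in documents]
```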
You need to design a system that:

- Accepts file uploads from users (each file contains one or more text documents).
- Supports both **single-file** and **bulk** upload (up to **1,000 files** in one request).
- For each document:
  - Computes a classification label using the classification service.
  - Computes an embedding using the embedding service.
- Stores results so they can be queried later (e.g., by user, file, or semantic search).
- Satisfies both:
  - **Low latency** for small/single uploads.
  - **High throughput** for large/bulk uploads.
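Since the label and the embedding for a given document come from independent services, the per-document work can be issued concurrently rather than sequentially. A minimal sketch under that assumption, with hypothetical single-document wrappers standing in for the remote calls:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

# Hypothetical single-document wrappers around the two black-box services.
def classify_one(doc: str) -> str:
    return "label"          # stands in for a remote classification call

def embed_one(doc: str) -> List[float]:
    return [0.0] * 768      # stands in for a remote embedding call

def process_document(doc: str) -> Tuple[str, List[float]]:
    """Call both services in parallel; the pair feeds the result store."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        label_future = pool.submit(classify_one, doc)
        vector_future = pool.submit(embed_one, doc)
        return label_future.result(), vector_future.result()
```

Overlapping the two calls roughly halves per-document latency on the single-upload path, where latency matters most.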
**Task**

Design the end-to-end pipeline and APIs. Specifically address:

- **API Design**
  - How clients upload files (single and bulk, up to 1,000 files).
  - What responses they receive (synchronous vs. asynchronous).
- **Architecture**
  - How you orchestrate calls to the classification and embedding services.
  - How you store raw files, parsed text, labels, and embeddings.
  - How you achieve both low latency and high throughput.
- **Scalability & Performance**
  - How to handle 1,000-file uploads without running out of memory or violating latency goals.
  - Batching, queuing, and concurrency strategies when talking to the ML services.
- **Reliability & Observability**
  - Error handling for partial failures (e.g., some files fail to process).
  - Monitoring, logging, and metrics.
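To make the batching and concurrency questions above concrete, one possible shape for a bulk worker is to group documents into fixed-size batches and cap the number of in-flight requests, which bounds memory while keeping throughput high. The batch size, worker count, and `classify` stub are illustrative assumptions:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List

BATCH_SIZE = 32        # assumed per-request batch limit of the ML service
MAX_IN_FLIGHT = 4      # caps concurrent requests (and therefore memory)

def classify(batch: List[str]) -> List[str]:
    """Stand-in for a call to the black-box classification service."""
    return ["label" for _ in batch]

def classify_bulk(documents: List[str]) -> List[str]:
    """Classify many documents using batching and bounded concurrency."""
    batches = [documents[i:i + BATCH_SIZE]
               for i in range(0, len(documents), BATCH_SIZE)]
    with ThreadPoolExecutor(max_workers=MAX_IN_FLIGHT) as pool:
        # map preserves batch order, so results align with the input.
        results = list(pool.map(classify, batches))
    return [label for batch in results for label in batch]
```

The same pattern applies to the embedding service; in a full design, the worker would also retry failed batches and record per-file status for partial-failure reporting.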
Assume you cannot change the internals of the classification and embedding services; you may only call their APIs.