Design a PDF-to-Markdown Inference API

Last updated: Apr 18, 2026

Quick Overview

This question tests whether a candidate can design a scalable, resource-aware ML inference system. It covers API design, page-level parallelism, CPU/GPU/memory scheduling, batching, result ordering, intermediate storage, fault tolerance, backpressure, and scaling trade-offs.

Company: Mistral AI

Role: Software Engineer

Category: ML System Design

Difficulty: hard

Interview Round: Technical Screen

Apr 16, 2026, 12:00 AM

Design an inference service that converts PDF files to Markdown. You can assume the following building blocks already exist:

  • A CPU-intensive function that splits a PDF into individual pages and converts each page into a NumPy array
  • A GPU-intensive OCR engine
  • A memory-intensive post-processing step that converts OCR outputs into Markdown or assembles final page results

Discuss two scenarios:

  1. A synchronous API for one very large document, such as a 1000-page PDF, where the user wants the full converted output as quickly as possible
  2. An asynchronous API for many concurrent conversion requests, where the client can receive the result later

Explain the API contract, page-level parallelism, CPU and GPU scheduling, batching, result ordering, intermediate storage, fault tolerance, backpressure, and how the system should scale.
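
One way to make the page-level pipeline concrete is the minimal sketch below. It is not a reference solution. It only illustrates three of the listed concerns under simplifying assumptions: a bounded queue for backpressure, fixed-size batches for the OCR stage, and index-keyed results so output order matches page order. The names `split_pdf`, `ocr_batch`, `to_markdown`, `BATCH_SIZE`, and `QUEUE_DEPTH` are hypothetical stand-ins for the building blocks the question says already exist, and pages are plain strings rather than NumPy arrays.

```python
import asyncio

BATCH_SIZE = 8    # assumed GPU batch size; tune against real OCR throughput
QUEUE_DEPTH = 32  # assumed bound; a full queue blocks the producer (backpressure)
_DONE = object()  # sentinel marking the end of the page stream


def split_pdf(pdf_pages):
    """Hypothetical stand-in for the CPU-bound splitter: yields (index, page)."""
    for index, page in enumerate(pdf_pages):
        yield index, page


def ocr_batch(pages):
    """Hypothetical stand-in for the GPU OCR engine, called once per batch."""
    return [f"text:{page}" for page in pages]


def to_markdown(text):
    """Hypothetical stand-in for the memory-heavy post-processing step."""
    return f"## {text}"


async def producer(pdf_pages, page_q):
    # In a real service the split would run in a process pool; here it runs inline.
    for item in split_pdf(pdf_pages):
        await page_q.put(item)  # waits while the queue is full
    await page_q.put(_DONE)


async def ocr_worker(page_q, results):
    batch = []
    while True:
        item = await page_q.get()
        if item is not _DONE:
            batch.append(item)
        # Flush when the batch is full or the stream has ended.
        if batch and (len(batch) == BATCH_SIZE or item is _DONE):
            indices = [index for index, _ in batch]
            pages = [page for _, page in batch]
            # Run the blocking OCR call off the event loop.
            texts = await asyncio.to_thread(ocr_batch, pages)
            for index, text in zip(indices, texts):
                results[index] = to_markdown(text)
            batch = []
        if item is _DONE:
            return


async def convert(pdf_pages):
    """Convert a sequence of pages and return Markdown in page order."""
    page_q = asyncio.Queue(maxsize=QUEUE_DEPTH)
    results = {}
    await asyncio.gather(producer(pdf_pages, page_q), ocr_worker(page_q, results))
    # Reassemble by page index so ordering does not depend on completion order.
    return "\n\n".join(results[index] for index in sorted(results))
```

Calling `asyncio.run(convert(pages))` with a list of pages returns a single Markdown string. The sketch uses one OCR worker, so a single sentinel is enough. With several workers, each would need its own end-of-stream signal, and the index-keyed `results` dictionary is what would keep the output ordered. Fault tolerance, intermediate storage, and the asynchronous job API are deliberately left out.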
