PracHub

Design AI-Powered Document Search

Last updated: May 1, 2026

Quick Overview

This question evaluates system-design and machine-learning engineering skills for building a scalable AI-enabled document ingestion pipeline, covering OCR, metadata and keyword extraction, indexing, fault-tolerant orchestration, retries, reprocessing, and monitoring.


Company: Workday

Role: Software Engineer

Category: ML System Design

Difficulty: medium

Interview Round: Onsite

Apr 1, 2026, 12:00 AM

Design a system where users upload documents and later search them by structured fields and free-text keywords. The system should use a multi-step AI pipeline to extract metadata and keywords before indexing.
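One way to picture the multi-step pipeline is as an ordered list of stages whose outputs are checkpointed per document, so a failed run can resume where it stopped. The sketch below is illustrative only: `DocumentRecord`, `TransientError`, `run_with_retry`, `process`, and the stub stages are hypothetical names, the field and keyword extractors are trivial placeholders standing in for real OCR and AI services, and a production system would likely hand each stage to a queue worker rather than run it in-process.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


class TransientError(Exception):
    """Raised by a stage when a retry might succeed (e.g. an OCR timeout)."""


@dataclass
class DocumentRecord:
    doc_id: str
    raw_bytes: bytes
    outputs: Dict[str, object] = field(default_factory=dict)  # stage name -> result
    status: str = "PENDING"


Stage = Callable[[DocumentRecord], object]


def run_with_retry(stage: Stage, record: DocumentRecord, max_attempts: int = 3,
                   base_delay: float = 0.5, sleep=time.sleep):
    """Call a stage, retrying transient failures with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return stage(record)
        except TransientError:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))


def process(record: DocumentRecord, stages: List[Tuple[str, Stage]], **retry_kwargs):
    """Run stages in order; skip stages whose output is already checkpointed."""
    for name, stage in stages:
        if name in record.outputs:
            continue  # completed in an earlier run, so reprocessing resumes here
        try:
            record.outputs[name] = run_with_retry(stage, record, **retry_kwargs)
        except TransientError:
            record.status = f"FAILED:{name}"  # left for a later reprocessing pass
            return record
    record.status = "INDEXED"
    return record


def make_flaky_ocr(failures: int) -> Stage:
    """Placeholder OCR stage that fails `failures` times before succeeding."""
    state = {"left": failures}

    def ocr(record: DocumentRecord) -> str:
        if state["left"] > 0:
            state["left"] -= 1
            raise TransientError("OCR service unavailable")
        return record.raw_bytes.decode("utf-8")

    return ocr


def extract_fields(record: DocumentRecord) -> Dict[str, str]:
    # Placeholder for an AI field extractor: only recognizes one vendor name.
    text = record.outputs["ocr"]
    return {"vendor": "Acme"} if "Acme" in text else {}


def extract_keywords(record: DocumentRecord) -> List[str]:
    # Placeholder for keyword extraction: keeps words longer than five letters.
    text = record.outputs["ocr"]
    return sorted({w.lower().strip(".,") for w in text.split() if len(w) > 5})
```

Because each stage result is stored under its name, calling `process` again on a record whose status is `FAILED:ocr` retries only the OCR stage and everything after it, which is one simple way to support reprocessing.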

Requirements:

  • Support uploads of PDFs and common office documents.
  • Extract raw text, document fields such as type, vendor, or date, and useful search keywords.
  • Provide reliable asynchronous processing even when OCR or AI services fail intermittently.
  • Support fielded queries such as `vendor = Acme AND keyword = renewal`.
  • Return low-latency search results and highlight matching terms.
  • Discuss data storage, indexing, orchestration, retries, reprocessing, and monitoring.
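To make the fielded-query and highlighting requirements concrete, here is a minimal in-memory sketch. It assumes a query grammar of `field = value` clauses joined by `AND`, treats `keyword` as a lookup in the extracted keyword set, and wraps matches in `<em>` tags. The function names (`parse_query`, `matches`, `highlight`, `search`) and the document shape are assumptions made for illustration; a real deployment would more likely delegate filtering and highlighting to a search engine.

```python
import re
from typing import Dict, List, Tuple


def parse_query(query: str) -> List[Tuple[str, str]]:
    """Split 'a = x AND b = y' into lowercase (field, value) clauses."""
    clauses = []
    for part in re.split(r"\s+AND\s+", query.strip()):
        name, sep, value = part.partition("=")
        if not sep or not name.strip() or not value.strip():
            raise ValueError(f"bad clause: {part!r}")
        clauses.append((name.strip().lower(), value.strip().lower()))
    return clauses


def matches(doc: Dict, clauses: List[Tuple[str, str]]) -> bool:
    """Every clause must hold: keywords by membership, fields by equality."""
    for name, value in clauses:
        if name == "keyword":
            if value not in doc["keywords"]:
                return False
        elif str(doc["fields"].get(name, "")).lower() != value:
            return False
    return True


def highlight(text: str, terms: List[str], pre: str = "<em>", post: str = "</em>") -> str:
    """Wrap whole-word, case-insensitive occurrences of each term."""
    if not terms:
        return text
    alternation = "|".join(re.escape(t) for t in terms)
    pattern = re.compile(rf"\b({alternation})\b", re.IGNORECASE)
    return pattern.sub(lambda m: f"{pre}{m.group(0)}{post}", text)


def search(docs: List[Dict], query: str) -> List[Dict]:
    """Filter documents by the parsed clauses and highlight keyword terms."""
    clauses = parse_query(query)
    terms = [value for name, value in clauses if name == "keyword"]
    return [
        {"id": d["id"], "snippet": highlight(d["text"], terms)}
        for d in docs
        if matches(d, clauses)
    ]
```

A linear scan like this would not meet the low-latency requirement at scale; it is only meant to show the query semantics that an inverted index and a keyword-typed field filter would implement.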
