Design a Streaming Job Scheduler

Last updated: Apr 26, 2026

Quick Overview

This question evaluates understanding of distributed system design, streaming task scheduling, DAG dependency management, concurrency control, fault tolerance, API and storage modeling, and operational monitoring for high-throughput workloads.

Company: Scale AI

Role: Software Engineer

Category: System Design

Difficulty: easy

Interview Round: Technical Screen

Design a task scheduling service that continuously ingests tasks and dispatches them to workers.

Each task has:

  • `id`: unique task identifier
  • `deadline`: execution deadline
  • `prerequisites`: a list of task IDs that must complete first
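
As a starting point for the discussion, the three fields above could be modeled as a small immutable record with a parser that rejects malformed payloads. This is only a sketch: the `Task` class, the `parse_task` helper, the ISO 8601 deadline format, and the rule that a timestamp without a zone is treated as UTC are all assumptions made for illustration, not part of the question.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any


@dataclass(frozen=True)
class Task:
    """Hypothetical in-memory form of one submitted task."""
    id: str
    deadline: datetime                    # timezone-aware execution deadline
    prerequisites: tuple[str, ...] = ()   # IDs that must complete first


def parse_task(raw: dict[str, Any]) -> Task:
    """Build a Task from an untrusted payload; raise ValueError if malformed."""
    task_id = raw.get("id")
    if not isinstance(task_id, str) or not task_id:
        raise ValueError("id must be a non-empty string")

    try:
        deadline = datetime.fromisoformat(raw["deadline"])
    except (KeyError, TypeError, ValueError) as exc:
        raise ValueError(f"task {task_id}: deadline must be an ISO 8601 string") from exc
    if deadline.tzinfo is None:
        # Assumption: a timestamp without a zone is interpreted as UTC.
        deadline = deadline.replace(tzinfo=timezone.utc)

    prereqs = raw.get("prerequisites", [])
    if not isinstance(prereqs, list) or not all(isinstance(p, str) for p in prereqs):
        raise ValueError(f"task {task_id}: prerequisites must be a list of task IDs")
    if task_id in prereqs:
        raise ValueError(f"task {task_id}: a task cannot depend on itself")

    return Task(task_id, deadline, tuple(prereqs))
```

Rejecting self-references at parse time is cheap; checks that need other tasks (duplicate IDs, unknown prerequisites, cycles) have to wait until the batch or the stored graph is available.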

The system should:

  1. Accept tasks in batches or as a stream.
  2. Validate task definitions, including malformed input, duplicate IDs, and invalid prerequisite references.
  3. Verify that task dependencies form a directed acyclic graph.
  4. Dispatch only runnable tasks, where all prerequisites are completed.
  5. Among runnable tasks, prefer the one with the earliest deadline.
  6. Support multiple workers consuming tasks concurrently without assigning the same task twice.
  7. Handle worker crashes, retries, and recovery.
  8. Scale to large task volumes and provide useful monitoring.
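
Requirements 2 and 3 can be illustrated with a batch validator that reports duplicate IDs and unknown prerequisites, then runs Kahn's algorithm to detect cycles. The sketch below is one possible approach, not a prescribed one. The function name `validate_batch` and the `known_ids` parameter (IDs accepted in earlier batches) are hypothetical. It also assumes prerequisites are immutable once a task is accepted, so an already stored task can never point at a new one and any cycle must lie entirely inside the incoming batch.

```python
from collections import Counter, deque


def validate_batch(batch, known_ids=frozenset()):
    """Return error strings for a batch of (task_id, prerequisites) pairs.

    An empty list means the batch can be accepted.
    """
    errors = []
    counts = Counter(task_id for task_id, _ in batch)
    for task_id, n in counts.items():
        if n > 1:
            errors.append(f"duplicate id in batch: {task_id}")
        if task_id in known_ids:
            errors.append(f"id already exists: {task_id}")

    batch_ids = set(counts)
    for task_id, prereqs in batch:
        for p in prereqs:
            if p not in batch_ids and p not in known_ids:
                errors.append(f"{task_id}: unknown prerequisite {p}")
    if errors:
        return errors

    # Kahn's algorithm over the edges that stay inside the batch.
    indegree = {task_id: 0 for task_id in batch_ids}
    dependents = {task_id: [] for task_id in batch_ids}
    for task_id, prereqs in batch:
        for p in set(prereqs):
            if p in batch_ids:
                indegree[task_id] += 1
                dependents[p].append(task_id)

    queue = deque(t for t, d in indegree.items() if d == 0)
    visited = 0
    while queue:
        current = queue.popleft()
        visited += 1
        for child in dependents[current]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)

    if visited != len(batch_ids):
        # Tasks never released are on a cycle or downstream of one.
        stuck = sorted(t for t, d in indegree.items() if d > 0)
        errors.append("dependency cycle involving or blocking: " + ", ".join(stuck))
    return errors
```

The check runs in time linear in the number of tasks plus dependency edges in the batch. If prerequisites could be edited after acceptance, the cycle check would need to consider the stored graph as well.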

Discuss the API design, storage model, scheduling logic, DAG validation strategy, concurrency control, fault tolerance, and scaling approach.
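
To make the scheduling, concurrency, and recovery topics concrete, here is a minimal single-process sketch. It keeps runnable tasks in a min-heap ordered by deadline (earliest deadline first), hands each task to at most one worker under a time-limited lease, and requeues the task when the lease expires, which is how a crashed worker is detected. The `Scheduler` class, its method names, `lease_seconds`, and `max_attempts` are all hypothetical, and deadlines and `now` are plain numbers such as epoch seconds. A real deployment would presumably keep this state in a durable store and replace the lock with an atomic conditional update.

```python
import heapq
import threading


class Scheduler:
    def __init__(self, lease_seconds=30.0, max_attempts=3):
        self._lock = threading.Lock()
        self._ready = []        # min-heap of (deadline, task_id)
        self._waiting = {}      # task_id -> unfinished prerequisite IDs
        self._dependents = {}   # task_id -> tasks waiting on it
        self._deadline = {}     # task_id -> deadline
        self._leases = {}       # task_id -> (worker_id, lease expiry)
        self._attempts = {}     # task_id -> number of times claimed
        self._done = set()
        self._failed = set()
        self.lease_seconds = lease_seconds
        self.max_attempts = max_attempts

    def add(self, task_id, deadline, prerequisites=()):
        """Register an already validated task."""
        with self._lock:
            self._deadline[task_id] = deadline
            self._attempts[task_id] = 0
            pending = {p for p in prerequisites if p not in self._done}
            for p in pending:
                self._dependents.setdefault(p, []).append(task_id)
            if pending:
                self._waiting[task_id] = pending
            else:
                heapq.heappush(self._ready, (deadline, task_id))

    def claim(self, worker_id, now):
        """Lease the runnable task with the earliest deadline, or return None."""
        with self._lock:
            self._requeue_expired(now)
            if not self._ready:
                return None
            _, task_id = heapq.heappop(self._ready)
            self._attempts[task_id] += 1
            self._leases[task_id] = (worker_id, now + self.lease_seconds)
            return task_id

    def complete(self, worker_id, task_id, now):
        """Mark a task done; reject reports from stale or foreign leases."""
        with self._lock:
            lease = self._leases.get(task_id)
            if lease is None or lease[0] != worker_id or lease[1] <= now:
                return False
            del self._leases[task_id]
            self._done.add(task_id)
            for child in self._dependents.pop(task_id, []):
                remaining = self._waiting[child]
                remaining.discard(task_id)
                if not remaining:
                    del self._waiting[child]
                    heapq.heappush(self._ready, (self._deadline[child], child))
            return True

    def _requeue_expired(self, now):
        # Caller must hold the lock. Expired leases indicate a lost worker.
        for task_id, (_, expiry) in list(self._leases.items()):
            if expiry <= now:
                del self._leases[task_id]
                if self._attempts[task_id] >= self.max_attempts:
                    self._failed.add(task_id)
                else:
                    heapq.heappush(self._ready, (self._deadline[task_id], task_id))
```

Because a slow worker may finish after its lease has been reassigned, this design gives at-least-once execution, so task handlers should be idempotent. The sketch omits lease renewal (heartbeats), propagation of a permanent failure to dependent tasks, and metrics such as queue depth or deadline misses, all of which are worth raising in the interview.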
