Schedule Ready Tasks by Deadline
Company: Scale AI
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Technical Screen
You are building a workflow scheduler.
Each task has:
- a unique `task_id`
- an integer `deadline`
- an optional list of prerequisite tasks that must be completed before this task becomes ready
Assume the dependency graph is a DAG.
Implement the scheduler in three stages:
1. **Basic version**: Given a list of tasks with no dependencies, return the task with the smallest deadline.
2. **Dependency-aware version**: Tasks may have prerequisites. Repeatedly select the currently ready task with the smallest deadline, mark it as completed, and add any newly unlocked tasks to the ready set. Return a valid execution order.
3. **Optimized version**: Improve the time complexity of selecting the next ready task by using an efficient data structure instead of scanning the full ready list each time. Analyze the time complexity.
If multiple ready tasks have the same deadline, you may break ties by `task_id` or return any consistent order.
Quick Answer: This question evaluates understanding of task scheduling and dependency management in directed acyclic graphs, along with algorithmic selection and data-structure-based optimization for repeated minimum-deadline retrieval.
Part 1: Earliest Deadline Task Without Dependencies
Find the **task with the earliest deadline** from a list of independent tasks (no dependencies).
## What to implement
Implement `solution(tasks)`.
- **Input:** `tasks` — a list of `(task_id, deadline)` pairs (each pair is a 2-tuple of integers).
- **Output:** the `task_id` (an integer) of the task with the **smallest** `deadline`, or `None` if the list is empty.
## Rules
1. Return the `task_id` whose `deadline` is the **smallest**.
2. **Tie-break:** if two or more tasks share the same smallest `deadline`, return the one with the **smaller `task_id`**, so the answer is deterministic.
3. If `tasks` is **empty**, return `None`.
## Examples
- `[(101, 5), (102, 2), (103, 8)]` → `102` — deadline `2` is the smallest.
- `[(3, 4), (1, 4), (2, 6)]` → `1` — tasks `3` and `1` both have deadline `4`; the smaller `task_id` (`1`) wins.
- `[(7, 10)]` → `7` — only one task.
- `[]` → `None` — empty input.
## Constraints
- `0 <= len(tasks) <= 100000`
- `task_id` values are unique integers.
- `deadline` is an integer in the range `[-10^9, 10^9]` (deadlines may be negative).
Constraints
- 0 <= len(tasks) <= 100000
- `task_id` values are unique integers
- `deadline` is an integer in the range [-10^9, 10^9]
Examples
Input: [(101, 5), (102, 2), (103, 8)]
Expected Output: 102
Explanation: Task 102 has the smallest deadline: 2.
Input: [(3, 4), (1, 4), (2, 6)]
Expected Output: 1
Explanation: Tasks 3 and 1 tie on deadline 4, so return the smaller task_id: 1.
Input: [(7, 10)]
Expected Output: 7
Explanation: Only one task is present.
Input: []
Expected Output: None
Explanation: Empty input has no task to return.
Hints
- A full sort is unnecessary; you only need to track the best task seen so far.
- Decide how to handle ties before you start scanning.
Part 2: Schedule Ready Tasks by Deadline
Schedule a set of tasks that have **deadlines** and **prerequisite dependencies**, always running the most urgent task that is currently allowed to run.
## What to implement
Implement a function `solution(tasks)` that returns the order in which the tasks are executed, as a **list of `task_id` values**.
## Input
`tasks` is a list of triples, one per task:
```
(task_id, deadline, prerequisites)
```
- **`task_id`** — a unique integer identifying the task.
- **`deadline`** — an integer; smaller means more urgent.
- **`prerequisites`** — a list of `task_id`s that must be **completed before** this task may run.
The dependency graph is guaranteed to be a **DAG** (no cycles). Every ID that appears in any `prerequisites` list also appears as a task in `tasks`.
## Output
Return a **list of `task_id`s** giving the full execution order (every task appears exactly once).
If `tasks` is empty, return an empty list `[]`.
## Rules
A task is **ready** when all of its prerequisites have already been completed. A task with no prerequisites is ready from the start.
Process tasks one at a time:
1. Look at the set of currently **ready** tasks.
2. Choose the ready task with the **smallest `deadline`**.
3. **Tie-break:** if several ready tasks share the smallest deadline, choose the one with the **smallest `task_id`**.
4. **Complete** that task: append its `task_id` to the output order, then **unlock** any task whose prerequisites are now all completed (those tasks become ready).
5. Repeat until every task has been scheduled.
Use a simple **ready-list** approach: each step, scan the current set of ready tasks to find the next task to run by the rules above.
## Example
```
tasks = [(1, 4, []), (2, 2, []), (3, 3, [1, 2]), (4, 1, [2])]
solution(tasks) -> [2, 4, 1, 3]
```
- Initially ready: tasks `1` (deadline 4) and `2` (deadline 2). Pick **2** (smaller deadline).
- Completing `2` unlocks `4`. Ready: `1` (deadline 4) and `4` (deadline 1). Pick **4**.
- Ready: `1`. Pick **1**, which unlocks `3`.
- Ready: `3`. Pick **3**. Final order: `[2, 4, 1, 3]`.
## Constraints
- `0 <= number of tasks <= 3000`
- `0 <= total number of prerequisite references <= 20000`
- `task_id` values are unique integers.
- Every prerequisite ID appears in the task list.
- The dependency graph is a DAG.
Constraints
- 0 <= number of tasks <= 3000
- 0 <= total number of prerequisite references <= 20000
- `task_id` values are unique integers
- Every prerequisite ID appears in the task list
- The dependency graph is a DAG
Examples
Input: [(1, 4, []), (2, 2, []), (3, 3, [1, 2]), (4, 1, [2])]
Expected Output: [2, 4, 1, 3]
Explanation: Start with ready tasks 1 and 2. Pick 2 first, which unlocks 4. Then pick 4, then 1, then 3.
Input: [(1, 2, []), (2, 2, []), (3, 1, [1]), (4, 5, [2])]
Expected Output: [1, 3, 2, 4]
Explanation: Tasks 1 and 2 tie initially, so choose 1 first. That unlocks task 3, which then has the smallest deadline.
Input: [(5, 7, []), (6, 1, [5]), (7, 2, [6])]
Expected Output: [5, 6, 7]
Explanation: This is a simple dependency chain.
Input: []
Expected Output: []
Explanation: No tasks means no execution order.
Hints
- Track how many prerequisites each task still needs; this is often called an indegree count.
- Keep a list of currently ready tasks, and linearly scan that list to find the next task to run.
Part 3: Optimized Deadline-Aware Task Scheduler
Schedule a set of tasks that have deadlines and prerequisite dependencies, always running the most urgent task that is currently runnable, and return the order in which the tasks execute.
This is the optimized version of the scheduler: instead of scanning the entire list of runnable tasks on every step to find the next one, you must use an efficient data structure (such as a min-heap / priority queue) to retrieve the next task in logarithmic time.
## Function
```
def solution(tasks):
```
## Input
`tasks` is a list of tasks. Each task is a tuple:
```
(task_id, deadline, prerequisites)
```
- **`task_id`** — a unique integer identifying the task.
- **`deadline`** — an integer; smaller means more urgent.
- **`prerequisites`** — a list of `task_id`s that must all be completed before this task can run. An empty list means the task has no prerequisites.
The prerequisite relationships form a **DAG** (directed acyclic graph), so there are no circular dependencies. Every `task_id` referenced as a prerequisite also appears as a task in the list.
## Rules
1. A task is **ready** (runnable) once **all** of its prerequisite tasks have been completed. A task with no prerequisites is ready from the start.
2. At each step, among all currently ready tasks, execute the one with the **smallest `deadline`**. If two ready tasks share the same deadline, execute the one with the **smaller `task_id`**.
3. Completing a task may make other tasks ready (when it was their last unmet prerequisite); those tasks then join the pool of ready tasks for future steps.
4. Continue until all tasks have been executed.
## Output
Return a list of `task_id`s giving the **full execution order** in which the tasks were run.
## Edge cases
- If `tasks` is empty, return an empty list `[]`.
- A single task with no prerequisites simply produces a one-element list containing its `task_id`.
## Example
Given `tasks = [(1, 4, []), (2, 2, []), (3, 3, [1, 2]), (4, 1, [2])]`:
- Initially ready: tasks `1` (deadline 4) and `2` (deadline 2). Run `2` (smaller deadline).
- Completing `2` makes `4` ready (deadline 1). Now ready: `1` (4), `4` (1). Run `4`.
- Run `1` (4). Completing `1` makes `3` ready (its other prerequisite `2` is already done).
- Run `3`.
The execution order is `[2, 4, 1, 3]`.
Constraints
- 0 <= number of tasks <= 100000
- 0 <= total number of prerequisite references <= 200000
- `task_id` values are unique integers
- Every prerequisite ID appears in the task list
- The dependency graph is a DAG
Examples
Input: [(1, 4, []), (2, 2, []), (3, 3, [1, 2]), (4, 1, [2])]
Expected Output: [2, 4, 1, 3]
Explanation: The heap always gives the smallest-deadline ready task.
Input: [(10, 5, []), (11, 1, []), (12, 3, [10]), (13, 2, [11]), (14, 4, [10, 11])]
Expected Output: [11, 13, 10, 12, 14]
Explanation: Task 11 runs first, unlocking 13. Task 10 later unlocks both 12 and 14.
Input: [(1, 1, []), (2, 1, []), (3, 1, [1, 2])]
Expected Output: [1, 2, 3]
Explanation: Tasks 1 and 2 tie initially, so use the smaller task_id first.
Input: []
Expected Output: []
Explanation: Empty input returns an empty schedule.
Input: [(9, 100, [])]
Expected Output: [9]
Explanation: A single task is immediately ready.
Hints
- A min-heap is a natural way to repeatedly retrieve the smallest ready task.
- You still need the same graph bookkeeping as before: indegrees and edges from each prerequisite to the tasks it unlocks.