You are given a set of GPU tasks. Each task is a tuple (start, end, node_id) where:
-
start
and
end
are timestamps (e.g., integers, with
start < end
).
-
The task runs on a single GPU identified by
node_id
.
-
Multiple tasks may overlap in time (possibly on different nodes).
Define the GPU fleet idle time as any time interval during which no tasks are running on any node (i.e., across the entire fleet, the number of active tasks is 0).
Part 1 (batch)
Given an unsorted list of tasks, return all maximal idle time intervals as a list of disjoint intervals (idle_start, idle_end).
Clarifications/assumptions to make explicit in your solution:
-
You should infer the overall time range from the tasks themselves (i.e., consider idle gaps only
between
the earliest task start and the latest task end).
-
If tasks touch (e.g., one ends at
t
and another starts at
t
), there is
no
idle time at
t
.
Part 2 (streaming)
Now tasks arrive one at a time in a stream. Design an efficient approach/data structure that supports:
-
addTask(start, end, node_id)
-
getIdleIntervals()
→ returns the current set of maximal idle intervals based on all tasks seen so far
Discuss expected time/space complexity per operation. (You may assume timestamps are integers and tasks may arrive out of order.)