Optimize Job Routing in Parallel Machine Scheduling
Company: TikTok
Role: Data Scientist
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Onsite
##### Scenario
In the Production Factory game, jobs with varying processing times arrive and must be routed through two parallel machines to minimize total completion time.
##### Question
Design an algorithm that decides on-the-fly which machine each incoming job should occupy to keep average flow-time minimal.
##### Hints
Think greedy (shortest-processing-time first) and priority queues; discuss time-complexity.
Quick Answer: This question evaluates understanding of online scheduling and algorithmic decision-making, specifically skills in minimizing average flow-time when routing jobs to parallel machines.
You are given n jobs with positive processing times. There are two identical, non-preemptive parallel machines. All jobs are available at time 0. The completion time of a job is when it finishes on its machine. Minimize average flow-time, which is equivalent to minimizing the sum of completion times. Compute the minimal total completion time by scheduling jobs greedily: always assign the next shortest job to the machine that becomes available earliest (break ties by choosing any). Return this minimal total completion time.
Constraints
- 1 <= n <= 2 * 10^5
- times[i] is an integer in [1, 10^9]
- Two identical, non-preemptive machines
- All jobs are available at time 0
- Return the minimal total completion time (sum over all jobs)
Solution
from typing import List
import heapq
def minimal_total_completion_time(times: List[int]) -> int:
# Two identical machines
if not times:
return 0
times_sorted = sorted(times)
# Min-heap of machine next-available times (start both at 0)
avail = [0, 0]
total = 0
heappop = heapq.heappop
heappush = heapq.heappush
for p in times_sorted:
t = heappop(avail)
finish = t + p
total += finish
heappush(avail, finish)
return total
Explanation
Sorting by processing time (SPT) and always assigning the next shortest job to the machine that becomes available earliest yields an optimal schedule for two identical machines to minimize the sum of completion times. We simulate this with a min-heap of machine availability times. For each job p in ascending order, pop the earliest available time t, set its completion to t+p, add to the total, and push t+p back. The heap operations are O(log 2) = O(1), so the dominant cost is sorting.
Time complexity: O(n log n). Space complexity: O(n) for the sorted copy (heap is O(1)).