Group GPUs by node and partition by links
Company: Microsoft
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: hard
Interview Round: Technical Screen
Quick Answer: This question evaluates understanding of graph algorithms and combinatorial optimization, specifically skills in graph traversal, grouping by node attributes, and partitioning to maximize intra-group edges.
Part 1: Group GPUs by nodeId
Constraints
- 0 <= len(gpus) <= 200000
- Each `gpuId` is unique
- `nodeId` values may repeat
- 0 <= len(links) <= 200000
- Every GPU ID appearing in `links` is guaranteed to exist in `gpus`
Examples
Input: ([(1, 10), (2, 10), (3, 20), (4, 20), (5, 30)], [(1, 2), (2, 3), (4, 5)])
Expected Output: {10: [1, 2], 20: [3, 4], 30: [5]}
Explanation: GPUs are grouped only by `nodeId`, not by connectivity.
Input: ([(7, 2), (3, 1), (5, 2), (4, 1)], [(7, 5)])
Expected Output: {1: [3, 4], 2: [5, 7]}
Explanation: The input is unsorted, so each group's GPU IDs should be sorted before returning.
Input: ([(42, 99)], [])
Expected Output: {99: [42]}
Explanation: Single-GPU edge case.
Input: ([], [])
Expected Output: {}
Explanation: Empty input should return an empty dictionary.
Hints
- A hash map keyed by `nodeId` is enough to collect GPU IDs for each machine.
- Sort each group's GPU IDs before returning so the output is deterministic.
Part 2: Partition GPUs into exactly X groups to maximize internal links
Constraints
- 1 <= len(gpus) <= 15
- 1 <= x <= len(gpus)
- 0 <= len(links) <= n * (n - 1) / 2
- Every GPU ID in `links` appears in `gpus`
- Links are undirected; duplicate links and self-loops should be ignored
Examples
Input: ([1, 2, 3, 4, 5, 6], [(1, 2), (1, 3), (2, 3), (4, 5), (4, 6), (5, 6)], 2)
Expected Output: [[1, 2, 3], [4, 5, 6]]
Explanation: Each triangle should stay together, giving 3 + 3 = 6 internal links.
Input: ([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4)], 2)
Expected Output: [[1], [2, 3, 4]]
Explanation: Several partitions achieve 2 internal links; the required tie-breaker picks the lexicographically smallest one.
Input: ([10, 20, 30], [], 2)
Expected Output: [[10], [20, 30]]
Explanation: With no links, every partition has score 0, so the lexicographically smallest valid partition is returned.
Input: ([7], [], 1)
Expected Output: [[7]]
Explanation: Single-GPU edge case.
Input: ([1, 2, 3], [(1, 2), (1, 3), (2, 3)], 3)
Expected Output: [[1], [2], [3]]
Explanation: Boundary case where `x = n`: every GPU must be alone.
Hints
- Since `n` is small, think in terms of subsets represented by bitmasks.
- To avoid counting the same partition in different orders, always force the smallest remaining GPU into the next group you build.