Maximize total bandwidth of data channel pairs
Company: Amazon
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: hard
Interview Round: Take-home Project
You are optimizing how information flows through a network of processing nodes.
There are `n` processing nodes. The bandwidth capability of the *i*-th node is given by an integer array `bandwidth` of length `n`.
There are `streamCount` data channels that must each be connected to **two** processing nodes:
- one node acts as the **main** connection;
- the other acts as the **secondary** connection.
For a single data channel with main node index `i` and secondary node index `j`, its data flow is defined as:
> `dataFlow = bandwidth[i] + bandwidth[j]`
A ***pair of nodes*** `(i, j)` used for one channel must be **unique** among all channels, with the following rules:
- A pair is identified by the ordered indices `(main, secondary)`.
- `(i, j)` and `(j, i)` are considered **different** pairs.
- It is allowed to choose `i == j`, i.e., a channel may use the same node for both main and secondary.
- The same node index can appear in many different pairs (across different channels), as long as the ordered pair `(i, j)` itself is unique for each channel.
Your goal is to select exactly `streamCount` distinct ordered pairs `(main, secondary)` so that the **sum of dataFlow across all data channels is maximized**.
### Input
- An integer `n` (number of nodes).
- An integer array `bandwidth` of length `n`, where `bandwidth[i]` is the bandwidth of node `i`.
- An integer `streamCount` — the number of data channels.
You may assume:
- `1 <= n <= 10^5`.
- `1 <= bandwidth[i] <= 10^9`.
- `1 <= streamCount <= n^2` (since there are `n^2` possible ordered pairs `(i, j)` including `i == j`).
### Output
- Return a single integer: the **maximum possible total dataFlow**, i.e., the maximum possible sum of `bandwidth[main] + bandwidth[secondary]` over all `streamCount` channels.
### Complexity requirement
Design an algorithm that works efficiently for `n` and `streamCount` up to about `10^5` (much better than enumerating all `n^2` pairs explicitly when `n` is large).
Quick Answer: This question evaluates a candidate's ability to design efficient algorithms for combinatorial optimization and selection under constraints, testing understanding of array manipulation, ordered pair counting, and aggregation of bandwidth sums.
You are optimizing how information flows through a network of `n` processing nodes. The bandwidth of node `i` is `bandwidth[i]`.
You must connect `streamCount` data channels. Each channel uses an **ordered pair** of node indices `(main, secondary)` and its data flow is `bandwidth[main] + bandwidth[secondary]`.
Rules for the pairs:
- A pair is identified by ordered indices `(i, j)`, so `(i, j)` and `(j, i)` are different.
- `i == j` is allowed (a channel may use the same node for both ends).
- Every channel must use a **distinct** ordered pair; the same node index may appear in many different pairs as long as no ordered pair repeats.
There are exactly `n^2` possible distinct ordered pairs. Select exactly `streamCount` of them to **maximize the total data flow** (the sum of `bandwidth[main] + bandwidth[secondary]` over all chosen channels), and return that maximum total.
**Constraints**
- `1 <= n <= 10^5`
- `1 <= bandwidth[i] <= 10^9`
- `1 <= streamCount <= n^2`
Design an algorithm much faster than enumerating all `n^2` pairs.
Constraints
- 1 <= n <= 10^5
- 1 <= bandwidth[i] <= 10^9
- 1 <= streamCount <= n^2
- (i, j) and (j, i) are distinct ordered pairs; i == j is allowed
- Every selected channel must use a distinct ordered pair
Examples
Input: (3, [4, 2, 7], 3)
Expected Output: 36
Explanation: Sorted desc [7,4,2]. Top 3 pair-sums: (7,7)=14, then (7,4)=11 and (4,7)=11. Total 14+11+11=36.
Input: (1, [5], 1)
Expected Output: 10
Explanation: Only one node, the single allowed pair is (0,0) with value 5+5=10.
Input: (2, [3, 1], 4)
Expected Output: 16
Explanation: All n^2=4 pairs are taken: (3,3)=6, (3,1)=4, (1,3)=4, (1,1)=2, summing to 16.
Input: (4, [1, 2, 3, 4], 1)
Expected Output: 8
Explanation: The single best ordered pair is (max,max) = 4+4 = 8.
Input: (3, [10, 1, 1], 5)
Expected Output: 64
Explanation: Best pairs: (10,10)=20, (10,1)=11 twice, (1,10)=11 twice -> 20+11*4 = 64.
Input: (5, [9, 9, 9, 9, 9], 25)
Expected Output: 450
Explanation: Every node has bandwidth 9, so all n^2=25 pairs have sum 18; 25*18 = 450.
Input: (4, [8, 6, 4, 2], 7)
Expected Output: 90
Explanation: Greedily taking the 7 highest ordered-pair sums of [8,6,4,2] yields a total of 90.
Hints
- Sort the bandwidths in descending order. A pair's value is bandwidth[i] + bandwidth[j]; the best pairs combine the largest values, but there are up to n^2 of them so you cannot list them all.
- Binary search on a threshold T = the pair-sum cutoff. For a given T you can count, in O(n log n), how many ordered pairs have sum >= T (for each main node i, the secondaries are exactly those j with bandwidth[j] >= T - bandwidth[i]).
- Find the largest T with at least streamCount pairs of sum >= T. Take all pairs with sum >= T+1 in full (track their total with prefix sums), then top up with the leftover channels at value T.