Implement and compare round-robin and consistent hashing
Company: DoorDash
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Onsite
Quick Answer: This question evaluates a candidate's ability to implement and reason about load balancing algorithms (round-robin and consistent hashing), covering competencies in data structures, hashing, concurrency, correctness, and unit testing.
Part 1: Round-Robin Load Balancer Simulator
Constraints
- 0 <= len(nodes) <= 10^4
- 0 <= requests <= 10^4
- start_index is -1, or 0 <= start_index < len(nodes)
- Node status is a string; only 'UP' counts as available
Examples
Input: ([('A', 'UP'), ('B', 'UP'), ('C', 'UP')], -1, 5)
Expected Output: (['A', 'B', 'C', 'A', 'B'], 1)
Explanation: Normal round-robin over three healthy nodes.
Input: ([('A', 'UP'), ('B', 'DOWN'), ('C', 'UP')], -1, 4)
Expected Output: (['A', 'C', 'A', 'C'], 2)
Explanation: Node B is skipped every time, and the index must persist across picks.
Input: ([('A', 'DOWN'), ('B', 'BROKEN')], -1, 3)
Expected Output: ([None, None, None], -1)
Explanation: Only the exact string 'UP' is available; all requests fail safely.
Input: ([], -1, 2)
Expected Output: ([None, None], -1)
Explanation: Edge case: empty node list.
Solution
def solution(nodes, start_index, requests):
n = len(nodes)
current_index = start_index
selected = []
for _ in range(requests):
if n == 0:
selected.append(None)
continue
found = False
for step in range(1, n + 1):
idx = (current_index + step) % n
if nodes[idx][1] == "UP":
selected.append(nodes[idx][0])
current_index = idx
found = True
break
if not found:
selected.append(None)
return selected, current_indexTime complexity: O(requests * n), where n = len(nodes). Space complexity: O(requests).
Hints
- Do not reset the index for every request; it should continue from the last successful selection.
- For one request, you never need to inspect more than n nodes. If none are 'UP', append None.
Part 2: Consistent Hashing with Virtual Nodes
Constraints
- 0 <= len(initial_nodes) <= 10^3
- 1 <= virtual_nodes <= 100
- 0 <= len(operations) <= 10^4
- Operation type is one of 'ADD', 'REMOVE', 'GET'
- Adding an existing node or removing a missing node should leave the ring unchanged
Examples
Input: (['A', 'B', 'C'], 2, [('GET', 'bZ', []), ('GET', 'dZ', []), ('GET', 'dZ', ['B']), ('REMOVE', 'B'), ('GET', 'dZ', []), ('ADD', 'D'), ('GET', 'iZ', [])])
Expected Output: ['A', 'B', 'C', 'C', 'D']
Explanation: Shows basic lookup, fallback when the initial target is down, removal, and later addition.
Input: ([], 3, [('GET', 'abc', []), ('ADD', 'A'), ('GET', 'abc', ['A']), ('GET', 'abc', [])])
Expected Output: [None, None, 'A']
Explanation: Edge case: empty ring first, then the only node is temporarily down.
Input: (['A', 'B'], 1, [('GET', 'dZ', []), ('REMOVE', 'B'), ('GET', 'dZ', []), ('ADD', 'C'), ('GET', 'eZ', [])])
Expected Output: ['B', 'A', 'C']
Explanation: Removing B reroutes the same key to A; later adding C changes the mapping for another key.
Input: (['A', 'B'], 2, [('GET', 'cZ', ['A', 'B'])])
Expected Output: [None]
Explanation: Edge case: all active physical nodes are down.
Solution
def solution(initial_nodes, virtual_nodes, operations):
from bisect import bisect_left, insort
def hash_value(text):
text = str(text)
return sum((i + 1) * ord(ch) for i, ch in enumerate(text))
ring = []
active = set()
def add_node(node):
if node in active:
return
active.add(node)
for replica in range(virtual_nodes):
insort(ring, (hash_value(f"{node}#{replica}"), node, replica))
def remove_node(node):
if node not in active:
return
active.remove(node)
ring[:] = [entry for entry in ring if entry[1] != node]
for node in initial_nodes:
add_node(node)
results = []
for op in operations:
kind = op[0]
if kind == "ADD":
add_node(op[1])
elif kind == "REMOVE":
remove_node(op[1])
elif kind == "GET":
key = op[1]
down_nodes = set(op[2])
if not ring:
results.append(None)
continue
start = bisect_left(ring, (hash_value(key), "", -1))
chosen = None
for offset in range(len(ring)):
node = ring[(start + offset) % len(ring)][1]
if node not in down_nodes:
chosen = node
break
results.append(chosen)
return resultsTime complexity: For ring size M: ADD is O(virtual_nodes * log M), REMOVE is O(M), and each GET is O(log M + M) in the worst case when many candidate nodes are down. Space complexity: O(M), where M is the number of virtual nodes currently on the ring.
Hints
- Keep the ring sorted so you can binary-search the first virtual node whose position is >= the key hash.
- If the first matching node is down, walk clockwise through the ring and stop after one full cycle.
Part 3: Choose Between Round-Robin and Consistent Hashing
Constraints
- 0 <= len(scenarios) <= 10^5
- 1 <= n, v <= 10^6
- 0 <= r, k, c, h, f <= 10^9
- a is either 0 or 1
- Use 64-bit integer arithmetic
Examples
Input: ([(4, 10, 100, 1000, 2, 1, 5, 3)],)
Expected Output: ['consistent_hashing']
Explanation: High churn plus cache affinity favors consistent hashing despite its ring cost.
Input: ([(4, 10, 100, 1000, 0, 0, 0, 0)],)
Expected Output: ['round_robin']
Explanation: No cache affinity and no membership changes make round-robin cheaper.
Input: ([(1, 5, 20, 100, 1, 0, 0, 2), (3, 2, 30, 90, 3, 1, 0, 1)],)
Expected Output: ['round_robin', 'consistent_hashing']
Explanation: The first scenario is simple and cheap for round-robin; the second has enough churn and affinity to favor consistent hashing.
Input: ([(1, 1, 1, 0, 0, 0, 0, 0)],)
Expected Output: ['either']
Explanation: Edge case: with one node and almost no work, the simplified costs tie.
Input: ([],)
Expected Output: []
Explanation: Edge case: no scenarios.
Solution
def solution(scenarios):
results = []
for n, v, r, k, c, a, h, f in scenarios:
ring_size = n * v
lookup_ch = 0 if ring_size <= 1 else (ring_size - 1).bit_length()
rr_cost = r + c * k + (r if a == 1 else 0) + f * n
ch_cost = r * lookup_ch + c * ((k + n - 1) // n) + ring_size + h + f * lookup_ch
if rr_cost < ch_cost:
results.append("round_robin")
elif ch_cost < rr_cost:
results.append("consistent_hashing")
else:
results.append("either")
return resultsTime complexity: O(len(scenarios)). Space complexity: O(len(scenarios)) for the output list.
Hints
- You can compute ceil(log2(x)) for x > 0 as (x - 1).bit_length().
- Evaluate both formulas exactly for each scenario, then compare and handle the tie case.