PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep

Quick Overview

This question evaluates a candidate's ability to implement and reason about load balancing algorithms (round-robin and consistent hashing), covering competencies in data structures, hashing, concurrency, correctness, and unit testing.

  • Medium
  • DoorDash
  • Coding & Algorithms
  • Software Engineer

Implement and compare round-robin and consistent hashing

Company: DoorDash

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: Medium

Interview Round: Onsite

Implement a round-robin load balancer that selects among service nodes, skipping unavailable nodes and wrapping around correctly. Fix typical pitfalls: avoid typos in status constants, ensure the current-index state persists correctly across calls (thread-safe, no unintended resets), and guarantee the algorithm never returns an unavailable node; add comprehensive unit tests to catch these cases. Then implement a consistent-hashing load balancer with virtual nodes; support node additions/removals with minimal key movement and demonstrate how you locate the next available node when the initial target is down. Compare the two approaches in terms of time/space complexity, rebalancing behavior, cache affinity, failure handling, hotspot risk, and typical production use cases.

Quick Answer: This question evaluates a candidate's ability to implement and reason about load balancing algorithms (round-robin and consistent hashing), covering competencies in data structures, hashing, concurrency, correctness, and unit testing.

Part 1: Round-Robin Load Balancer Simulator

Simulate a round-robin load balancer. Each node is a pair (node_id, status). Only the exact status 'UP' means the node is available; every other status must be treated as unavailable. You are also given start_index, which is the index of the last node selected before this simulation begins (-1 means no node has been selected yet). For each incoming request, move forward one position at a time, wrap around at the end, skip unavailable nodes, and return the first available node. If all nodes are unavailable, return None for that request and keep the current index unchanged. Return both the full selection sequence and the final current index.

Constraints

  • 0 <= len(nodes) <= 10^4
  • 0 <= requests <= 10^4
  • start_index is -1, or 0 <= start_index < len(nodes)
  • Node status is a string; only 'UP' counts as available

Examples

Input: ([('A', 'UP'), ('B', 'UP'), ('C', 'UP')], -1, 5)

Expected Output: (['A', 'B', 'C', 'A', 'B'], 1)

Explanation: Normal round-robin over three healthy nodes.

Input: ([('A', 'UP'), ('B', 'DOWN'), ('C', 'UP')], -1, 4)

Expected Output: (['A', 'C', 'A', 'C'], 2)

Explanation: Node B is skipped every time, and the index must persist across picks.

Input: ([('A', 'DOWN'), ('B', 'BROKEN')], -1, 3)

Expected Output: ([None, None, None], -1)

Explanation: Only the exact string 'UP' is available; all requests fail safely.

Input: ([], -1, 2)

Expected Output: ([None, None], -1)

Explanation: Edge case: empty node list.

Solution

def solution(nodes, start_index, requests):
    n = len(nodes)
    current_index = start_index
    selected = []

    for _ in range(requests):
        if n == 0:
            selected.append(None)
            continue

        found = False
        for step in range(1, n + 1):
            idx = (current_index + step) % n
            if nodes[idx][1] == "UP":
                selected.append(nodes[idx][0])
                current_index = idx
                found = True
                break

        if not found:
            selected.append(None)

    return selected, current_index

Time complexity: O(requests * n), where n = len(nodes). Space complexity: O(requests).

Hints

  1. Do not reset the index for every request; it should continue from the last successful selection.
  2. For one request, you never need to inspect more than n nodes. If none are 'UP', append None.

Part 2: Consistent Hashing with Virtual Nodes

Implement a consistent-hashing load balancer with virtual nodes. Use the deterministic hash function hash_value(s) = sum((i + 1) * ord(s[i]) for each character in str(s)). Each physical node node_id owns virtual_nodes positions on the ring: hash_value(f'{node_id}#{replica}') for replica from 0 to virtual_nodes - 1. Sort the ring by (position, node_id, replica). Process operations in order: ('ADD', node_id), ('REMOVE', node_id), and ('GET', key, down_nodes). For a GET, hash the key, find the first virtual node at or after that hash (wrapping around if needed), and if that physical node is down, continue clockwise until you find an available node. If the ring is empty or all active nodes are down, return None. Return the results of all GET operations in order.

Constraints

  • 0 <= len(initial_nodes) <= 10^3
  • 1 <= virtual_nodes <= 100
  • 0 <= len(operations) <= 10^4
  • Operation type is one of 'ADD', 'REMOVE', 'GET'
  • Adding an existing node or removing a missing node should leave the ring unchanged

Examples

Input: (['A', 'B', 'C'], 2, [('GET', 'bZ', []), ('GET', 'dZ', []), ('GET', 'dZ', ['B']), ('REMOVE', 'B'), ('GET', 'dZ', []), ('ADD', 'D'), ('GET', 'iZ', [])])

Expected Output: ['A', 'B', 'C', 'C', 'D']

Explanation: Shows basic lookup, fallback when the initial target is down, removal, and later addition.

Input: ([], 3, [('GET', 'abc', []), ('ADD', 'A'), ('GET', 'abc', ['A']), ('GET', 'abc', [])])

Expected Output: [None, None, 'A']

Explanation: Edge case: empty ring first, then the only node is temporarily down.

Input: (['A', 'B'], 1, [('GET', 'dZ', []), ('REMOVE', 'B'), ('GET', 'dZ', []), ('ADD', 'C'), ('GET', 'eZ', [])])

Expected Output: ['B', 'A', 'C']

Explanation: Removing B reroutes the same key to A; later adding C changes the mapping for another key.

Input: (['A', 'B'], 2, [('GET', 'cZ', ['A', 'B'])])

Expected Output: [None]

Explanation: Edge case: all active physical nodes are down.

Solution

def solution(initial_nodes, virtual_nodes, operations):
    from bisect import bisect_left, insort

    def hash_value(text):
        text = str(text)
        return sum((i + 1) * ord(ch) for i, ch in enumerate(text))

    ring = []
    active = set()

    def add_node(node):
        if node in active:
            return
        active.add(node)
        for replica in range(virtual_nodes):
            insort(ring, (hash_value(f"{node}#{replica}"), node, replica))

    def remove_node(node):
        if node not in active:
            return
        active.remove(node)
        ring[:] = [entry for entry in ring if entry[1] != node]

    for node in initial_nodes:
        add_node(node)

    results = []

    for op in operations:
        kind = op[0]

        if kind == "ADD":
            add_node(op[1])
        elif kind == "REMOVE":
            remove_node(op[1])
        elif kind == "GET":
            key = op[1]
            down_nodes = set(op[2])

            if not ring:
                results.append(None)
                continue

            start = bisect_left(ring, (hash_value(key), "", -1))
            chosen = None

            for offset in range(len(ring)):
                node = ring[(start + offset) % len(ring)][1]
                if node not in down_nodes:
                    chosen = node
                    break

            results.append(chosen)

    return results

Time complexity: For ring size M: ADD is O(virtual_nodes * log M), REMOVE is O(M), and each GET is O(log M + M) in the worst case when many candidate nodes are down. Space complexity: O(M), where M is the number of virtual nodes currently on the ring.

Hints

  1. Keep the ring sorted so you can binary-search the first virtual node whose position is >= the key hash.
  2. If the first matching node is down, walk clockwise through the ring and stop after one full cycle.

Part 3: Choose Between Round-Robin and Consistent Hashing

You are given deployment scenarios and must choose the better load-balancing strategy under a simplified cost model. Each scenario is a tuple (n, v, r, k, c, a, h, f): n = number of physical nodes, v = virtual nodes per physical node, r = routed requests, k = distinct cached keys or sticky sessions, c = node membership changes, a = 1 if cache affinity matters else 0, h = hotspot penalty, f = failure-fallback events. Compute lookup_ch = ceil(log2(n * v)), but use 0 when n * v <= 1. Then compute rr_cost = r + c * k + (r if a == 1 else 0) + f * n. Compute ch_cost = r * lookup_ch + c * ceil(k / n) + n * v + h + f * lookup_ch. For each scenario, return 'round_robin' if rr_cost is smaller, 'consistent_hashing' if ch_cost is smaller, or 'either' if they are equal.

Constraints

  • 0 <= len(scenarios) <= 10^5
  • 1 <= n, v <= 10^6
  • 0 <= r, k, c, h, f <= 10^9
  • a is either 0 or 1
  • Use 64-bit integer arithmetic

Examples

Input: ([(4, 10, 100, 1000, 2, 1, 5, 3)],)

Expected Output: ['consistent_hashing']

Explanation: High churn plus cache affinity favors consistent hashing despite its ring cost.

Input: ([(4, 10, 100, 1000, 0, 0, 0, 0)],)

Expected Output: ['round_robin']

Explanation: No cache affinity and no membership changes make round-robin cheaper.

Input: ([(1, 5, 20, 100, 1, 0, 0, 2), (3, 2, 30, 90, 3, 1, 0, 1)],)

Expected Output: ['round_robin', 'consistent_hashing']

Explanation: The first scenario is simple and cheap for round-robin; the second has enough churn and affinity to favor consistent hashing.

Input: ([(1, 1, 1, 0, 0, 0, 0, 0)],)

Expected Output: ['either']

Explanation: Edge case: with one node and almost no work, the simplified costs tie.

Input: ([],)

Expected Output: []

Explanation: Edge case: no scenarios.

Solution

def solution(scenarios):
    results = []

    for n, v, r, k, c, a, h, f in scenarios:
        ring_size = n * v
        lookup_ch = 0 if ring_size <= 1 else (ring_size - 1).bit_length()

        rr_cost = r + c * k + (r if a == 1 else 0) + f * n
        ch_cost = r * lookup_ch + c * ((k + n - 1) // n) + ring_size + h + f * lookup_ch

        if rr_cost < ch_cost:
            results.append("round_robin")
        elif ch_cost < rr_cost:
            results.append("consistent_hashing")
        else:
            results.append("either")

    return results

Time complexity: O(len(scenarios)). Space complexity: O(len(scenarios)) for the output list.

Hints

  1. You can compute ceil(log2(x)) for x > 0 as (x - 1).bit_length().
  2. Evaluate both formulas exactly for each scenario, then compare and handle the tie case.
Last updated: Jun 4, 2026

Loading coding console...

PracHub

Master your tech interviews with 8,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Maximize Chef Assignment Profit - DoorDash (medium)
  • Compute Courier Delivery Pay - DoorDash (easy)
  • Compute Nearest Destination Distances - DoorDash (easy)
  • Compute Driver Pay with Double-Pay Windows - DoorDash (medium)
  • Count changed nodes between two menu trees - DoorDash (hard)