Consistent Hashing Ring with Virtual Nodes for Shard Rebalancing
Company: OpenAI
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: hard
Interview Round: Technical Screen
Constraints
- 1 <= vnodes <= 1000 (fixed for the ring's lifetime)
- Up to 10^5 total operations across add_shard, remove_shard, get_shard
- shard_id and key are non-empty ASCII strings up to 64 characters
- The same shard_id is never added twice without being removed first; remove targets a present shard
- All hash arithmetic is unsigned 32-bit (mod 2^32); ring positions lie in [0, 2^32)
Examples
Input: (["ShardRing", "get_shard"], [[3], ["anykey"]])
Expected Output: [None, ""]
Explanation: Empty ring: get_shard has no virtual nodes to route to, so it returns the empty string.
Input: (["ShardRing", "add_shard", "get_shard", "get_shard", "get_shard"], [[5], ["A"], ["k1"], ["k2"], ["user:42"]])
Expected Output: [None, None, "A", "A", "A"]
Explanation: With only shard A on the ring, every key routes clockwise to one of A's 5 virtual nodes, so all keys are owned by A.
Input: (["ShardRing", "add_shard", "add_shard", "get_shard", "get_shard", "get_shard", "get_shard", "get_shard", "get_shard"], [[50], ["A"], ["B"], ["k1"], ["k2"], ["k3"], ["user:42"], ["user:7"], ["alpha"]])
Expected Output: [None, None, None, "B", "B", "B", "A", "B", "B"]
Explanation: A and B each place 50 virtual nodes; keys split by whichever virtual node is nearest clockwise. Here k1/k2/k3, user:7 and alpha land on B's arcs while user:42 lands on A's.
Input: (["ShardRing", "add_shard", "add_shard", "add_shard", "get_shard", "get_shard", "get_shard", "remove_shard", "get_shard", "get_shard", "get_shard"], [[100], ["A"], ["B"], ["C"], ["k1"], ["k2"], ["k3"], ["B"], ["k1"], ["k2"], ["k3"]])
Expected Output: [None, None, None, None, "B", "B", "B", None, "A", "A", "A"]
Explanation: With A, B, C present, k1/k2/k3 are owned by B. Removing B reassigns exactly those keys to the next virtual node clockwise (A here); keys already on A or C would be untouched -- the minimal-movement property.
Input: (["ShardRing", "add_shard", "add_shard", "get_shard", "get_shard", "get_shard", "get_shard"], [[1], ["alpha"], ["beta"], ["k1"], ["k2"], ["k3"], ["key-1"]])
Expected Output: [None, None, None, "alpha", "alpha", "alpha", "beta"]
Explanation: vnodes=1 leaves two positions: beta#0=1965256788 and alpha#0=3950171640. k1/k2/k3 (hashes ~2.5B) fall between them so route clockwise to alpha; key-1 (hash 1474311238) is below beta's position so it routes to beta.
Hints
- Store every virtual node as a (position, shard_id, replica_index) tuple in one list kept sorted; this list IS the ring.
- For get_shard, binary-search for the first entry whose position >= fnv1a_32(key); searching with the 1-tuple (h,) makes same-position ties fall to the smallest (shard_id, replica) automatically.
- If the binary search runs off the end, the key wrapped around: return the shard at index 0 (the smallest position).
- Get the mod-2^32 FNV-1a right by masking with 0xFFFFFFFF (or letting fixed-width unsigned ints wrap); an off-by-one in the multiply constant (16777619) or seed (2166136261) changes every position.