Design high-throughput hashing for kernels
Company: Anthropic
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Onsite
Design a high-throughput hash-based lookup to be called inside a tight kernel. Choose between open addressing and chaining, specify the load factor, probe sequence, and table layout to favor vectorization and contiguous accesses. Show how to use bitwise operations, prefetching, and alignment to reduce collisions and cache misses. Explain trade-offs and how you would profile and validate the design.
Quick Answer: This question evaluates understanding of cache-efficient, high-throughput hash lookup design and related competencies such as hashing strategies, probe behavior, memory layout, vectorization, bitwise optimizations, prefetching, alignment, and microarchitectural performance trade-offs.
You are modeling the core of a hash-based lookup that will be called inside a tight kernel. To favor contiguous memory accesses, the design uses open addressing with linear probing instead of chaining.
Implement the exact hash table behavior below.
1. Let n be the number of input pairs.
2. Choose the table capacity as the smallest power of two greater than or equal to ceil(10 * n / 7), with a minimum capacity of 1.
3. Store entries in a contiguous table using open addressing.
4. Use this hash for the initial slot:
initial_index = ((key * 2654435761) & 0xFFFFFFFF) & (capacity - 1)
5. On collision, probe linearly: next_index = (index + 1) & (capacity - 1)
6. Insert pairs in order. If a key appears more than once, overwrite the previous value in place.
7. For lookups, return the stored value, or -1 if the key is not found.
A probe is counted every time you examine one table slot during insertion or lookup.
Return a tuple:
(capacity, answers, insert_probes, lookup_probes)
where answers is the list of lookup results for the query keys.
This problem focuses on simulating a kernel-friendly table layout. In a lower-level language, the same design would naturally support alignment, prefetching, and contiguous accesses.
Constraints
- 0 <= len(pairs), len(queries) <= 2 * 10^5
- Keys are 32-bit signed integers
- Values are integers in the range [0, 10^9]
- Capacity is the smallest power of two >= ceil(10 * len(pairs) / 7), with minimum 1
Examples
Input: ([(1, 10), (9, 20), (17, 30)], [1, 17, 2])
Expected Output: (8, [10, 30, -1], 6, 7)
Explanation: Capacity is 8. Keys 1, 9, and 17 all start at slot 1 and are placed at slots 1, 2, and 3. Insert probes = 1 + 2 + 3 = 6. Lookups use the same probe sequence, giving results [10, 30, -1] and lookup probes 1 + 3 + 3 = 7.