Implement Python LRU cache with varargs
Company: HubSpot
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Onsite
Quick Answer: This question evaluates a candidate's ability to implement an LRU caching mechanism in Python, including decorator design, argument canonicalization for positional args, *args, and **kwargs, handling unhashable inputs, eviction semantics, cache metadata operations (clear, cache_info), and JSON-based persistence and versioning.
Constraints
- 0 <= len(calls) <= 10^4
- 0 <= len(params) <= 50
- maxsize is a positive integer or None (unbounded)
- Argument values may be ints, strings, bools, or nested lists/dicts/sets (possibly unhashable)
- Binding rule: positional args fill params by index; extras form an order-sensitive *args bucket; kwargs override/add by name
Examples
Input: (2, ['a', 'b'], [{'args': [1, 2]}, {'args': [1], 'kwargs': {'b': 2}}])
Expected Output: ['miss', 'hit']
Explanation: Binding positional [1,2] to params [a,b] gives {a:1,b:2}; the second call {a:1, b:2} binds to the same mapping, so it hits.
Input: (2, ['a', 'b'], [{'kwargs': {'b': 2, 'a': 1}}, {'args': [1, 2]}])
Expected Output: ['miss', 'hit']
Explanation: Keyword order does not matter: {b:2,a:1} canonicalizes to the same sorted key as positional [1,2].
Input: (2, ['x'], [{'args': [1]}, {'args': [2]}, {'args': [3]}, {'args': [1]}])
Expected Output: ['miss', 'miss', 'miss', 'miss']
Explanation: With maxsize 2, after caching 1 and 2, inserting 3 evicts 1 (LRU); the later call 1 is therefore a miss.
Input: (2, ['x'], [{'args': [1]}, {'args': [2]}, {'args': [1]}, {'args': [3]}, {'args': [2]}])
Expected Output: ['miss', 'miss', 'hit', 'miss', 'miss']
Explanation: 1,2 cached; 1 hits and becomes MRU; 3 evicts the LRU (2); final 2 is a miss since it was evicted.
Input: (3, ['data'], [{'args': [[1, 2, 3]]}, {'args': [[1, 2, 3]]}, {'args': [{'k': 1}]}, {'args': [{'k': 1}]}])
Expected Output: ['miss', 'hit', 'miss', 'hit']
Explanation: Unhashable list and dict arguments are canonicalized deterministically, so identical structures hit on repeat.
Input: (1, ['x'], [{'args': [1]}, {'args': [2]}, {'args': [1]}])
Expected Output: ['miss', 'miss', 'miss']
Explanation: maxsize 1 holds only the most recent key; every distinct call evicts the prior one, so nothing ever hits here.
Input: (2, ['a'], [])
Expected Output: []
Explanation: No calls yields an empty result.
Input: (None, ['x'], [{'args': [1]}, {'args': [2]}, {'args': [3]}, {'args': [1]}])
Expected Output: ['miss', 'miss', 'miss', 'hit']
Explanation: Unbounded cache (maxsize None) never evicts, so the repeated call 1 hits.
Hints
- Two calls are 'the same' only after you bind positional args to parameter names — normalize every call into a single name->value mapping before hashing.
- Lists, dicts and sets are unhashable. Recursively convert them into nested tuples (sort dict items and set elements) so the key is deterministic regardless of insertion order.
- An OrderedDict gives you O(1) LRU: move_to_end on a hit, and popitem(last=False) to drop the least-recently-used key when you exceed maxsize.
- Tag primitives with their type in the key so values like 1 and True (or 1 and '1') don't accidentally collide.