PracHub

Quick Overview

This question evaluates a candidate's ability to design and implement an LRU cache in Python that supports variable-length positional and keyword arguments with canonical, deterministic hashing for semantically equivalent calls and O(1) get/put via a hash map plus a doubly linked list; it falls under the Coding & Algorithms domain and primarily tests practical implementation skills, API robustness, hashing/equality subtleties, and algorithmic complexity guarantees. The persistence follow-up evaluates understanding of serialization and storage design — on-disk formats, versioning, and atomic write strategies — and compares trade-offs between binary pickle and JSON-based serialization in terms of security, compatibility, and performance, reflecting both conceptual design and practical system-level considerations.

  • Medium
  • Anthropic
  • Coding & Algorithms
  • Software Engineer

Implement Python LRU cache with args and persistence

Company: Anthropic

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: Medium

Interview Round: Onsite

Implement an LRU cache in Python as a decorator or class that correctly supports variable-length positional arguments and keyword arguments. Ensure that calls are keyed canonically so keyword ordering does not matter, unhashable arguments are converted to deterministic, hashable representations, and semantically equivalent calls map to the same cache entry. Provide O(1) get/put using a hash map plus a doubly linked list. Follow-up: add persistence so the cache can be snapshotted to disk and restored on process start. Specify an on-disk format, versioning, and atomic write strategy; compare trade-offs between pickle and a JSON-based manual serialization (security, compatibility, performance); and implement a simple, safe approach.
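
One motivation for the canonical-keying requirement: the standard functools.lru_cache keys calls by their literal argument pattern, so keyword order alone produces separate cache entries (this is documented behavior). A minimal illustration:

```python
from functools import lru_cache

executions = []

@lru_cache(maxsize=None)
def f(a, b=0):
    # Record each actual execution so cache misses are observable.
    executions.append((a, b))
    return a + b

f(a=1, b=2)
f(b=2, a=1)  # semantically identical, but keyword order differs
f(1, 2)      # positional form differs again

# All three calls executed the body: lru_cache saw three distinct keys.
print(len(executions))  # prints 3
```

This is exactly the behavior the canonical binding step in this question is meant to eliminate.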

Part 1: Canonical-Argument LRU Cache

Implement an LRU memoization cache for a fixed function signature f(a, b=0, *extra, **kw). Each request is a call described by positional args and keyword args. Two calls must map to the same cache entry if they are semantically the same after Python-style binding for this signature: explicit b=0 equals omitted b, providing a or b by keyword is equivalent to providing them positionally, keyword order does not matter, dict order does not matter, set order does not matter, and nested lists/tuples with the same contents are considered equivalent sequences. Sequence order still matters. The cached function value is defined as the recursive sum of all integers contained in bound a, b, extra, and kw values; dict keys do not contribute to the sum. Use a hash map plus a doubly linked list for O(1) average cache get/put operations. Do not use functools.lru_cache or OrderedDict.
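
The "Python-style binding" described above can be prototyped with the standard inspect module, which resolves positional versus keyword supply and fills defaults; a sketch (bound_form is an illustrative helper name, not part of the required API):

```python
import inspect

def f(a, b=0, *extra, **kw):
    pass

sig = inspect.signature(f)

def bound_form(args, kwargs):
    # Bind a call against f's signature and fill in defaults (b=0,
    # extra=(), kw={}), so equivalent calls yield identical mappings.
    ba = sig.bind(*args, **kwargs)
    ba.apply_defaults()
    return dict(ba.arguments)

# Positional vs. keyword supply and keyword order collapse to one form:
print(bound_form([1], {'b': 2}))         # {'a': 1, 'b': 2, 'extra': (), 'kw': {}}
print(bound_form([], {'b': 2, 'a': 1}))  # same mapping
```

The solution below hand-rolls this binding for the fixed signature, which avoids inspect's per-call overhead but follows the same rules.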

Constraints

  • 0 <= capacity <= 10^4
  • 0 <= len(calls) <= 2 * 10^4
  • Each call is valid for the signature f(a, b=0, *extra, **kw)
  • Argument trees contain only hashable atoms or nested lists, tuples, dicts, and sets; total nested elements across all calls is at most 2 * 10^5

Examples

Input: {'capacity': 2, 'calls': [{'args': [1], 'kwargs': {'b': 2, 'x': [3, 4]}}, {'args': [], 'kwargs': {'x': (3, 4), 'a': 1, 'b': 2}}, {'args': [5], 'kwargs': {}}, {'args': [6], 'kwargs': {}}, {'args': [1], 'kwargs': {'x': (3, 4), 'b': 2}}]}

Expected Output: {'results': [10, 10, 5, 6, 10], 'hit_miss': [False, True, False, False, False], 'exec_count': 4}

Explanation: The first two calls are semantically identical, so the second is a hit. Later, capacity 2 forces eviction, so the final repeated call is a miss.

Input: {'capacity': 0, 'calls': [{'args': [1], 'kwargs': {}}, {'args': [], 'kwargs': {'a': 1, 'b': 0}}]}

Expected Output: {'results': [1, 1], 'hit_miss': [False, False], 'exec_count': 2}

Explanation: Edge case: with capacity 0, nothing is stored, so every call misses even if the keys are equivalent.

Input: {'capacity': 1, 'calls': [{'args': [1], 'kwargs': {'meta': {'y': 2, 'x': 3}, 'tags': {5, 4}}}, {'args': [], 'kwargs': {'a': 1, 'tags': {4, 5}, 'meta': {'x': 3, 'y': 2}}}]}

Expected Output: {'results': [15, 15], 'hit_miss': [False, True], 'exec_count': 1}

Explanation: Nested dict order and set order should not change the cache key.

Input: {'capacity': 3, 'calls': []}

Expected Output: {'results': [], 'hit_miss': [], 'exec_count': 0}

Explanation: Edge case: no calls.

Solution

def solution(data):
    capacity = data['capacity']
    calls = data['calls']
    class Node:
        __slots__ = ('key', 'value', 'prev', 'next')
        def __init__(self, key=None, value=None):
            self.key = key
            self.value = value
            self.prev = None
            self.next = None
    # Recursively convert a value into an immutable, order-independent,
    # hashable "tagged tuple" so it can serve as part of a cache key.
    def freeze(x):
        if isinstance(x, dict):
            items = [(freeze(k), freeze(v)) for k, v in x.items()]
            items.sort(key=lambda pair: repr(pair[0]))
            return ('dict', tuple(items))
        if isinstance(x, set):
            items = [freeze(v) for v in x]
            items.sort(key=repr)
            return ('set', tuple(items))
        if isinstance(x, (list, tuple)):
            return ('seq', tuple(freeze(v) for v in x))
        return ('atom', x)
    # Recursively sum every int in a nested structure; dict keys do not count.
    def sum_ints(x):
        if type(x) is int:
            return x
        if isinstance(x, dict):
            return sum(sum_ints(v) for v in x.values())
        if isinstance(x, (list, tuple, set)):
            return sum(sum_ints(v) for v in x)
        return 0
    # Bind a call to the signature f(a, b=0, *extra, **kw), applying the
    # default b=0 so semantically equivalent calls yield identical tuples.
    def bind(call):
        args = list(call.get('args', []))
        kwargs = dict(call.get('kwargs', {}))
        if args:
            a = args[0]
            kwargs.pop('a', None)
        else:
            a = kwargs.pop('a')
        if len(args) >= 2:
            b = args[1]
            kwargs.pop('b', None)
        else:
            b = kwargs.pop('b', 0)
        extra = tuple(args[2:]) if len(args) > 2 else ()
        return a, b, extra, kwargs
    # Sentinel head/tail nodes: the least recently used node sits next to
    # head, the most recently used next to tail.
    head = Node()
    tail = Node()
    head.next = tail
    tail.prev = head
    cache = {}
    def remove(node):
        node.prev.next = node.next
        node.next.prev = node.prev
    def append_mru(node):
        prev = tail.prev
        prev.next = node
        node.prev = prev
        node.next = tail
        tail.prev = node
    results = []
    hit_miss = []
    exec_count = 0
    for call in calls:
        a, b, extra, kw = bind(call)
        key = (freeze(a), freeze(b), freeze(extra), freeze(kw))
        node = cache.get(key)
        if node is not None:
            remove(node)
            append_mru(node)
            hit_miss.append(True)
            results.append(node.value)
            continue
        value = sum_ints(a) + sum_ints(b) + sum_ints(extra) + sum_ints(kw)
        exec_count += 1
        hit_miss.append(False)
        results.append(value)
        if capacity == 0:
            continue
        node = Node(key, value)
        cache[key] = node
        append_mru(node)
        if len(cache) > capacity:
            lru = head.next
            remove(lru)
            del cache[lru.key]
    return {'results': results, 'hit_miss': hit_miss, 'exec_count': exec_count}

Time complexity: O(k) per call for binding, canonicalization, and scoring, where k is the number of nested atomic elements in the call, plus O(1) average-time LRU cache updates. Space complexity: O(capacity * average_key_size) for the cache, excluding the input.

Hints

  1. First bind every call to the canonical form (a, b, extra, kw) and apply the default b=0 before building a cache key.
  2. Recursively convert unhashable objects into immutable tagged tuples so that dict key order and set order stop mattering.

Part 2: Persistent LRU Cache with Versioned JSON Snapshots

Implement an LRU cache with snapshot and restore. Keys are strings and values are JSON-compatible data. Support put, get, snapshot, and restore operations. A snapshot must use this exact logical schema: {'version': 1, 'capacity': <int>, 'items': [{'key': <string>, 'value': <json>} ...]} where items are listed from least recently used to most recently used. Serialize snapshots deterministically with json.dumps(..., sort_keys=True, separators=(',', ':')). On restore, reject malformed JSON, unsupported versions, duplicate keys, or snapshots with more items than capacity; in all such cases, leave the current cache unchanged and return False. In a real file-backed system, the safe atomic write strategy would be: write to a temporary file, fsync, then rename/replace. For this problem, snapshot and restore work with strings instead of real files. Use JSON rather than pickle: pickle is flexible and often faster for arbitrary Python objects, but unsafe for untrusted data and less portable across environments; JSON is safer and more interoperable when you define an explicit schema.
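
The temp-file-plus-rename strategy mentioned above can be sketched with the standard library; atomic_write_snapshot is a hypothetical helper name for illustration, not part of this problem's API:

```python
import json
import os
import tempfile

def atomic_write_snapshot(path, payload):
    # Hypothetical helper: serialize deterministically, write to a temp
    # file in the same directory, fsync, then atomically replace target.
    text = json.dumps(payload, sort_keys=True, separators=(',', ':'))
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix='.tmp')
    try:
        with os.fdopen(fd, 'w', encoding='utf-8') as fh:
            fh.write(text)
            fh.flush()
            os.fsync(fh.fileno())  # force the bytes to disk first
        os.replace(tmp, path)      # atomic rename on POSIX and Windows
    except BaseException:
        os.unlink(tmp)
        raise
```

Writing the temp file in the same directory as the target matters: os.replace is only atomic when source and destination are on the same filesystem.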

Constraints

  • 0 <= capacity <= 10^4
  • 0 <= len(ops) <= 2 * 10^4
  • Keys used by put/get are strings
  • Values stored by put are JSON-compatible (null/boolean/number/string/list/object)
  • If restore receives an invalid snapshot, the cache state must remain unchanged
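
The required deterministic serialization comes directly from json.dumps with sorted keys and compact separators, which yields exactly the snapshot strings used in the examples:

```python
import json

snap = {'version': 1, 'capacity': 2,
        'items': [{'key': 'b', 'value': 2}, {'key': 'a', 'value': 1}]}

# sort_keys orders keys alphabetically at every nesting level, and
# separators=(',', ':') strips all whitespace, so output is canonical.
text = json.dumps(snap, sort_keys=True, separators=(',', ':'))
print(text)
# prints {"capacity":2,"items":[{"key":"b","value":2},{"key":"a","value":1}],"version":1}
```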

Examples

Input: {'capacity': 2, 'ops': [['put', 'a', 1], ['put', 'b', 2], ['get', 'a'], ['snapshot'], ['put', 'c', 3], ['restore', '{"capacity":2,"items":[{"key":"b","value":2},{"key":"a","value":1}],"version":1}'], ['get', 'b']]}

Expected Output: {'outputs': [None, None, {'found': True, 'value': 1}, '{"capacity":2,"items":[{"key":"b","value":2},{"key":"a","value":1}],"version":1}', None, True, {'found': True, 'value': 2}]}

Explanation: After get('a'), the LRU order becomes b then a, so the snapshot stores items in that order. Restoring brings b back after it was evicted by putting c.

Input: {'capacity': 1, 'ops': [['put', 'x', {'n': 1}], ['snapshot'], ['restore', '{"version":2,"capacity":1,"items":[]}'], ['get', 'x'], ['restore', 'not json'], ['get', 'y']]}

Expected Output: {'outputs': [None, '{"capacity":1,"items":[{"key":"x","value":{"n":1}}],"version":1}', False, {'found': True, 'value': {'n': 1}}, False, {'found': False}]}

Explanation: Unsupported version and malformed JSON must both fail without changing the existing cache.

Input: {'capacity': 2, 'ops': [['put', 'k', 1], ['snapshot'], ['put', 'm', 2], ['put', 'k', 9], ['snapshot'], ['restore', '{"capacity":2,"items":[{"key":"k","value":1}],"version":1}'], ['get', 'm'], ['get', 'k']]}

Expected Output: {'outputs': [None, '{"capacity":2,"items":[{"key":"k","value":1}],"version":1}', None, None, '{"capacity":2,"items":[{"key":"m","value":2},{"key":"k","value":9}],"version":1}', True, {'found': False}, {'found': True, 'value': 1}]}

Explanation: Updating k changes both its value and recency. Restoring the older snapshot brings back the older single-item state.

Input: {'capacity': 0, 'ops': [['put', 'a', 1], ['snapshot'], ['restore', '{"capacity":0,"items":[],"version":1}'], ['get', 'a']]}

Expected Output: {'outputs': [None, '{"capacity":0,"items":[],"version":1}', True, {'found': False}]}

Explanation: Edge case: capacity 0 means no entries are retained, but empty snapshots are still valid.

Solution

def solution(data):
    import json
    class Node:
        __slots__ = ('key', 'value', 'prev', 'next')
        def __init__(self, key=None, value=None):
            self.key = key
            self.value = value
            self.prev = None
            self.next = None
    class LRU:
        def __init__(self, capacity):
            self.capacity = capacity
            self._reset()
        def _reset(self):
            self.map = {}
            self.head = Node()
            self.tail = Node()
            self.head.next = self.tail
            self.tail.prev = self.head
        def _remove(self, node):
            node.prev.next = node.next
            node.next.prev = node.prev
        def _append(self, node):
            prev = self.tail.prev
            prev.next = node
            node.prev = prev
            node.next = self.tail
            self.tail.prev = node
        def get(self, key):
            node = self.map.get(key)
            if node is None:
                return {'found': False}
            self._remove(node)
            self._append(node)
            return {'found': True, 'value': node.value}
        def put(self, key, value):
            node = self.map.get(key)
            if node is not None:
                node.value = value
                self._remove(node)
                self._append(node)
                return
            if self.capacity == 0:
                return
            node = Node(key, value)
            self.map[key] = node
            self._append(node)
            if len(self.map) > self.capacity:
                lru = self.head.next
                self._remove(lru)
                del self.map[lru.key]
        # Emit items in LRU-to-MRU order with deterministic JSON encoding.
        def snapshot(self):
            items = []
            cur = self.head.next
            while cur is not self.tail:
                items.append({'key': cur.key, 'value': cur.value})
                cur = cur.next
            payload = {'version': 1, 'capacity': self.capacity, 'items': items}
            return json.dumps(payload, sort_keys=True, separators=(',', ':'))
        # Validate the entire payload before mutating any state, so an
        # invalid snapshot leaves the current cache unchanged.
        def restore(self, text):
            try:
                payload = json.loads(text)
            except Exception:
                return False
            if not isinstance(payload, dict):
                return False
            if set(payload.keys()) != {'version', 'capacity', 'items'}:
                return False
            version = payload.get('version')
            cap = payload.get('capacity')
            items = payload.get('items')
            if type(version) is not int or version != 1:
                return False
            if type(cap) is not int or cap < 0 or not isinstance(items, list):
                return False
            if len(items) > cap:
                return False
            parsed = []
            seen = set()
            for item in items:
                if not isinstance(item, dict) or set(item.keys()) != {'key', 'value'}:
                    return False
                key = item['key']
                if not isinstance(key, str) or key in seen:
                    return False
                seen.add(key)
                parsed.append((key, item['value']))
            self.capacity = cap
            self._reset()
            for key, value in parsed:
                node = Node(key, value)
                self.map[key] = node
                self._append(node)
            return True
    cache = LRU(data['capacity'])
    outputs = []
    for op in data['ops']:
        kind = op[0]
        if kind == 'put':
            cache.put(op[1], op[2])
            outputs.append(None)
        elif kind == 'get':
            outputs.append(cache.get(op[1]))
        elif kind == 'snapshot':
            outputs.append(cache.snapshot())
        elif kind == 'restore':
            outputs.append(cache.restore(op[1]))
        else:
            raise ValueError('Unknown operation')
    return {'outputs': outputs}

Time complexity: get/put are O(1) average; snapshot and restore are O(n + s), where n is the number of cached items and s is the snapshot string size. Space complexity: O(capacity + s), where s is the size of the largest snapshot string being processed.

Hints

  1. Store cache items in LRU order so snapshot can be emitted directly from least recent to most recent.
  2. Validate the entire snapshot before mutating the cache. In production, pair this with temp-file plus rename for atomic persistence.
Last updated: Apr 23, 2026
