PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep

Quick Overview

This question evaluates knowledge of memoization, LRU cache semantics, deterministic cache-key construction from positional and keyword arguments, JSON serialization, and efficient on-disk persistence with recoverable ordering.

  • hard
  • Anthropic
  • Coding & Algorithms
  • Software Engineer

Implement Persistent Memoization LRU Cache

Company: Anthropic

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: hard

Interview Round: Onsite

You are given a code skeleton for a memoization cache that wraps functions like a decorator. The cache is already implemented as an LRU cache using an ordered map: it has a fixed capacity, evicts the least recently used entry, and moves accessed entries to the most-recently-used position. All function arguments and return values are guaranteed to be JSON-serializable. Complete the missing pieces. ### Part 1: Implement cache-key generation Implement: ```python def _cache_key(self, func_name: str, args: list, kwargs: dict): pass ``` Requirements: - `func_name` is guaranteed to uniquely identify the wrapped function. - The returned key must be usable as a dictionary key. - Positional arguments and keyword arguments must both be included in the key. - Calls that are semantically equivalent must map to the same cache key. For example, these two calls should produce the same key: ```python f(a=1, b=2) f(b=2, a=1) ``` ### Part 2: Add disk persistence Extend the cache constructor with an optional parameter: ```python def __init__(self, capacity: int, file_path: str = ""): pass ``` If `file_path` is the empty string, persistence is disabled. Otherwise: - When the cache is initialized, it should restore previous cache contents from the file. - Every cache access, including both cache hits and cache misses, should update the persisted state. - A single persistence write must not take time linear in the current cache size. In other words, do not rewrite or dump the entire cache on every access. - The persisted format must be able to reconstruct both values and LRU ordering. - Repeated accesses to the same key must correctly update its position to most recently used after recovery. - The implementation should handle JSON serialization of cache keys even if the in-memory key uses tuples or other hashable structures derived from JSON data. - If the persistence file ends with a truncated or malformed final record, recovery should tolerate it rather than failing the entire load. Discuss any optional improvements, such as log compaction or reducing disk I/O on cache hits.

Quick Answer: This question evaluates knowledge of memoization, LRU cache semantics, deterministic cache-key construction from positional and keyword arguments, JSON serialization, and efficient on-disk persistence with recoverable ordering.

Part 1: Canonical Cache Key Generation

You are given one function call encoded as a JSON string. The JSON object contains three fields: func_name, args, and kwargs. Return a canonical cache-key string that can be used as a dictionary key. Positional arguments must keep their order, but keyword arguments with the same content in different orders must produce the same key. Nested JSON objects should also be serialized deterministically, so object key order anywhere inside the input does not change the final key.

Constraints

  • The input string is valid JSON.
  • All values inside args and kwargs are JSON-serializable: null, boolean, number, string, array, or object.
  • 1 <= len(func_name) <= 200
  • Total input size is at most 10^5 characters.

Examples

Input: '{"func_name":"f","args":[],"kwargs":{}}'

Expected Output: '["f",[],{}]'

Explanation: Edge case with no positional or keyword arguments.

Input: '{"func_name":"f","args":[],"kwargs":{"b":2,"a":1}}'

Expected Output: '["f",[],{"a":1,"b":2}]'

Explanation: Keyword arguments are reordered canonically.

Input: '{"func_name":"g","args":[1,{"y":true,"x":null},["z",3]],"kwargs":{"q":{"b":[2,3],"a":1}}}'

Expected Output: '["g",[1,{"x":null,"y":true},["z",3]],{"q":{"a":1,"b":[2,3]}}]'

Explanation: Nested objects are normalized too.

Input: '{"func_name":"h","args":[true,1,false,0],"kwargs":{"nested":{"b":{"d":4,"c":3},"a":2}}}'

Expected Output: '["h",[true,1,false,0],{"nested":{"a":2,"b":{"c":3,"d":4}}}]'

Explanation: Booleans, numbers, and nested dictionaries are preserved in a deterministic form.

Hints

  1. A string is already hashable in Python, so a deterministic string can itself be the cache key.
  2. Think about a serializer that sorts object keys recursively so {'a':1,'b':2} and {'b':2,'a':1} become identical.

Part 2: Persistent LRU Cache Recovery and Append-Only Logging

Simulate a persistent LRU cache using an append-only access log. To avoid real file I/O, the log is represented as parsed JSON records inside one JSON input string. Recover the cache from the valid prefix of existing_log, then process new operations. Every access must append exactly one new record to the log, including cache hits and cache misses. The cache must preserve least-recently-used order. Keys and values are JSON-serializable, and keys may be arrays or objects, so you must normalize them for both hashing and deterministic output. If existing_log contains a malformed final record, it is represented by any non-object entry (the tests use the string "MALFORMED"); ignore that entry and everything after it. Follow-up discussion only: in a real system, you could add periodic log compaction or batch some writes to reduce disk I/O.

Constraints

  • 0 <= capacity <= 10^4
  • 0 <= len(existing_log) + len(operations) <= 2 * 10^4
  • All keys and values are JSON-serializable.
  • A non-object element in existing_log represents a malformed/truncated final record and ends recovery immediately.

Examples

Input: '{"capacity":2,"existing_log":[{"op":"put","key":"a","value":1},{"op":"put","key":"b","value":2}],"operations":[{"op":"get","key":"a"},{"op":"put","key":"c","value":3},{"op":"get","key":"b"}]}'

Expected Output: '{"final_cache":[["a",1],["c",3]],"get_results":[1,null],"updated_log":[{"key":"a","op":"put","value":1},{"key":"b","op":"put","value":2},{"hit":true,"key":"a","op":"get"},{"key":"c","op":"put","value":3},{"hit":false,"key":"b","op":"get"}]}'

Explanation: After recovering a,b, accessing a makes b the LRU. Inserting c evicts b. The final get on b is a miss but still appends a log record.

Input: '{"capacity":2,"existing_log":[{"op":"put","key":"x","value":1},"MALFORMED"],"operations":[{"op":"get","key":"x"},{"op":"put","key":"z","value":3}]}'

Expected Output: '{"final_cache":[["x",1],["z",3]],"get_results":[1],"updated_log":[{"key":"x","op":"put","value":1},{"hit":true,"key":"x","op":"get"},{"key":"z","op":"put","value":3}]}'

Explanation: The malformed tail is ignored instead of breaking recovery.

Input: '{"capacity":2,"existing_log":[{"op":"put","key":{"b":2,"a":1},"value":"first"},{"op":"put","key":["x",{"d":4,"c":3}],"value":"second"},{"op":"get","key":{"a":1,"b":2},"hit":true}],"operations":[{"op":"get","key":["x",{"c":3,"d":4}]},{"op":"put","key":"new","value":99},{"op":"get","key":{"b":2,"a":1}}]}'

Expected Output: '{"final_cache":[[["x",{"c":3,"d":4}],"second"],["new",99]],"get_results":["second",null],"updated_log":[{"key":{"a":1,"b":2},"op":"put","value":"first"},{"key":["x",{"c":3,"d":4}],"op":"put","value":"second"},{"hit":true,"key":{"a":1,"b":2},"op":"get"},{"hit":true,"key":["x",{"c":3,"d":4}],"op":"get"},{"key":"new","op":"put","value":99},{"hit":false,"key":{"a":1,"b":2},"op":"get"}]}'

Explanation: Object keys are normalized, so equivalent JSON objects match even when their field order differs.

Input: '{"capacity":0,"existing_log":[],"operations":[{"op":"put","key":"a","value":1},{"op":"get","key":"a"}]}'

Expected Output: '{"final_cache":[],"get_results":[null],"updated_log":[{"key":"a","op":"put","value":1},{"hit":false,"key":"a","op":"get"}]}'

Explanation: Boundary case: a zero-capacity cache stores nothing, but every access is still logged.

Hints

  1. Replay the valid log from left to right. A recovered put updates value and MRU order; a recovered get with hit=true only moves the key to MRU if it currently exists.
  2. Use one normalized representation of a JSON key for hashing in memory, and another JSON-safe representation for output.
Last updated: Jun 6, 2026

Loading coding console...

PracHub

Master your tech interviews with 8,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Fix a Corrupted Bootloader Instruction - Anthropic (medium)
  • Implement a Time-Aware Task Manager - Anthropic (hard)
  • Implement a Simplified DNS Resolver - Anthropic (hard)
  • Implement Task Management and Duplicate Detection - Anthropic (medium)
  • Implement a hierarchical file store - Anthropic (hard)