Implement Persistent Memoization LRU Cache
Company: Anthropic
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: hard
Interview Round: Onsite
Quick Answer: This question evaluates knowledge of memoization, LRU cache semantics, deterministic cache-key construction from positional and keyword arguments, JSON serialization, and efficient on-disk persistence with recoverable ordering.
Part 1: Canonical Cache Key Generation
Constraints
- The input string is valid JSON.
- All values inside args and kwargs are JSON-serializable: null, boolean, number, string, array, or object.
- 1 <= len(func_name) <= 200
- Total input size is at most 10^5 characters.
Examples
Input: '{"func_name":"f","args":[],"kwargs":{}}'
Expected Output: '["f",[],{}]'
Explanation: Edge case with no positional or keyword arguments.
Input: '{"func_name":"f","args":[],"kwargs":{"b":2,"a":1}}'
Expected Output: '["f",[],{"a":1,"b":2}]'
Explanation: Keyword arguments are reordered canonically.
Input: '{"func_name":"g","args":[1,{"y":true,"x":null},["z",3]],"kwargs":{"q":{"b":[2,3],"a":1}}}'
Expected Output: '["g",[1,{"x":null,"y":true},["z",3]],{"q":{"a":1,"b":[2,3]}}]'
Explanation: Nested objects are normalized too.
Input: '{"func_name":"h","args":[true,1,false,0],"kwargs":{"nested":{"b":{"d":4,"c":3},"a":2}}}'
Expected Output: '["h",[true,1,false,0],{"nested":{"a":2,"b":{"c":3,"d":4}}}]'
Explanation: Booleans, numbers, and nested dictionaries are preserved in a deterministic form.
Hints
- A string is already hashable in Python, so a deterministic string can itself be the cache key.
- Think about a serializer that sorts object keys recursively so {'a':1,'b':2} and {'b':2,'a':1} become identical.
Part 2: Persistent LRU Cache Recovery and Append-Only Logging
Constraints
- 0 <= capacity <= 10^4
- 0 <= len(existing_log) + len(operations) <= 2 * 10^4
- All keys and values are JSON-serializable.
- A non-object element in existing_log represents a malformed/truncated final record and ends recovery immediately.
Examples
Input: '{"capacity":2,"existing_log":[{"op":"put","key":"a","value":1},{"op":"put","key":"b","value":2}],"operations":[{"op":"get","key":"a"},{"op":"put","key":"c","value":3},{"op":"get","key":"b"}]}'
Expected Output: '{"final_cache":[["a",1],["c",3]],"get_results":[1,null],"updated_log":[{"key":"a","op":"put","value":1},{"key":"b","op":"put","value":2},{"hit":true,"key":"a","op":"get"},{"key":"c","op":"put","value":3},{"hit":false,"key":"b","op":"get"}]}'
Explanation: After recovering a,b, accessing a makes b the LRU. Inserting c evicts b. The final get on b is a miss but still appends a log record.
Input: '{"capacity":2,"existing_log":[{"op":"put","key":"x","value":1},"MALFORMED"],"operations":[{"op":"get","key":"x"},{"op":"put","key":"z","value":3}]}'
Expected Output: '{"final_cache":[["x",1],["z",3]],"get_results":[1],"updated_log":[{"key":"x","op":"put","value":1},{"hit":true,"key":"x","op":"get"},{"key":"z","op":"put","value":3}]}'
Explanation: The malformed tail is ignored instead of breaking recovery.
Input: '{"capacity":2,"existing_log":[{"op":"put","key":{"b":2,"a":1},"value":"first"},{"op":"put","key":["x",{"d":4,"c":3}],"value":"second"},{"op":"get","key":{"a":1,"b":2},"hit":true}],"operations":[{"op":"get","key":["x",{"c":3,"d":4}]},{"op":"put","key":"new","value":99},{"op":"get","key":{"b":2,"a":1}}]}'
Expected Output: '{"final_cache":[[["x",{"c":3,"d":4}],"second"],["new",99]],"get_results":["second",null],"updated_log":[{"key":{"a":1,"b":2},"op":"put","value":"first"},{"key":["x",{"c":3,"d":4}],"op":"put","value":"second"},{"hit":true,"key":{"a":1,"b":2},"op":"get"},{"hit":true,"key":["x",{"c":3,"d":4}],"op":"get"},{"key":"new","op":"put","value":99},{"hit":false,"key":{"a":1,"b":2},"op":"get"}]}'
Explanation: Object keys are normalized, so equivalent JSON objects match even when their field order differs.
Input: '{"capacity":0,"existing_log":[],"operations":[{"op":"put","key":"a","value":1},{"op":"get","key":"a"}]}'
Expected Output: '{"final_cache":[],"get_results":[null],"updated_log":[{"key":"a","op":"put","value":1},{"hit":false,"key":"a","op":"get"}]}'
Explanation: Boundary case: a zero-capacity cache stores nothing, but every access is still logged.
Hints
- Replay the valid log from left to right. A recovered put updates value and MRU order; a recovered get with hit=true only moves the key to MRU if it currently exists.
- Use one normalized representation of a JSON key for hashing in memory, and another JSON-safe representation for output.