Implement an in-memory DB with TTL backup/restore
Company: Anthropic
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Take-home Project
## Problem
Design an **in-memory database** that stores values in a nested structure:
- `key` (string) → `field` (string) → `value` (string)
- Each `(key, field)` entry may optionally have a **TTL** (time-to-live) measured in the same time units as the provided timestamps.
- All operations are evaluated **at a given integer timestamp** `ts`.
- If an entry is expired at time `ts`, it must behave as if it does not exist.
You are given a sequence of queries. Each query is one of the operations below. Implement the database so that each query that produces an output returns the correct string.
### Supported operations
Assume all timestamps are non-decreasing across the query list.
1. **Set without TTL**
- `SET ts key field value`
- Stores `value` at `(key, field)` with **no expiration**.
2. **Set with TTL**
- `SET_TTL ts key field value ttl`
- Stores `value` at `(key, field)` that is valid on the interval `[ts, ts + ttl)`.
3. **Get**
- `GET ts key field`
- Returns the stored value, or an empty string `""` if missing/expired.
4. **Delete**
- `DELETE ts key field`
- Deletes the field if present (and not expired at `ts`).
- Returns `"true"` if something was deleted, else `"false"`.
5. **Scan all fields of a key**
- `SCAN ts key`
- Returns all **non-expired** fields under `key` formatted as:
- `field1(value1),field2(value2),...`
- Fields must be sorted lexicographically by `field`.
- If no fields exist, return an empty string `""`.
6. **Scan fields by prefix**
- `SCAN_PREFIX ts key prefix`
- Same output format as `SCAN`, but include only fields whose names start with `prefix`.
7. **Backup**
- `BACKUP ts`
- Creates a **snapshot** of the entire database state **at time `ts`** (including TTL state).
- Conceptually, TTL countdown is **paused** in the snapshot: when restoring later, each TTL-based entry should resume with the **remaining TTL it had at backup time**.
- (If you need a return value, return the number of top-level keys present in the snapshot; otherwise no output.)
8. **Restore**
- `RESTORE ts at_timestamp`
- Restores the database to the **most recent snapshot** whose backup time `t_backup` satisfies `t_backup <= at_timestamp`.
- After restore at current time `ts`, TTL entries must have their expiration recomputed using the remaining TTL at backup time:
- If an entry had expiration `t_exp` originally, then at backup time `t_backup` it had remaining TTL `rem = t_exp - t_backup`.
- After restoring at `ts`, its new expiration becomes `ts + rem`.
- Entries that were already expired at `t_backup` must not appear in the restored state.
- (If you need a return value, return `"true"` if a snapshot was found and restored, else `"false"`.)
## Task
Implement a function/class to process the queries in order and return the list of outputs (strings) for the queries that produce outputs (`GET`, `DELETE`, `SCAN`, `SCAN_PREFIX`, and any required outputs for `BACKUP`/`RESTORE` depending on your chosen interface).
## Constraints (typical)
- Up to ~10^5 queries
- Keys/fields/values are non-empty strings of reasonable length
- Timestamps and TTLs fit in 32-bit signed integers
- Query timestamps are non-decreasing
Quick Answer: This question evaluates implementation and reasoning skills for an in-memory nested key→field→value store with TTL-based expirations, snapshot backup/restore semantics, and timestamped operations, testing competencies in data structures, state management, and time-aware logic within the Coding & Algorithms domain.
Design an in-memory database that stores string values in a nested structure: key -> field -> value. A field may optionally have a TTL. Every query is evaluated at its integer timestamp ts, and any entry expired at ts must behave as if it does not exist. Process a list of whitespace-separated query strings in order and return the outputs of all queries that produce one.
Supported query formats:
SET ts key field value
SET_TTL ts key field value ttl
GET ts key field
DELETE ts key field
SCAN ts key
SCAN_PREFIX ts key prefix
BACKUP ts
RESTORE ts at_timestamp
Rules:
- SET stores a value with no expiration.
- SET_TTL stores a value valid on [ts, ts + ttl).
- GET returns the value or an empty string if missing/expired.
- DELETE returns "true" if a live field was deleted, otherwise "false".
- SCAN returns all live fields for a key as field(value) pairs joined by commas and sorted lexicographically by field.
- SCAN_PREFIX is the same, but only includes fields whose names start with prefix.
- BACKUP creates a snapshot of all live data at time ts and returns the number of top-level keys in that snapshot as a string.
- RESTORE restores the most recent snapshot whose backup time is <= at_timestamp. TTL countdown is paused inside a backup, so TTL entries must resume with the remaining TTL they had at backup time. Return "true" if such a snapshot exists, otherwise "false".
- Backups remain available after restore.
- Query timestamps are non-decreasing.
Constraints
- 0 <= len(queries) <= 10^5
- Timestamps are non-decreasing across the query list
- Keys, fields, values, and prefixes are non-empty strings without spaces
- TTL values are positive integers that fit in 32-bit signed range
Examples
Input: ["SET 1 user name alice", "GET 1 user name", "SET_TTL 2 user age 30 3", "GET 4 user age", "GET 5 user age", "SCAN 5 user"]
Expected Output: ["alice", "30", "", "name(alice)"]
Explanation: The age field is valid at time 4 but expired at time 5, so only name remains in the scan.
Input: ["SET 1 app aa 1", "SET 1 app ab 2", "SET 1 app b 3", "SCAN_PREFIX 1 app a", "DELETE 2 app aa", "DELETE 2 app aa", "SCAN 2 app"]
Expected Output: ["aa(1),ab(2)", "true", "false", "ab(2),b(3)"]
Explanation: Prefix scan returns the two a* fields in lexicographic order. Deleting aa succeeds once and then fails because it is already gone.
Input: ["SET 1 a x v1", "SET_TTL 2 a y v2 5", "BACKUP 4", "SET 5 a x changed", "GET 6 a y", "RESTORE 8 4", "GET 9 a x", "GET 10 a y", "GET 11 a y"]
Expected Output: ["1", "v2", "true", "v1", "v2", ""]
Explanation: At backup time 4, y has 3 units of TTL remaining. After restoring at time 8, y expires at time 11 and x returns to v1.
Input: ["RESTORE 1 0", "SET 2 k f v", "BACKUP 3", "SET 4 k g w", "BACKUP 5", "DELETE 6 k f", "RESTORE 7 4", "SCAN 7 k", "RESTORE 8 100", "SCAN 8 k"]
Expected Output: ["false", "1", "1", "true", "true", "f(v)", "true", "f(v),g(w)"]
Explanation: The first restore fails because no backup exists yet. Restoring with at_timestamp=4 picks the backup at time 3, while restoring with at_timestamp=100 picks the latest backup at time 5.
Input: []
Expected Output: []
Explanation: No queries means no outputs.
Hints
- Store TTL fields using an absolute expiration timestamp during normal processing, but convert that to remaining TTL when creating a backup.
- Keep backups ordered by backup time so RESTORE can find the latest snapshot with backup_time <= at_timestamp using binary search.