In-Memory Database: Progressive Design in 4 Levels
You are designing a simplified in-memory database to be implemented in four progressive levels. Each level unlocks after passing tests for the current one. The system is in-memory only; you may assume a single-process environment. Keep the design clear and incremental, with well-defined APIs, data structures, and semantics.
Assumptions to Make Explicit
-
Keys: records are identified by a unique string record_id.
-
Fields: each record is a map of field_name (string) to value. Values may be typed (e.g., string, number, boolean). Define type handling rules for comparisons and filtering.
-
Time: use a monotonic millisecond clock abstraction for deterministic tests. Expose a Clock interface; production can use wall-clock.
-
Concurrency: assume single-threaded by default; optionally discuss multi-thread guards.
Level 1: CRUD
Implement basic operations over records, fields, and values. Define precise behavior for non-existent records/fields and idempotency.
Required APIs:
-
create_record(record_id) -> bool created
-
delete_record(record_id) -> bool deleted
-
upsert_field(record_id, field, value) -> void (state after call reflects value). Specify whether this auto-creates a record if missing.
-
delete_field(record_id, field) -> bool deleted
-
read_record(record_id) -> map[field -> value] or NotFound
-
read_field(record_id, field) -> value or NotFound
Define:
-
Behavior when operating on non-existent records/fields.
-
Idempotency guarantees for create, delete, upsert, and field deletes.
Deliverables per level:
-
API signatures
-
Core data structures and algorithms
-
Complexity analysis
-
Key edge cases
-
Minimal test plan
Level 2: Filtered Field View
Add the ability to list fields of a given record that satisfy filter predicates.
Required API:
-
list_fields(record_id, filter_expr) -> list[field_name]
Define:
-
Filter expression format that supports equality/inequality, numeric comparisons, and substring/regex on strings. Include logical AND/OR/NOT.
-
Type and error semantics (e.g., comparing strings to numbers, invalid regex).
Level 3: Per-Record TTL
Support setting a Time-To-Live per record so that a record automatically expires after its TTL elapses.
Required APIs:
-
set_ttl(record_id, ttl_ms) -> void
-
get_ttl(record_id) -> expire_at_ms or None
Define:
-
Time source abstraction.
-
Expiry checks and cleanup strategy (lazy vs background sweeper) targeting efficient expiry (O(log n) per update or better).
-
Semantics for reads/writes on expired records.
Level 4: Look-back Queries (Time Travel)
Support querying values as of a past timestamp.
Required APIs:
-
read_record_as_of(record_id, ts_ms) -> map[field -> value] or NotFound
-
read_field_as_of(record_id, field, ts_ms) -> value or NotFound
Define:
-
How creates, updates, field deletes, record deletes, and TTL interact with historical reads.
-
Storage approach (e.g., per-field versioning, append-only log, optional snapshots) and API design.
-
Trade-offs in time and space; retention policy if any.
For each level, provide:
-
API signatures
-
Data structures and algorithms
-
Complexity analysis
-
Key edge cases
-
Minimal test plan
Optionally discuss concurrency assumptions (single-threaded vs multi-threaded) and persistence scope (in-memory only).