Design in-memory DB with TTL and backup
Company: Circle
Role: Software Engineer
Category: System Design
Difficulty: hard
Interview Round: Take-home Project
Design an in-memory key->(field->value) database that supports versioning and TTL. Implement the following timestamped APIs, where t is a logical timestamp and ttl is in the same units, with ttl=null meaning no expiry:
1) set(key, field, value, t, ttl): Upsert a field under key. Maintain append-only history so each (key, field) has validity intervals [start, end) with an optional tombstone for deletes.
2) get(key, field, t): Return the value only if it is not expired at time t (i.e., t < start+ttl when ttl is set, and end is unset or t < end). Otherwise return null.
3) delete(key, field, t): Log a tombstone entry that ends the current interval but keeps history for future replay/restore.
4) scan(key, t): Return all (field, value) pairs for key that are not expired at time t, ordered lexicographically by field.
5) scanPrefix(key, prefix, t): Return only the non-expired (field, value) pairs whose field starts with prefix, ordered lexicographically by field.
6) backup(t): Record a snapshot marker at time t and return the total count of non-expired (key, field) pairs at time t across the whole DB.
7) restore(t): Rewind the observable state to the most recent backup at or before time t and return the count that backup had returned. Deletions must be represented via tombstones so history remains replayable.
Requirements and clarifications:
- The TTL applies at the (key, field) level, not at the key level. Reads and scans must exclude expired entries. Define precise behavior when updating an already-expired field (treat as a fresh insert).
- Scans must be deterministically ordered. Propose data structures that support efficient ordered scans and prefix scans (e.g., per-key ordered index) while coexisting with TTL and history.
- Describe how you will track [start, end), value, ttl, and delete tombstones per (key, field) to support time-travel semantics, backup/restore, and efficient reads.
- Provide expected time and space complexities for each API. Discuss trade-offs between per-key ordered maps vs. brute-force plus sort, and global vs. per-key expiry indexes.
- Explain how you would garbage-collect fully expired history while keeping enough information to support restore to a prior backup.
Quick Answer: This question evaluates ability to design a versioned in-memory key→(field→value) store with per-field TTL, append-only history and tombstones, snapshot backup/restore, ordered and prefix scans, and trade-offs in indexing and garbage collection, focusing on data modeling, visibility semantics, and performance/space reasoning.